Michael Barbarelli wrote:
What I do is run each entry in the hits collection through a
home-rolled levenstein distance algorithm to obtain a score. Then I
sort by score.
On Sep 8, 2009 9:44 PM, "Paul Taylor" <paul_t...@fastmail.fm
<mailto:paul_t...@fastmail.fm>> wrote:
Is there way to get complete start end matches to be first in the list
We use Lucene to search song albums titles typically one to ten words
long. If the user enter something like 'foo bar' everything that
contains foo bar is returned with max score , thats fine but it would
be better if an exact match is right at the top. Also although an OR
Search has been entered would be great if that it ranked matches
where both words are together higher than when they are not , but
still return results that only match one condirtion.
Ideally giving results in this order
* Foo Bar (exact match)
* The Foo Bar Somethings (substring - exact match)
* Bar Foo (all terms match)
* Bar Baz and the Foo (substring - all terms match)
* Foo (some terms match)
* Foo Something (substring - some terms match)
Is there something I can do in Lucene, or some way I can modify the
query (as entered by the user) to get results better aproaching this
Paul
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
<mailto:java-user-unsubscr...@lucene.apache.org>
For additional commands, e-mail: java-user-h...@lucene.apache.org
<mailto:java-user-h...@lucene.apache.org>
Thats sounds like the right algorithm but cannot this be done within
Lucene. The trouble is say I get a 1000 hits, I only want the first 10
but if I openly apply the algorithm to the first ten it might miss out
on the 11th which should really be the 5th, but if have to get all 1000
docs and apply algorithm its going to be a bit of an overhead.
Code excerpt might make it clearer:
TopScoreDocCollector collector =
TopScoreDocCollector.create(offset + limit, true);
searcher.search(parser.parse(query), collector);
Results results = new Results();
TopDocs topDocs = collector.topDocs();
results.offset = offset;
results.totalHits = topDocs.totalHits;
ScoreDoc docs[] = topDocs.scoreDocs;
float maxScore = topDocs.getMaxScore();
for (int i = offset; i < docs.length; i++) {
Result result = new Result();
result.score = docs[i].score / maxScore;
result.doc = new MbDocument(searcher.doc(docs[i].doc));
results.results.add(result);
}
return results;
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org