To: java-user@lucene.apache.org
Sent: Thu, 19 May, 2011 14:36:56
Subject: Re: Ranking docs with all terms higher
A little test shows that Mike is correct: Lucene already does this.
With norms (default)
nacho foo bar, score=0.8660254
foo bar bar, score=0.46461558
nacho nacho nacho nacho, score=0.19245009
Without norms
nacho foo bar, score=1.7320508
foo bar bar, score=0.92923117
nacho nacho nacho
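
For reference, the "without norms" numbers above look consistent with the classic TF-IDF scoring formula. The sketch below is plain Java, not Lucene code. It assumes tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1)), queryNorm = 1 / sqrt(sum of idf^2), an index holding only the three example docs, and no boosts. The class and method names are made up for illustration.

```java
public class ClassicScoreSketch {

    // Assumed term-frequency factor: sqrt(freq)
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Assumed inverse document frequency: 1 + ln(numDocs / (docFreq + 1))
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    /**
     * freqs[i]    = occurrences of query term i in the document (0 if absent)
     * docFreqs[i] = number of documents in the index containing query term i
     */
    static double score(int[] freqs, int[] docFreqs, int numDocs) {
        // queryNorm = 1 / sqrt(sum of squared idf over all query terms)
        double sumOfSquares = 0.0;
        for (int df : docFreqs) {
            double w = idf(df, numDocs);
            sumOfSquares += w * w;
        }
        double queryNorm = 1.0 / Math.sqrt(sumOfSquares);

        double sum = 0.0;
        int matched = 0;
        for (int i = 0; i < freqs.length; i++) {
            if (freqs[i] == 0) {
                continue; // term not in this document
            }
            matched++;
            double w = idf(docFreqs[i], numDocs);
            sum += tf(freqs[i]) * w * w * queryNorm;
        }

        // coord = matched query terms / total query terms
        double coord = (double) matched / freqs.length;
        return coord * sum;
    }

    public static void main(String[] args) {
        // Query terms in order: nacho, foo, bar. Each occurs in 2 of the 3 docs.
        int[] docFreqs = {2, 2, 2};
        int numDocs = 3;

        System.out.println("nacho foo bar:           " + score(new int[] {1, 1, 1}, docFreqs, numDocs));
        System.out.println("foo bar bar:             " + score(new int[] {0, 1, 2}, docFreqs, numDocs));
        System.out.println("nacho nacho nacho nacho: " + score(new int[] {4, 0, 0}, docFreqs, numDocs));
    }
}
```

The first two results agree with the listed scores to within float rounding. The "with norms" scores listed for those two docs are exactly half of these, which would fit a stored length norm of 0.5 for such short fields, though that part is an inference from the numbers rather than something checked against the index.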
I believe Lucene already does this, with the 'coord' factor in
BooleanQuery, which is on by default (ie, if you just "new
BooleanQuery()").
I.e., your doc c will get a coord factor of 1.0, doc b gets 0.666..., and doc
a gets 0.333...
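
A quick way to check those coord values by hand: coord is the number of distinct query terms a doc matches divided by the number of query terms. The snippet below is a plain-Java sketch of that ratio only. It does not call Lucene, it assumes simple whitespace tokenization, and CoordSketch is a made-up name.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class CoordSketch {

    // coord = distinct query terms present in the doc / total query terms
    static double coord(String doc, Set<String> queryTerms) {
        Set<String> present = new HashSet<>(Arrays.asList(doc.trim().split("\\s+")));
        present.retainAll(queryTerms); // keep only tokens that are query terms
        return (double) present.size() / queryTerms.size();
    }

    public static void main(String[] args) {
        Set<String> query = new LinkedHashSet<>(Arrays.asList("nacho", "foo", "bar"));

        System.out.println("doc a: " + coord("nacho nacho nacho nacho", query));
        System.out.println("doc b: " + coord("foo bar bar", query));
        System.out.println("doc c: " + coord("nacho foo bar", query));
    }
}
```

Repeated occurrences of a term do not raise coord, so doc a stays at one third no matter how many times it contains nacho. Term frequency enters the score through a separate factor.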
That said, if the term freq is high enough (ie doc a has nacho 4
times)
Let's say I have the query
(nacho OR foo OR bar)
and some documents (single field with norms off)
doc a: nacho nacho nacho nacho
doc b: foo bar bar
doc c: nacho foo bar
I'm interested in all of these documents but I would like c to score the
highest since it contains all of the search terms, b to