Oops... I just realized that before moving to 2.9 I was getting the list of
Hits straight from the query, and that API was normalizing the scores to 1.
Now with 2.9 the Search overload that returns Hits is deprecated, so I
switched to the one that returns TopDocs, using the following code (I hope
it's the right way to use the new API; I didn't find anything about this):

TopDocs hits = searcher.Search(query, max);
int length = hits.scoreDocs.Length;
for (int i = 0; i < length; i++)
{
    CreateResult(searcher.Doc(hits.scoreDocs[i].doc), hits.scoreDocs[i].score);
}

The problem is that TopDocs is an "expert" API, and it seems that it no
longer normalizes the scores.
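
A minimal sketch of what the old normalization appears to have done, as far as I understand it: the scores were scaled by 1 / (top score), but only when the top score was above 1, so they were never scaled up. The names ScoreNormalizer and Normalize are hypothetical helpers of mine, not part of Lucene.net, and the input is just the raw scores read from hits.scoreDocs.

```csharp
using System;
using System.Linq;

public static class ScoreNormalizer
{
    // Assumption: mirrors the behaviour attributed to the deprecated Hits
    // class. Scores are divided by the best score only when it exceeds 1.
    public static float[] Normalize(float[] rawScores)
    {
        if (rawScores.Length == 0)
        {
            return new float[0];
        }

        float max = rawScores.Max();
        float norm = max > 1.0f ? 1.0f / max : 1.0f;

        // Multiply every score by the same factor so the ranking is unchanged.
        return rawScores.Select(s => s * norm).ToArray();
    }
}
```

With raw scores of 14, 7 and 0.7 this gives roughly 1, 0.5 and 0.05, so a fixed cutoff such as 0.1 would behave again as it did before the upgrade.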

Simone

On Wed, Jan 13, 2010 at 3:20 AM, Simone Chiaretta <
[email protected]> wrote:

> Hi all,
> I was trying to come up with a way to avoid showing the user results that
> are not pertinent to the query; by that I mean showing only the top
> results.
>
> But the problem is that for some queries I might have 50 good results,
> while for others only the first 3 are relevant.
>
> With Lucene.net 2.3 I noticed that the top result always had a score around
> 1 (0.9 - 1.3), so I filtered out all the ones below 0.1.
> Now with Lucene.net 2.9 some queries have 14 as top score, and others have
> just 3.
>
> A quick approach would be just increasing the limit to 0.5.
> Another approach could be computing the top score and then setting the
> lower limit as a percentage of it.
> A third one is filtering based on the "long tail": usually the first
> results have high scores (for example 12, 11, 8, 6), then there is a drop to
> something like 4-5 hits around 2, then the scores slowly go down to 0.7 and
> suddenly drop to 0.3. Not that all the searches have the exact same scores,
> but the pattern is pretty consistent.
>
> Did anyone have that problem? How did you solve it?
> Simone
>
> --
> Simone Chiaretta
> Microsoft MVP ASP.NET - ASPInsider
> Blog: http://codeclimber.net.nz
> RSS: http://feeds2.feedburner.com/codeclimber
> twitter: @simonech
>
> Any sufficiently advanced technology is indistinguishable from magic
> "Life is short, play hard"
>
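
The "percentage of the top score" idea from the quoted message could be sketched roughly as follows. This is only an illustration under stated assumptions: ScoreCutoff and KeepAboveFraction are hypothetical names, not Lucene.net API, the scores are assumed to arrive sorted from best to worst (as search results do), and the fraction itself would need tuning against real queries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ScoreCutoff
{
    // Keeps the leading scores that are at least `fraction` of the best one.
    // Assumes `scores` is sorted in descending order.
    public static List<float> KeepAboveFraction(IList<float> scores, float fraction)
    {
        if (scores.Count == 0)
        {
            return new List<float>();
        }

        // The lower limit follows the top score instead of being a fixed value.
        float limit = scores[0] * fraction;
        return scores.TakeWhile(s => s >= limit).ToList();
    }
}
```

For scores like 12, 11, 8, 6, 2, 2, 0.7, 0.3 and a fraction of 0.05, the limit is about 0.6, so everything down to 0.7 is kept and the 0.3 hit is dropped. Because the limit scales with the top score, the same fraction applies whether a query peaks at 14 or at 3.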



