Re: Whither Query Norm?

Jake Mannix Fri, 20 Nov 2009 08:15:27 -0800

The fact that Lucene Similarity is most decidedly *not* cosine similarity, but
strongly resembles it with the queryNorm() in there, means that I personally
would certainly like to see this called out, at least in the documentation.
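
To illustrate the distinction, here is a toy sketch. It is not Lucene code:
the class and method names are made up for this mail, and it assumes a score
that divides only by the query vector's length, ignoring every other factor
a real Similarity applies. True cosine also divides by the document vector's
length, so it is unaffected by scaling the document; the query-normalized
dot product is not.

```java
/** Hypothetical illustration only; none of these names are Lucene API. */
public class CosineVsQueryNorm {

    /** Plain dot product of two equal-length vectors. */
    public static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Euclidean length of a vector. */
    public static double length(double[] v) {
        return Math.sqrt(dot(v, v));
    }

    /** True cosine similarity: normalized by BOTH vector lengths. */
    public static double cosine(double[] q, double[] d) {
        return dot(q, d) / (length(q) * length(d));
    }

    /** Assumed stand-in for a query-normalized score: only the query length is divided out. */
    public static double queryNormalizedDot(double[] q, double[] d) {
        return dot(q, d) / length(q);
    }

    public static void main(String[] args) {
        double[] q = {1.0, 2.0};
        double[] d = {2.0, 4.0};
        double[] dScaled = {4.0, 8.0}; // same direction as d, twice the length

        System.out.println("cosine(q, d)       = " + cosine(q, d));
        System.out.println("cosine(q, dScaled) = " + cosine(q, dScaled));
        System.out.println("qnorm(q, d)        = " + queryNormalizedDot(q, d));
        System.out.println("qnorm(q, dScaled)  = " + queryNormalizedDot(q, dScaled));
    }
}
```

The two cosine values agree, while the query-normalized value doubles with
the document vector. That is the sense in which the score resembles cosine
similarity without being it.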


As for performance, is queryNorm() ever called in any loops?  It's all
set up in the construction of the Weight, right?  Which means that by the
time you're doing scoring, all the weighting factors are already folded
into one?  What performance cost would actually be saved here?
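
A rough sketch of what I mean, with hypothetical helper names rather than
the real Weight and Scorer classes. It assumes the norm is the
1/sqrt(sumSqs) value discussed in this thread and that it is folded into
the per-term weights once per query, so the per-document loop contains no
square roots or divisions.

```java
import java.util.Arrays;

/** Hypothetical sketch; these are not Lucene's actual Weight or Scorer classes. */
public class QueryNormSketch {

    /** The 1/sqrt(sumSqs) normalization discussed in this thread. */
    public static double queryNorm(double sumOfSquaredWeights) {
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }

    /** Runs once per query: computes the norm and folds it into every term weight. */
    public static double[] normalizedWeights(double[] rawTermWeights) {
        double sumSqs = 0.0;
        for (double w : rawTermWeights) {
            sumSqs += w * w;
        }
        double norm = queryNorm(sumSqs);
        double[] normalized = new double[rawTermWeights.length];
        for (int i = 0; i < rawTermWeights.length; i++) {
            normalized[i] = rawTermWeights[i] * norm;
        }
        return normalized;
    }

    /** Runs once per document: only multiplications and additions. */
    public static double score(double[] normalizedWeights, double[] docTermScores) {
        double total = 0.0;
        for (int i = 0; i < normalizedWeights.length; i++) {
            total += normalizedWeights[i] * docTermScores[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Set up once, at "Weight construction" time.
        double[] weights = normalizedWeights(new double[] {3.0, 4.0});
        System.out.println("normalized weights: " + Arrays.toString(weights));

        // Reused for every document; no sqrt or division in this loop.
        double[][] docs = {{1.0, 1.0}, {0.5, 2.0}};
        for (double[] doc : docs) {
            System.out.println("score: " + score(weights, doc));
        }
    }
}
```

If the real code is shaped like this, the square root and division happen
once per query, not once per document.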

  -jake

On Fri, Nov 20, 2009 at 7:56 AM, Grant Ingersoll <[email protected]> wrote:

> For a long time now, we've been telling people not to compare scores across
> queries, yet we maintain the queryNorm() code as an attempt to do this and
> the javadocs even promote it.  I'm in the process of researching this some
> more (references welcomed), but wanted to hear what people think about it
> here.  I haven't profiled it just yet, but it seems like a good chunk of
> wasted computation to me (loops, divisions and square roots).  At a minimum,
> I think we might be able to refactor the callback mechanism for it just as
> we did for the collectors, such that we push off the actual calculation of
> the sum of squares into Similarity, instead of just doing 1/sqrt(sumSqs).
>  That way, when people want to override queryNorm() to return 1, they are
> saving more than just the 1/sqrt calculation.  I haven't tested it yet, but
> wanted to find out what others think.
>
> Thoughts?
>
> -Grant
