Usefulness of Similarity.queryNorm()

Marvin Humphrey Tue, 12 Feb 2008 09:09:26 -0800

Greets,

What would the consequences be of eliminating Similarity.queryNorm()? I cargo-culted that method when porting, but now I'm going through and trying to refactor for simplicity's sake. If I can zap it, I'd like to.


First, the theoretical angle:

According to the Similarity docs, queryNorm() doesn't impact document ranking, since it is applied as a multiplier to the scores of all matching docs. I don't see how it's all that useful, then.

How important is it that discrete queries, or queries against different indexes, produce scores within a "comparable" range? It seems to me that if you need that, you can always perform normalization after the search completes by setting the top score to 1.0 and increasing/decreasing other scores proportionately. Are there any cases where that solution wouldn't be adequate?
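
For concreteness, here is a minimal sketch of that post-search normalization, written in Java since that is what Lucene uses. The class and method names are hypothetical and not part of any Lucene API. It assumes the raw scores of the matching docs are available as a float array and that the top score is positive; otherwise the scores are returned unchanged.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: rescales raw scores so the
// highest one becomes 1.0 and the rest keep their relative proportions.
final class ScoreNormalizer {

    static float[] normalizeToTopScore(float[] rawScores) {
        float[] normalized = Arrays.copyOf(rawScores, rawScores.length);

        // Find the top score; the input need not be sorted.
        float top = 0.0f;
        for (float score : rawScores) {
            if (score > top) {
                top = score;
            }
        }

        // Nothing to scale against (empty input or no positive score).
        if (top <= 0.0f) {
            return normalized;
        }

        // Dividing every score by the same positive value preserves order.
        for (int i = 0; i < normalized.length; i++) {
            normalized[i] = normalized[i] / top;
        }
        return normalized;
    }

    public static void main(String[] args) {
        float[] raw = {4.0f, 2.0f, 1.0f};
        System.out.println(Arrays.toString(normalizeToTopScore(raw)));
    }
}
```

Because every score is divided by the same positive number, the ordering of hits within one result set is untouched; only the range changes.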


Second, the implementation angle:

Is it really true that document ranking is unaffected by queryNorm()? It seems to me that when multi-level boolean queries are normalized, clauses having different IDFs would end up with different multipliers. Maybe I'm wrong -- it's always hard to wrap your head around recursion -- but is the assertion that ranking is unaffected a documentation glitch?


Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


