Greets,
What would the consequences be of eliminating Similarity.queryNorm()?
I cargo-culted that method when porting, but now I'm going through and
trying to refactor for simplicity's sake. If I can zap it, I'd like to.

First, the theoretical angle:

According to the Similarity docs, queryNorm() doesn't impact document
ranking, since it is applied as a multiplier to the scores of all
matching docs. I don't see how it's all that useful, then.
How important is it that separate queries, or queries against
different indexes, produce scores within a "comparable" range? It
seems to me that if you need that, you can always perform
normalization after the search completes by setting the top score to
1.0 and increasing/decreasing other scores proportionately. Are there
any cases where that solution wouldn't be adequate?
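
For concreteness, the post-search normalization described above could look roughly like the sketch below. The function name normalize_to_top and the sample scores are hypothetical, chosen only for illustration; nothing here is part of Lucene's API.

```python
def normalize_to_top(scores):
    """Rescale raw scores so that the best hit scores 1.0.

    Every score is divided by the same positive number, so the
    relative order of the hits does not change.
    """
    if not scores:
        return []
    top = max(scores)
    if top <= 0:
        # Nothing sensible to scale by; return the scores unchanged.
        return list(scores)
    return [s / top for s in scores]


# Two hypothetical result lists whose raw scores live in different ranges.
raw_a = [8.0, 4.0, 2.0]
raw_b = [0.5, 0.125]

print(normalize_to_top(raw_a))  # [1.0, 0.5, 0.25]
print(normalize_to_top(raw_b))  # [1.0, 0.25]
```

Note that this rescaling pins the top hit of every query to 1.0, whatever its raw score was.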

Second, the implementation angle:

Is it really true that document ranking is unaffected by queryNorm()?
It seems to me that when multi-level boolean queries are normalized,
clauses having different IDFs would end up with different
multipliers. Maybe I'm wrong -- it's always hard to wrap your head
around recursion -- but is the assertion that ranking is unaffected a
documentation glitch?
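
One way to probe the question is a toy model. The sketch below assumes that a single norm is computed once from the sum of squared weights of the whole query tree and then handed down to every clause, with each boolean level multiplying in only its own boost. The classes Term and Bool, the function rank, and all IDF and term-frequency values are hypothetical, and scoring is reduced to tf * idf * query weight (no coord factor, no field norms). It is a model of the idea, not Lucene's code.

```python
import math


class Term:
    """Leaf clause: a single term with a made-up IDF and boost."""

    def __init__(self, name, idf, boost=1.0):
        self.name, self.idf, self.boost = name, idf, boost
        self.value = 0.0

    def sum_of_squared_weights(self):
        return (self.idf * self.boost) ** 2

    def normalize(self, norm):
        # Query weight is idf * boost * norm; IDF is applied a second
        # time because it also appears on the document side of the score.
        self.value = self.idf * self.boost * norm * self.idf

    def score(self, doc_tf):
        return doc_tf.get(self.name, 0) * self.value


class Bool:
    """Boolean clause: sums the scores of its sub-clauses."""

    def __init__(self, clauses, boost=1.0):
        self.clauses, self.boost = clauses, boost

    def sum_of_squared_weights(self):
        total = sum(c.sum_of_squared_weights() for c in self.clauses)
        return total * self.boost ** 2

    def normalize(self, norm):
        # Every sub-clause receives the same incoming norm, scaled only
        # by this node's own boost.
        for c in self.clauses:
            c.normalize(norm * self.boost)

    def score(self, doc_tf):
        return sum(c.score(doc_tf) for c in self.clauses)


def rank(query, docs, use_query_norm):
    """Return (doc ids ordered best-first, {doc id: score})."""
    if use_query_norm:
        norm = 1.0 / math.sqrt(query.sum_of_squared_weights())
    else:
        norm = 1.0
    query.normalize(norm)
    scores = {doc_id: query.score(tf) for doc_id, tf in docs.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# A two-level query whose clauses have very different IDFs.
query = Bool([
    Term("common", idf=1.0),
    Bool([Term("rare", idf=4.0), Term("mid", idf=2.0)], boost=2.0),
])
docs = {
    "d1": {"rare": 1},
    "d2": {"common": 3},
    "d3": {"rare": 1, "mid": 2},
}

order_with, _ = rank(query, docs, use_query_norm=True)
order_without, _ = rank(query, docs, use_query_norm=False)
print(order_with == order_without)
```

In this model the norm reaches every leaf as the same factor, so the order cannot change. Whether the real implementation passes one norm down the tree in this way is exactly what the question above asks; the sketch only shows what follows if it does.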
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/