Re: Using full norms (Was: Bubbling up newer records)

Michael McCandless Tue, 20 Jan 2009 15:06:52 -0800


Jiří Kuhn wrote:

Hello,

>
> Michael McCandless wrote:
>
> The upcoming Lucene in Action revision (now available online through Manning's MEAP) has a basic example of this (boosting by recency) in the Advanced Search chapter, using function queries.
>
I have never used function queries before, but it was very easy to boost more recent documents with help of FieldScoreQuery.

Actually FieldScoreQuery is a function query, which pulls the doc's score from an indexed (NOT_ANALYZED) field statically (ie, not dependent on the query).

This may be quite common usage. The result is based on computation during search time but the same result would be accomplished using document boost during indexing time (and certainly faster with less memory used). But there is a difference - document boost is used to compute document's norm value which is stored with precision loss (float encoded as byte).
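To make the precision loss concrete, here is a minimal sketch of a 1-byte float encoding with 3 mantissa bits and 5 exponent bits, which is the scheme Lucene's norm encoding uses as far as I know. The class and method names (NormByteDemo, encode, decode) are made up for illustration; this is not Lucene's actual code.

```java
// Hypothetical, dependency-free sketch of a float <-> byte encoding
// (3 mantissa bits, 5 exponent bits, zero exponent point 15).
final class NormByteDemo {
    // Offset that maps the smallest representable exponent to byte value 0.
    private static final int EXP_OFFSET = (63 - 15) << 3;

    // Encode a float into one byte, truncating the mantissa to 3 bits.
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);          // keep sign, exponent, top 3 mantissa bits
        if (small <= EXP_OFFSET) {
            // Zero or negative maps to 0; tiny positive values map to the smallest code.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= EXP_OFFSET + 0x100) {
            return (byte) -1;                  // overflow: clamp to the largest code
        }
        return (byte) (small - EXP_OFFSET);
    }

    // Decode the byte back into a float; the dropped mantissa bits are gone for good.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        float[] boosts = {1.0f, 1.1f, 1.2f, 1.25f};
        for (float boost : boosts) {
            System.out.println(boost + " -> " + decode(encode(boost)));
        }
    }
}
```

With only 3 mantissa bits, 1.1 and 1.2 both come back as 1.0, and the next representable value above 1.0 is 1.25. A fine-grained recency boost applied at indexing time would be flattened in the same way.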

It's also hard to take recency into account during indexing because what's recent changes quickly with the passage of time (ie you'd have to re-index frequently).
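A small sketch of why an index-time recency boost goes stale: the boost depends on "now", so the same document gets a different value each day. The linear decay formula, the class name RecencyBoost and the parameters maxDaysAgo and multiplier are assumptions for illustration only, not something taken from Lucene.

```java
// Hypothetical recency boost computed at search time from a document's timestamp.
final class RecencyBoost {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Returns 1.0 for documents older than maxDaysAgo, rising linearly
    // to 1.0 + multiplier for documents published "now".
    static float boost(long docTimeMillis, long nowMillis, int maxDaysAgo, float multiplier) {
        long daysAgo = (nowMillis - docTimeMillis) / MILLIS_PER_DAY;
        if (daysAgo < 0) {
            daysAgo = 0;                       // treat future-dated docs as brand new
        }
        if (daysAgo >= maxDaysAgo) {
            return 1.0f;                       // outside the recency window: no boost
        }
        return 1.0f + multiplier * (maxDaysAgo - daysAgo) / (float) maxDaysAgo;
    }

    public static void main(String[] args) {
        long published = 0L;
        // Same document, evaluated on three different days.
        for (int day : new int[] {0, 5, 10}) {
            long now = published + day * MILLIS_PER_DAY;
            System.out.println("day " + day + ": boost = " + boost(published, now, 10, 1.0f));
        }
    }
}
```

Baking any one of these values into the document at indexing time would freeze it, whereas computing it per search keeps it current without re-indexing.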

The question: Is it still really an issue to encode norms as bytes? Do we lose less than we gain?
Can someone imagine any real disadvantages of storing norms as full 4-byte floats? Nowadays?

I think using only 1 byte is still important. There have been a number of threads about the costly RAM usage of norms. But other threads have lamented the quantization... under LUCENE-1231 (column-stride fields) there's been discussion about switching norms over to a simple column-stride field; this way you'd be free to choose however many bytes you'd like to use... but that's a ways off.
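For a rough sense of the RAM cost: norms are held in memory as one value per document for each field that has norms, so the footprint scales as maxDoc x fields x bytes per norm. The sketch below just does that arithmetic; the index size (10 million docs, 20 fields with norms) is a made-up example, and the class name is hypothetical.

```java
// Back-of-the-envelope estimate of norms RAM usage; the numbers in main are illustrative only.
final class NormsMemoryEstimate {
    // One norm per document per field that carries norms.
    static long normsBytes(long maxDoc, int fieldsWithNorms, int bytesPerNorm) {
        return maxDoc * fieldsWithNorms * bytesPerNorm;
    }

    public static void main(String[] args) {
        long maxDoc = 10_000_000L;   // assumed index size
        int fields = 20;             // assumed number of fields with norms
        System.out.println("1-byte norms: " + normsBytes(maxDoc, fields, 1) + " bytes");
        System.out.println("4-byte norms: " + normsBytes(maxDoc, fields, 4) + " bytes");
    }
}
```

Under those assumptions, 1-byte norms take 200,000,000 bytes and full floats take 800,000,000 bytes, i.e. four times the memory for the same index.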


Mike
