Jiří Kuhn wrote:

Hello,

>
> Michael McCandless wrote:
>
> The upcoming Lucene in Action revision (now available online through Manning's MEAP) has a basic example of this (boosting by recency) in the Advanced Search chapter, using function queries.
>
I have never used function queries before, but it was very easy to boost more recent documents with the help of FieldScoreQuery.

Actually, FieldScoreQuery is a function query: it pulls the doc's score from an indexed (NOT_ANALYZED) field statically (i.e., independent of the query).

This may be quite a common usage. The result is computed at search time, but the same result could be achieved with a document boost at indexing time (and certainly faster, with less memory used). But there is a difference: the document boost is used to compute the document's norm value, which is stored with precision loss (a float encoded as a byte).
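To make the precision loss concrete, here is a minimal standalone Java sketch of a one-byte float encoding. It is modeled on Lucene's SmallFloat.floatToByte315 / byte315ToFloat, which norm encoding is assumed to use here; the class name NormPrecisionSketch and the sample boosts are made up for illustration, so treat this as an approximation rather than the library's exact code.

```java
final class NormPrecisionSketch {

    // Offset that maps the smallest encodable exponent to the low end of the byte range.
    private static final int BASE = (63 - 15) << 3;

    // Encode a float into a single byte by keeping only its top bits.
    static byte floatToByte(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);               // drop the low 21 bits
        if (small <= BASE) {
            // Zero or negative input maps to 0; tiny positive input maps to the smallest code.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= BASE + 0x100) {
            return (byte) -1;                       // too large: clamp to the largest code
        }
        return (byte) (small - BASE);
    }

    // Decode the byte back into a float; the dropped bits come back as zeros.
    static float byteToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    static float roundTrip(float f) {
        return byteToFloat(floatToByte(f));
    }

    public static void main(String[] args) {
        // Hypothetical document boosts, close together on purpose.
        float[] boosts = {1.0f, 1.1f, 1.2f, 1.25f, 1.4f};
        for (float boost : boosts) {
            System.out.println(boost + " -> " + roundTrip(boost));
        }
    }
}
```

Under these assumptions, boosts of 1.0, 1.1 and 1.2 all encode to the same byte and decode to 1.0, so small differences between index-time boosts disappear once they are folded into the norm.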

It's also hard to take recency into account during indexing, because what counts as recent changes quickly with the passage of time (i.e., you'd have to re-index frequently).
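The point about recency can be illustrated with a small standalone Java sketch that derives a boost from a document's age at query time. The decay formula, the 30-day half-life and the class name RecencyBoostSketch are hypothetical, chosen only for illustration; this is neither Lucene API nor the example from the book.

```java
import java.time.Duration;
import java.time.Instant;

final class RecencyBoostSketch {

    private static final double MILLIS_PER_DAY = 86_400_000.0;

    // Hypothetical decay: 2.0 for a brand-new document, approaching 1.0 as it ages.
    // The bonus above 1.0 halves every halfLifeDays.
    static double recencyBoost(Instant published, Instant now, double halfLifeDays) {
        double ageDays = Duration.between(published, now).toMillis() / MILLIS_PER_DAY;
        ageDays = Math.max(0.0, ageDays);           // treat future dates as "just published"
        return 1.0 + Math.pow(0.5, ageDays / halfLifeDays);
    }

    public static void main(String[] args) {
        Instant published = Instant.parse("2009-06-01T00:00:00Z");
        // The same document, scored at three different query times.
        for (int days : new int[] {0, 30, 365}) {
            Instant now = published.plus(Duration.ofDays(days));
            System.out.println("age " + days + " days -> boost "
                    + recencyBoost(published, now, 30.0));
        }
    }
}
```

Because `now` is an input, the same document gets a different boost each time it is searched. A boost fixed at indexing time cannot follow that drift unless the document is re-indexed.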

The question: is it really still necessary to encode norms as bytes? Do we lose less than we gain?

Can anyone think of any real disadvantages of storing norms as full 4-byte floats nowadays?

I think using only 1 byte is still important. There have been a number of threads about the costly RAM usage of norms. But other threads have lamented the quantization... Under LUCENE-1231 (column-stride fields) there's been discussion about switching norms over to a simple column-stride field; this way you'd be free to choose however many bytes you'd like to use... but that's a ways off.
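A rough back-of-the-envelope sketch of the RAM argument in Java, assuming norms are held in memory as one value per document for every field that has norms enabled. The document and field counts below are invented, and the class name NormsMemorySketch is hypothetical.

```java
final class NormsMemorySketch {

    // Bytes needed to hold norms in memory: one value per document per field with norms.
    static long normsBytes(long maxDoc, int fieldsWithNorms, int bytesPerNorm) {
        return maxDoc * fieldsWithNorms * bytesPerNorm;
    }

    public static void main(String[] args) {
        long maxDoc = 50_000_000L;   // hypothetical index size
        int fieldsWithNorms = 20;    // hypothetical number of fields with norms enabled

        long oneByte = normsBytes(maxDoc, fieldsWithNorms, 1);
        long fourBytes = normsBytes(maxDoc, fieldsWithNorms, 4);

        System.out.println("1-byte norms: " + oneByte / 1_000_000 + " MB");
        System.out.println("4-byte norms: " + fourBytes / 1_000_000 + " MB");
    }
}
```

With these made-up numbers the cost goes from about 1 GB to about 4 GB, which is why the width of a norm matters for large indexes with many fields.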

Mike
