Jiří Kuhn wrote:
Hello,
>
> Michael McCandless wrote:
>
> The upcoming Lucene in Action revision (now available online
> through Manning's MEAP) has a basic example of this (boosting by
> recency) in the Advanced Search chapter, using function queries.
>
I had never used function queries before, but it was very easy to
boost more recent documents with the help of FieldScoreQuery.
Actually FieldScoreQuery is a function query, which pulls the doc's
score from an indexed (NOT_ANALYZED) field statically (ie, not
dependent on the query).
This may be a quite common use case. The result is computed at
search time, but the same result could be achieved with a document
boost at indexing time (certainly faster and with less memory used).
There is a difference, though: the document boost is folded into the
document's norm value, which is stored with precision loss (a float
encoded as a single byte).
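
To make the precision loss concrete, here is a minimal, library-free
sketch of a one-byte float encoding. It is modeled on the norm
encoding in Lucene's SmallFloat utility as I remember it; the class
and method names (NormQuantization, encode, decode) are made up for
this example, so treat it as an illustration rather than the
library's code.

```java
public class NormQuantization {

    // Offset that maps the useful range of float exponents into one byte.
    private static final int ZERO_POINT = (63 - 15) << 3;

    // Encode a float into a single byte by dropping its low 21 bits.
    // Zero and negative values map to 0, very small positive values
    // map to 1, and values that are too large saturate at 0xFF.
    public static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        // Keeps the sign, the exponent and the two highest mantissa bits.
        int small = bits >> 21;
        if (small <= ZERO_POINT) {
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= ZERO_POINT + 0x100) {
            return (byte) -1;
        }
        return (byte) (small - ZERO_POINT);
    }

    // Decode the byte back into a float; the dropped bits come back as zeros.
    public static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << 21;
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        float[] boosts = {1.0f, 1.1f, 1.2f, 1.25f, 1.4f, 1.5f};
        for (float boost : boosts) {
            System.out.println(boost + " -> " + decode(encode(boost)));
        }
    }
}
```

With this scheme, values between 1 and 2 can only come back as 1.0,
1.25, 1.5 or 1.75, so boosts of 1.1 and 1.2 both decode to 1.0 and
small differences between documents disappear.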
It's also hard to take recency into account during indexing because
what's recent changes quickly with the passage of time (ie you'd have
to re-index frequently).
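
A small, library-free sketch of why that is: the recency factor
depends on the current time, so the same document gets a different
boost on different days. Everything here (the RecencyBoost class, the
linear decay from 2.0 to 1.0, the 30-day window, the dates) is an
assumption for illustration only, not Lucene API and not the book's
example.

```java
import java.time.Duration;
import java.time.Instant;

public class RecencyBoost {

    // Hypothetical boost curve: 2.0 for a brand-new document, decaying
    // linearly to 1.0 (no boost) once the document is maxAge old or older.
    public static double recencyFactor(Instant published, Instant now, Duration maxAge) {
        double ageMs = Math.max(0, Duration.between(published, now).toMillis());
        double fraction = Math.min(1.0, ageMs / (double) maxAge.toMillis());
        return 2.0 - fraction;
    }

    // Final score: query-dependent relevance multiplied by the
    // query-independent recency factor.
    public static double boostedScore(double relevance, Instant published,
                                      Instant now, Duration maxAge) {
        return relevance * recencyFactor(published, now, maxAge);
    }

    public static void main(String[] args) {
        Duration window = Duration.ofDays(30);
        Instant published = Instant.parse("2009-06-01T00:00:00Z"); // arbitrary date
        // Same document, same relevance, scored on two different days.
        System.out.println(boostedScore(1.0, published,
                Instant.parse("2009-06-02T00:00:00Z"), window));
        System.out.println(boostedScore(1.0, published,
                Instant.parse("2009-07-15T00:00:00Z"), window));
    }
}
```

A boost baked in at indexing time would freeze the first value, which
is why computing it at search time fits this use case better.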
The question: is it really still necessary to encode norms as bytes?
Do we lose less than we gain?
Can anyone think of any real disadvantages of storing norms as full
4-byte floats nowadays?
I think using only 1 byte is still important. There have been a
number of threads about the costly RAM usage of norms. But other
threads have lamented the quantization... under LUCENE-1231 (column-
stride fields) there's been discussion about switching norms over to a
simple column-stride field; this way you'd be free to choose however
many bytes you'd like to use... but that's a ways off.
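
For a rough sense of the RAM cost, here is a back-of-the-envelope
sketch. It assumes norms are held in memory as one value per document
per field that has norms; the index size and field count are made-up
numbers, and NormMemory is a hypothetical helper, not part of Lucene.

```java
public class NormMemory {

    // Bytes needed to hold one norm per document per field in memory.
    public static long normBytes(long numDocs, int fieldsWithNorms, int bytesPerNorm) {
        return numDocs * fieldsWithNorms * bytesPerNorm;
    }

    public static void main(String[] args) {
        long docs = 100_000_000L; // hypothetical index size
        int fields = 5;           // hypothetical number of fields with norms
        long mb = 1024L * 1024L;
        System.out.println("1-byte norms: " + normBytes(docs, fields, 1) / mb + " MB");
        System.out.println("4-byte norms: " + normBytes(docs, fields, Float.BYTES) / mb + " MB");
    }
}
```

Under those assumptions, going from one byte to a full float
quadruples the memory needed for norms.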
Mike