-1

This would be an incompatible change that could break lots of folks. Also, the range of values that you represent in your one-byte float format is less useful to most Lucene applications. Negative values are rarely used, and normalizing values to be between 0 and 1 is not always easy.

Can you please describe more about what you're trying to achieve? There are lots of other ways of efficiently implementing date-sorted search results. For example, you can add the documents to the index in chronological order, then use a HitFilter which collects the documents with the highest document id. That is very efficient and requires no changes to Lucene.
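
A minimal sketch of that idea, assuming document ids increase with insertion order so that a higher id means a newer document. The class NewestDocCollector and its collect(int) method are hypothetical names used for illustration, not part of Lucene; in practice the per-hit callback would come from whatever collector hook your Lucene version provides.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper, not a Lucene class: keeps the n largest document ids seen.
class NewestDocCollector {
    private final int n;
    // Min-heap: the smallest retained id sits at the head and is evicted first.
    private final PriorityQueue<Integer> heap = new PriorityQueue<>();

    NewestDocCollector(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        this.n = n;
    }

    // Call once per matching document; the score is deliberately ignored.
    void collect(int doc) {
        if (heap.size() < n) {
            heap.add(doc);
        } else if (doc > heap.peek()) {
            heap.poll();
            heap.add(doc);
        }
    }

    // Returns the retained ids, newest (highest id) first.
    List<Integer> newestFirst() {
        List<Integer> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder());
        return result;
    }
}
```

Because only n ids are held at any time, memory stays constant regardless of how many documents match, and no score is needed to order the results.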

Cheers,

Doug

Nick Smith wrote:
Hi Luceners!

I am misusing the document score for date sorting (I display news
headlines in a chronological list).

As the document score is ultimately encoded as a byte, the maximum
possible number of values is 256, minus the special value of 0
(document not found).

In the current implementation, all negative float values get
rounded up to zero by Similarity.floatToByte(), and
Similarity.byteToFloat() returns increasing values only for
bytes in the range 1 to 127, where each decoded value is greater
than the decoded value of the next lower byte.

i.e. Similarity.byteToFloat(byteVal+1) > Similarity.byteToFloat(byteVal)

For my application, having 255 possible scores from searches was better
than 127, so...

I have patched the Similarity class to encode negative floats into
the negative byte values and to decode the negative byte values back
into negative floats.

The encoding of the positive values is unchanged by this patch.
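
To illustrate the general idea of a sign-symmetric one-byte encoding, here is a toy sketch. It uses a simple linear quantizer over [-1, 1], which is an assumption made for brevity; it is neither Lucene's actual encoding nor the patch itself, and the class name SignedByteScore is hypothetical.

```java
// Toy linear quantizer for illustration only; this is neither Lucene's
// encoding nor the actual patch.
class SignedByteScore {
    // Maps f in [-1, 1] to a byte in [-127, 127]; out-of-range input is clamped.
    static byte floatToByte(float f) {
        float clamped = Math.max(-1.0f, Math.min(1.0f, f));
        return (byte) Math.round(clamped * 127.0f);
    }

    // Inverse mapping: negative bytes decode to negative floats.
    static float byteToFloat(byte b) {
        // -128 has no positive counterpart, so treat it like -127.
        int v = Math.max(b, -127);
        return v / 127.0f;
    }
}
```

With this layout the decoded value grows strictly with the byte value across -127 to 127, so ordering by the byte matches ordering by the decoded float.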

Could this version please be checked into CVS by someone with commit
rights?  Or is there a more formal procedure for submitting patches,
say via Bugzilla?

Many Thanks,

Nick Smith

