Hi András,
That's a good catch! Do you want to correct that javadoc mistake and create a patch?
https://wiki.apache.org/lucene-java/HowToContribute

If you don't have a Jira account, anyone can create one.
https://issues.apache.org/jira/browse/lucene

Ahmet

On Thursday, March 5, 2015 11:15 AM, András Péteri <apet...@b2international.com> wrote:

Sorry, I also got it wrong in the previous message. :) It goes 0.89f -> 123 -> 0.875f.

On Thu, Mar 5, 2015 at 10:08 AM, András Péteri <apet...@b2international.com> wrote:
> Hi Andrew,
>
> If you are using Lucene 3.6.1, you can take a look at the method which
> creates a single byte value out of the received float using bit
> manipulation at [1]. There is also a 256-element decoder table in
> Similarity, where each byte corresponds to a decoded float value
> computed by [2].
>
> The first method encodes 0.89f to byte 123. 123 is decoded to 0.85f
> via the second method, so it seems that the documentation is incorrect
> in this regard.
>
> [1] https://github.com/apache/lucene-solr/blob/lucene_solr_3_6_1/lucene/core/src/java/org/apache/lucene/util/SmallFloat.java#L75
> [2] https://github.com/apache/lucene-solr/blob/lucene_solr_3_6_1/lucene/core/src/java/org/apache/lucene/util/SmallFloat.java#L88
>
> On Thu, Mar 5, 2015 at 3:45 AM, wangdong <hrdxwa...@gmail.com> wrote:
>> Thank you for your discussion.
>>
>> I am a junior user of Lucene, so I am not familiar with some of the
>> deeper concepts you mentioned.
>> My question is simple. I just want to know how to get 0.75 from
>> decode(encode(0.89)) in the official documentation.
>>
>> Why not 0.875? (0.875 = 0.5 + 0.25 + 0.125)
>>
>> thanks
>> andrew
>>
>> On 2015/3/4 22:54, Adrien Grand wrote:
>>>
>>> Norms and doc values are indeed using the same API. However,
>>> implementations differ a bit (e.g. norms are stored in memory and use
>>> different compression schemes).
>>>
>>> The precision loss is up to the similarity. You could write a
>>> similarity impl which keeps full float precision, but since scoring
>>> is fuzzy anyway, this would multiply your memory needs for norms by 4
>>> while not really improving the quality of the scores of your
>>> documents. This precision loss is the right trade-off for most
>>> use-cases.
>>>
>>> On Wed, Mar 4, 2015 at 3:04 PM, Ahmet Arslan <iori...@yahoo.com.invalid>
>>> wrote:
>>>>
>>>> Hi Adrien,
>>>>
>>>> I read somewhere that norms are stored using docValues.
>>>> In my understanding, docValues can store lossless float values.
>>>> So the question is, why do several decode/encode methods still exist
>>>> in similarity implementations?
>>>> Intuitively, switching to docValues for norms should prevent the
>>>> precision loss.
>>>>
>>>> Ahmet
>>>>
>>>>
>>>> On Wednesday, March 4, 2015 3:22 PM, Adrien Grand <jpou...@gmail.com>
>>>> wrote:
>>>> Hi,
>>>>
>>>> Floats require 32 bits but norms are encoded on a single byte. So
>>>> there is a precision loss when encoding float values into a single
>>>> byte. In your example, 0.75 and 0.89 are sufficiently close to each
>>>> other so that they are encoded to the same byte.
>>>>
>>>>
>>>> On Wed, Mar 4, 2015 at 4:48 AM, wangdong <hrdxwa...@gmail.com> wrote:
>>>>>
>>>>> I read the following in the scoring section of the Lucene
>>>>> documentation:
>>>>>
>>>>> Encoding and decoding of the resulted float norm in a single byte are
>>>>> done by the static methods of the class Similarity: encodeNorm()
>>>>> <http://lucene.apache.org/core/3_6_1/api/core/org/apache/lucene/search/Similarity.html#encodeNorm%28float%29>
>>>>> and decodeNorm()
>>>>> <http://lucene.apache.org/core/3_6_1/api/core/org/apache/lucene/search/Similarity.html#decodeNorm%28byte%29>.
>>>>> Due to loss of precision, it is not guaranteed that
>>>>> decode(encode(x)) = x, e.g. decode(encode(0.89)) = 0.75. At scoring
>>>>> (search) time, this norm is brought into the score of document as
>>>>> *norm(t, d)*, as shown by the formula in Similarity
>>>>> <http://lucene.apache.org/core/3_6_1/api/core/org/apache/lucene/search/Similarity.html>.
>>>>>
>>>>> I cannot understand the formula decode(encode(0.89)) = 0.75.
>>>>> How can I get 0.75 from the left-hand side?
>>>>>
>>>>> Can anyone help me?
>>>>> Thanks in advance!
>>>>>
>>>>> andrew
>>>>
>>>>
>>>>
>>>> --
>>>> Adrien
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>>
>>>
>>>
>>
>
> --
> András

--
Péteri András
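
For anyone who wants to check the round trip without pulling in Lucene 3.6.1, here is a small standalone sketch. It reimplements the bit manipulation as I understand it from the SmallFloat source linked at [1] and [2]; the class name SmallFloatSketch is made up for illustration, the handling of the exact lower boundary is an assumption, and the linked source remains the authority.

```java
// Sketch of the single-byte norm encoding discussed in this thread.
// NOT the Lucene class itself: a hypothetical re-implementation for checking values.
final class SmallFloatSketch {

    // Adjustment from the float's zero exponent to the byte format's
    // zero exponent (15), shifted past the 3 low bits of the byte.
    private static final int FZERO = (63 - 15) << 3;

    static byte floatToByte315(float f) {
        int bits = Float.floatToRawIntBits(f);
        int smallfloat = bits >> (24 - 3);   // drop the 21 low bits of the float
        if (smallfloat <= FZERO) {
            // Zero and negative inputs map to 0; tiny positive inputs map to
            // the smallest nonzero byte. (Boundary handling is an assumption.)
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (smallfloat >= FZERO + 0x100) {
            return -1;                       // overflow: largest byte value (0xFF)
        }
        return (byte) (smallfloat - FZERO);
    }

    static float byte315ToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);   // move the byte back into float position
        bits += (63 - 15) << 24;             // restore the exponent offset
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        for (float x : new float[] {0.89f, 0.75f}) {
            byte b = floatToByte315(x);
            System.out.println(x + " -> " + (b & 0xff) + " -> " + byte315ToFloat(b));
        }
    }
}
```

With this sketch, 0.89f goes to byte 123 and back to 0.875f, while 0.75f goes to byte 122 and back to 0.75f. So under these assumptions the two values do not share a byte, which agrees with the corrected numbers above and with the javadoc example being wrong.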