[ https://issues.apache.org/jira/browse/LUCENE-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780897#action_12780897 ]
Lance Norskog edited comment on LUCENE-1360 at 11/21/09 3:03 AM:
-----------------------------------------------------------------

This is a graph of the standard norms against the results of this patch. The orange/red dots at the left are the elevated values for boosting short documents. Both displays show the norms after the 8-bit encode/decode process, rather than the raw 1/x.

Here is the code for the generator:

{code}
import org.apache.lucene.util.SmallFloat;

public class FloatEncode {
  // Patched norms for i = 1..10, indexed by i; index 0 is unused.
  private static final float[] ARR = {
    0.0f, 1.5f, 1.25f, 1.0f, 0.875f, 0.75f, 0.625f, 0.5f, 0.4375f, 0.375f, 0.3125f};

  public static void main(String[] args) {
    for (int i = 1; i < 100; i++) {
      float f = 1.0f / i;
      // Round-trip the raw value through the 8-bit norm encoding.
      byte b = SmallFloat.floatToByte315(f);
      float f2 = SmallFloat.byte315ToFloat(b);
      // Use the table value for small i, otherwise the decoded norm.
      float ff = (i < ARR.length) ? ARR[i] : f2;
      // CSV columns: i, standard norm, patched norm
      System.out.println(i + "," + f2 + "," + ff);
    }
  }
}
{code}

(Please pretend I named the attachment LUCENE-1360 instead of LUCENE-1380.)

> A Similarity class which has unique length norms for numTerms <= 10
> -------------------------------------------------------------------
>
>                 Key: LUCENE-1360
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1360
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Sean Timm
>            Assignee: Otis Gospodnetic
>            Priority: Trivial
>         Attachments: LUCENE-1380 visualization.pdf, ShortFieldNormSimilarity.java
>
>
> A Similarity class which extends DefaultSimilarity and simply overrides
> lengthNorm. lengthNorm is implemented as a lookup for numTerms <= 10, else
> as {{1/sqrt(numTerms)}}. This prevents term counts below 11 from sharing
> the same lengthNorm once it is stored as a single byte in the index.
> This is useful if your search is only on short fields such as titles or
> product descriptions.
> See mailing list discussion:
> http://www.nabble.com/How-to-boost-the-score-higher-in-case-user-query-matches-entire-field-value-than-just-some-words-within-a-field-td19079221.html
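
For readers without the attachment, the lookup-then-fallback lengthNorm described in the issue might look roughly like the sketch below. It is a standalone illustration, not the attached ShortFieldNormSimilarity.java: the class name ShortFieldNormSketch is hypothetical, the table values are copied from the ARR array in the generator on the assumption that they mirror the patch, and returning 0 for numTerms <= 0 is a choice made here rather than something the issue specifies. It deliberately does not extend DefaultSimilarity, so it runs without Lucene on the classpath.

```java
// Standalone sketch; in the real patch this logic would live in an
// override of lengthNorm on a DefaultSimilarity subclass.
class ShortFieldNormSketch {
    // Assumed table: copied from ARR in the generator. Index = numTerms.
    // Each value is distinct, so field lengths 1..10 get distinct norms.
    private static final float[] NORM_TABLE = {
        0.0f, 1.5f, 1.25f, 1.0f, 0.875f, 0.75f, 0.625f, 0.5f, 0.4375f, 0.375f, 0.3125f
    };

    static float lengthNorm(int numTerms) {
        if (numTerms <= 0) {
            // Assumption: treat empty fields as norm 0 instead of dividing by zero.
            return 0.0f;
        }
        if (numTerms < NORM_TABLE.length) {
            // Short field: use the hand-picked value from the table.
            return NORM_TABLE[numTerms];
        }
        // Longer field: fall back to the 1/sqrt(numTerms) formula from the issue.
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        // Print numTerms and its norm as CSV, covering the table and the fallback.
        for (int n = 1; n <= 16; n++) {
            System.out.println(n + "," + lengthNorm(n));
        }
    }
}
```

The table is only consulted for numTerms from 1 to 10, so anything longer behaves exactly like the 1/sqrt(numTerms) fallback before byte encoding.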