document boost

Mike Grafton Wed, 30 Jan 2008 10:47:50 -0800

Hello folks,

We're trying to use Lucene's scoring to do a fairly basic thing: give a
document (in this case, we index "articles") a boost based on an integer
value that we know at index time.  We want the document boost to affect the
final document score linearly.


We thought that assigning a document boost based on this value would do the
trick, but the behavior we're seeing doesn't match what we expect given the
online documentation.  In fact, we see that a linear increase in document
boost yields a much faster than linear increase in the 'fieldNorm' component
of the score for each term of the query that matched the document.  Here's a
small table of values that relate the document boost we pass in to the
fieldNorm contribution returned by Lucene:

boost  fieldNorm
1.0    0.3125    =  (5/16, 2^-1.678)
2.0    20.0      =  (2^4.3219, 2^1 * 10)
3.0    256.0     =  (2^8)
4.0    1280.0    =  (2^10.3219, 2^7 * 10)
5.0    5120.0    =  (2^12.3219, 2^9 * 10)
6.0    16384.0   =  (2^14.0)
7.0    40960.0   =  (2^15.3219, 2^12 * 10)
8.0    81920.0   =  (2^16.3219, 2^13 * 10)
10.0   327680.0  =  (2^18.3219, 2^15 * 10)

This example uses a two-term query against a document that contains those
terms and a few others, all in one searchable field.
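
To make the mismatch concrete, here is a minimal Python sketch that compares
the table above with what a linear boost would give.  It uses only the
numbers from the table; the names OBSERVED, BASE_NORM, implied_exponent and
analyze are made up for illustration and are not part of Lucene or Solr.  For
each row it computes the fieldNorm we would expect if boost scaled the
boost-1.0 value linearly, and the exponent k that would satisfy
fieldNorm = BASE_NORM * boost^k.

```python
import math

# (boost, fieldNorm) pairs copied from the table above.
OBSERVED = [
    (1.0, 0.3125),
    (2.0, 20.0),
    (3.0, 256.0),
    (4.0, 1280.0),
    (5.0, 5120.0),
    (6.0, 16384.0),
    (7.0, 40960.0),
    (8.0, 81920.0),
    (10.0, 327680.0),
]

# fieldNorm reported at boost 1.0, used as the baseline.
BASE_NORM = OBSERVED[0][1]


def implied_exponent(boost, norm, base=BASE_NORM):
    """Solve norm = base * boost**k for k (hypothetical helper)."""
    return math.log(norm / base) / math.log(boost)


def analyze(rows=OBSERVED):
    """Return (boost, observed, linear_expectation, k) for each row."""
    result = []
    for boost, norm in rows:
        linear = boost * BASE_NORM  # what we assumed we would see
        # k is undefined at boost 1.0 because log(1) == 0.
        k = None if boost == 1.0 else implied_exponent(boost, norm)
        result.append((boost, norm, linear, k))
    return result


if __name__ == "__main__":
    print("boost   observed     linear   k")
    for boost, norm, linear, k in analyze():
        k_text = "-" if k is None else f"{k:.3f}"
        print(f"{boost:5.1f} {norm:10.4f} {linear:10.4f}   {k_text}")
```

On these numbers the implied exponent comes out at about 6 for every row
(exactly 6 for boosts 2, 4 and 8), so the values look closer to a sixth power
of the boost than to a linear factor.  The small deviations in the other rows
might be rounding in how the norm is stored, but that is a guess on our part.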

Is this the way document boost is supposed to work?  Or have we
misconfigured something? If we cannot use document boost to affect scoring
linearly, is there some other technique we can use?

By the way, we're using SOLR to access Lucene.  We can give more information
if necessary, such as our SOLR schema.xml, if folks think that would help
explain things.  Let us know what other information we can provide.

Thanks,
Mike
