Yonik Seeley wrote:
On Sun, Oct 11, 2009 at 6:04 PM, Lance Norskog <goks...@gmail.com> wrote:
And the other important thing to know about boost values is that the
dynamic range is about 6-8 bits.

That's an index-time boost - an 8 bit float with 3 bits of mantissa
and 5 bits of exponent.
Query time boosts are normal 32 bit floats.

To be more specific: the index-time float encoding does not permit negative numbers (see SmallFloat), but query-time boosts can be negative, and they DO affect the score, as the BeanShell session below shows. BTW, the standard Collectors collect only results with positive scores, so if you want to collect results with negative scores as well, you need to use a custom Collector.
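
To make the index-time limitation concrete, here is a minimal, self-contained sketch of an 8-bit float with 3 mantissa bits, 5 exponent bits and a zero exponent of 15. It is modeled on my reading of SmallFloat and is not copied from it; the class name BoostByteSketch and the method names encode/decode are made up for illustration, so check the SmallFloat source for the authoritative version.

```java
// Illustrative sketch only: class and method names are hypothetical,
// the bit layout is an assumption modeled on Lucene's SmallFloat.
final class BoostByteSketch {
    private static final int MANTISSA_BITS = 3;
    private static final int ZERO_EXP = 15;
    // Adjustment from the 32-bit float exponent bias to the small-float bias,
    // shifted over to the small-float exponent position.
    private static final int FZERO = (63 - ZERO_EXP) << MANTISSA_BITS;

    // Squeeze a 32-bit float into one byte.
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - MANTISSA_BITS);
        if (small <= FZERO) {
            // Zero and all negative inputs collapse to code 0;
            // positive underflow maps to the smallest non-zero code.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= FZERO + 0x100) {
            return (byte) -1; // overflow saturates at the largest code (0xFF)
        }
        return (byte) (small - FZERO);
    }

    // Expand one byte back into a 32-bit float.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - MANTISSA_BITS);
        bits += (63 - ZERO_EXP) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        float[] samples = {1.0f, (float) (1.0 / Math.sqrt(3)), 10.0f, -1.0f};
        for (float s : samples) {
            System.out.println(s + " -> " + decode(encode(s)));
        }
    }
}
```

With this sketch, 1.0 and 10.0 survive the round trip unchanged, 1/sqrt(3) comes back as 0.5, and -1.0 comes back as 0.0, i.e. the sign is lost. Assuming the default length normalization of 1/sqrt(number of terms), the 0.5 is consistent with the fieldNorm reported for the three-term field in the explain output below.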

-----------------------------------------------
BeanShell 2.0b4 - by Pat Niemeyer (p...@pat.net)
bsh % import org.apache.lucene.search.*;
bsh % import org.apache.lucene.index.*;
bsh % import org.apache.lucene.store.*;
bsh % import org.apache.lucene.document.*;
bsh % import org.apache.lucene.analysis.*;
bsh % tq = new TermQuery(new Term("a", "b"));
bsh % print(tq);
a:b
bsh % tq.setBoost(-1);
bsh % print(tq);
a:b^-1.0
bsh % q = new BooleanQuery();
bsh % tq1 = new TermQuery(new Term("a", "c"));
bsh % tq1.setBoost(10);
bsh % q.add(tq1, BooleanClause.Occur.SHOULD);
bsh % q.add(tq, BooleanClause.Occur.SHOULD);
bsh % print(q);
a:c^10.0 a:b^-1.0
bsh % dir = new RAMDirectory();
bsh % w = new IndexWriter(dir, new WhitespaceAnalyzer());
bsh % doc = new Document();
bsh % doc.add(new Field("a", "b c d", Field.Store.YES, Field.Index.ANALYZED));
bsh % w.addDocument(doc);
bsh % w.close();
bsh % r = IndexReader.open(dir);
bsh % is = new IndexSearcher(r);
bsh % td = is.search(q, 10);
bsh % sd = td.scoreDocs;
bsh % print(sd.length);
1
bsh % print(is.explain(q, 0));
0.1373985 = (MATCH) sum of:
  0.15266499 = (MATCH) weight(a:c^10.0 in 0), product of:
    0.99503726 = queryWeight(a:c^10.0), product of:
      10.0 = boost
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.32427183 = queryNorm
    0.15342641 = (MATCH) fieldWeight(a:c in 0), product of:
      1.0 = tf(termFreq(a:c)=1)
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.5 = fieldNorm(field=a, doc=0)
  -0.0152664995 = (MATCH) weight(a:b^-1.0 in 0), product of:
    -0.099503726 = queryWeight(a:b^-1.0), product of:
      -1.0 = boost
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.32427183 = queryNorm
    0.15342641 = (MATCH) fieldWeight(a:b in 0), product of:
      1.0 = tf(termFreq(a:b)=1)
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.5 = fieldNorm(field=a, doc=0)

bsh %
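
The arithmetic in the explain output above can be re-derived by hand, which shows where the negative boost enters. The sketch below is plain Java with no Lucene dependency. It assumes the default similarity formulas as I understand them (idf = ln(numDocs / (docFreq + 1)) + 1, queryNorm = 1 / sqrt(sum of squared boost * idf)), and it takes tf = 1.0 and fieldNorm = 0.5 straight from the output. The class name ExplainArithmeticSketch and its methods are hypothetical.

```java
// Illustrative sketch only: re-derives the numbers printed by explain().
// Class and method names are hypothetical; formulas are assumed defaults.
final class ExplainArithmeticSketch {

    // Assumed default idf: ln(numDocs / (docFreq + 1)) + 1
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Assumed default queryNorm: 1 / sqrt(sum over clauses of (boost * idf)^2).
    // Squaring removes the sign, so a negative boost still normalizes like a positive one.
    static double queryNorm(double idf, double... boosts) {
        double sumOfSquares = 0.0;
        for (double boost : boosts) {
            sumOfSquares += (boost * idf) * (boost * idf);
        }
        return 1.0 / Math.sqrt(sumOfSquares);
    }

    // One clause: queryWeight * fieldWeight, as laid out in the explain tree.
    static double clauseScore(double boost, double idf, double queryNorm,
                              double tf, double fieldNorm) {
        double queryWeight = boost * idf * queryNorm;
        double fieldWeight = tf * idf * fieldNorm;
        return queryWeight * fieldWeight;
    }

    // Total score for the two SHOULD clauses a:c^10.0 and a:b^-1.0.
    static double totalScore() {
        double idf = idf(1, 1);
        double norm = queryNorm(idf, 10.0, -1.0);
        double tf = 1.0;        // taken from the explain output
        double fieldNorm = 0.5; // taken from the explain output
        return clauseScore(10.0, idf, norm, tf, fieldNorm)
             + clauseScore(-1.0, idf, norm, tf, fieldNorm);
    }

    public static void main(String[] args) {
        System.out.println("total score = " + totalScore());
    }
}
```

The negative clause contributes exactly minus one tenth of the positive clause here, because both terms share the same idf, tf and fieldNorm and differ only in boost. The result agrees with the 0.1373985 above up to float rounding.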


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
