Re: Rich positions (was "boosting fields")

Marvin Humphrey Thu, 27 Apr 2006 13:51:20 -0700

Now that I think about it, putting the score-multiplier into the FreqFile does offer a benefit I hadn't considered before. It makes it possible to tie the score multiplier to a term within a doc, rather than a field within a doc.

Say you have a doc with a "body" field that's 1000 terms long, with 3 instances of "foo", right near the top. Say you have another doc with a "body" that's 1000 terms long, and it also has 3 instances of "foo", but they're buried near the end, and therefore not as important.

Under the current implementation, these two docs will score identically against a query for "body:foo", as the freq is identical and so is the output of lengthNorm(1000). But if you stuff the score multiplier into the FreqFile, a sophisticated indexing app could assign a higher score multiplier to the term "body:foo" in the first doc.
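
To make that concrete, here is a rough sketch in Python rather than Lucene code. It assumes a simplified score of sqrt(freq) * lengthNorm * multiplier, with lengthNorm taken to be 1/sqrt(numTerms). The position_boost() heuristic and all the names are made up for illustration; an indexing app could pick any rule it likes.

```python
import math

def length_norm(num_terms):
    # Assumed formula for illustration: 1 / sqrt(number of terms in the field).
    return 1.0 / math.sqrt(num_terms)

def score(freq, num_terms, term_multiplier=1.0):
    # Simplified per-term score: tf * lengthNorm * per-term multiplier.
    return math.sqrt(freq) * length_norm(num_terms) * term_multiplier

def position_boost(positions, num_terms):
    # Hypothetical heuristic: occurrences nearer the top of the field
    # earn a larger multiplier (between 1.0 and 2.0).
    closeness = [1.0 - p / num_terms for p in positions]
    return 1.0 + sum(closeness) / len(closeness)

NUM_TERMS = 1000
doc_a_positions = [3, 10, 25]       # "foo" right near the top
doc_b_positions = [950, 970, 990]   # "foo" buried near the end

# Without a per-term multiplier the two docs are indistinguishable.
plain_a = score(len(doc_a_positions), NUM_TERMS)
plain_b = score(len(doc_b_positions), NUM_TERMS)

# With a multiplier computed at index time and stored alongside the freq,
# the first doc comes out ahead.
boosted_a = score(len(doc_a_positions), NUM_TERMS,
                  position_boost(doc_a_positions, NUM_TERMS))
boosted_b = score(len(doc_b_positions), NUM_TERMS,
                  position_boost(doc_b_positions, NUM_TERMS))
```

The multiplier is computed once per term per doc at index time, so search-time scoring still does one multiplication per posting.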

Associating a score multiplier with each position (a.k.a. WEIGHT_PER_POSITION, BOOST_PER_POSITION) would achieve the same end, but at the expense of much more processing per document, as the score would have to be built up position by position. OTOH, it would produce more accurate results for queries where only certain positions ought to be considered, such as phrase queries.
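
A rough sketch of the per-position variant, again in Python with made-up names. It assumes each term's postings for a doc are a mapping from position to weight, and that a two-term phrase match contributes the product of the two weights; neither is an existing file format or scoring rule.

```python
def phrase_score(first_term, second_term):
    """Score a two-term phrase within one doc.

    Each argument maps position -> weight for one term (hypothetical layout).
    Only positions where the second term directly follows the first count.
    """
    total = 0.0
    for pos, weight in first_term.items():
        next_weight = second_term.get(pos + 1)
        if next_weight is not None:
            # The score is built up one matching position at a time.
            total += weight * next_weight
    return total

# "foo" and "bar" each occur twice, but the phrase "foo bar" occurs only
# once, near the top, where both positions carry a higher weight.
foo_positions = {2: 2.0, 500: 1.0}
bar_positions = {3: 2.0, 800: 1.0}

result = phrase_score(foo_positions, bar_positions)
```

The weights at positions 500 and 800 never enter the phrase score, which a single per-term multiplier could not express. The cost is the loop: every position has to be visited and weighed, instead of one multiplication per term per doc.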


Marvin Humphrey
Rectangular Research
http://www.rectangular.com/

