Re: boosting based on number of terms matched?

2011-02-25 Thread Chris Hostetter

: I'm using the edismax handler, although my question is probably the same for
: dismax. When the user types a long query, I use the mm parameter so that
: only 75% of terms need to match. This works fine, however, sometimes documents
: that only match 75% of the terms show up higher in my results than documents
: that match 100%. I'd like to set a boost so that documents that match 100%
: will be much more likely to be put ahead of documents that only match 75%. Can
: anyone give me a pointer of how to do this? Thanks,

this is essentially the default behavior -- mm just sets a minimum 
number of clauses to be considered a match, but the coord factor still 
applies and penalizes docs based on how many clauses they don't match.

if you are seeing docs that match fewer terms score higher than docs 
matching more terms, it is likely because of the boosts you already have 
specified (either in the qf, or maybe using the bf), but the discrepancy 
could be based on other standard scoring factors as well (lengthNorm, 
index time doc boosts, the IDF of the terms, etc...)
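
To make the interaction concrete, here is a rough sketch of how coord scales a score. It assumes Lucene's DefaultSimilarity, where coord is simply (clauses matched / total clauses), and it collapses all the other factors (boosts, IDF, lengthNorm) into one made-up "raw" number per doc. The function names and the numbers are illustrative only, not taken from any real explain output.

```python
def coord(overlap, max_overlap):
    """Fraction of query clauses the doc matched (DefaultSimilarity-style)."""
    return overlap / max_overlap


def scaled_score(raw_clause_sum, matched, total):
    """Simplified score: the sum of the matching clause scores times coord.

    raw_clause_sum is a hypothetical stand-in for everything else that
    feeds the score (qf boosts, bf, IDF, lengthNorm, index time boosts).
    """
    return raw_clause_sum * coord(matched, total)


# Hypothetical 4-term query with mm=75%.
# Doc A matches 3 of 4 terms, but in a heavily boosted field.
doc_a = scaled_score(8.0, matched=3, total=4)
# Doc B matches all 4 terms, but with weaker per-clause scores.
doc_b = scaled_score(5.0, matched=4, total=4)

# coord penalizes doc A by 25%, yet it still ends up ahead of doc B.
print(doc_a, doc_b)
```

With numbers like these the coord penalty is real but too small to overcome the difference in the per-clause scores, which is the situation described above.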

this is where it becomes necessary to start looking at score explanations 
and really thinking through the data.
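
One way to get at the score explanations is to add debugQuery=true to the request and read the "explain" section of the debug output for each doc. Below is a minimal sketch that only builds such a request URL. The host, port, handler path, and the choice of fl and wt values are assumptions for illustration; adjust them to your own setup, and add your own qf/bf so the explanation reflects the boosts you actually use.

```python
from urllib.parse import urlencode


def build_debug_url(base_url, user_query):
    """Build a Solr select URL that asks for score explanations."""
    params = {
        "q": user_query,
        "defType": "edismax",   # same parser as in the question
        "mm": "75%",            # minimum-should-match from the question
        "debugQuery": "true",   # adds the per-doc explain output
        "fl": "id,score",       # assumed field list: id plus score
        "wt": "json",           # assumed response format
    }
    return base_url + "?" + urlencode(params)


# Hypothetical local Solr instance and query.
url = build_debug_url("http://localhost:8983/solr/select", "long user query here")
print(url)
```

Comparing the explain output of a doc that matches 75% of the terms against one that matches 100% should show which factor (coord, a field boost, IDF, norms) accounts for the difference.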


-Hoss


boosting based on number of terms matched?

2011-02-24 Thread DarkNovaNick
I'm using the edismax handler, although my question is probably the same  
for dismax. When the user types a long query, I use the mm parameter so  
that only 75% of terms need to match. This works fine, however, sometimes  
documents that only match 75% of the terms show up higher in my results  
than documents that match 100%. I'd like to set a boost so that documents  
that match 100% will be much more likely to be put ahead of documents that  
only match 75%. Can anyone give me a pointer of how to do this? Thanks,


Nick