Re: 'Down' boosting shorter docs
Another approach is to change the document length normalization formula. See Similarity.lengthNorm() in Lucene.

wunder

On Oct 15, 2009, at 12:45 AM, Andrea D'Ippolito wrote:

> I've read (correct me if I'm wrong) that one way to achieve that is to over-boost all the other fields, but I guess this only works easily if you have few fields indexed ;)
>
> bye
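To make the length-normalization suggestion concrete, here is a small sketch in plain Python (not Lucene code). It assumes the stock formula used by Lucene's DefaultSimilarity, lengthNorm = 1/sqrt(numTerms), and compares it with a hypothetical "floored" variant that stops rewarding fields shorter than some minimum length. The name floored_length_norm and the min_terms value of 10 are illustrative assumptions, not part of Lucene.

```python
import math


def default_length_norm(num_terms: int) -> float:
    """1/sqrt(numTerms): shorter fields get a larger norm, so a one-word title scores highest."""
    return 1.0 / math.sqrt(num_terms)


def floored_length_norm(num_terms: int, min_terms: int = 10) -> float:
    """Hypothetical variant: any field shorter than min_terms is treated as min_terms long."""
    return 1.0 / math.sqrt(max(num_terms, min_terms))


if __name__ == "__main__":
    # Compare the two formulas for a few field lengths (in terms).
    for n in (1, 4, 10, 100):
        print(n, round(default_length_norm(n), 3), round(floored_length_norm(n), 3))
```

With the default formula a one-term title gets a norm of 1.0 while a four-term title gets 0.5; with the floored variant both get the same norm, so the one-word title loses its advantage. In Lucene itself this would mean subclassing Similarity and overriding lengthNorm(), and because norms are computed at index time, the documents would need to be reindexed for the change to take effect.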
'Down' boosting shorter docs
Our index has some items in it which basically contain a title and a single-word body. If the user searches for a word in the title (especially if the title is itself only one word), then that doc gets scored quite highly, despite the fact that, in this case, it's not really relevant.

I've tried something like

  qf=title^2.0 content^0.5
  bf=num_pages

but that disproportionately boosts long documents to the detriment of relevancy.

  bf=product(num_pages,0.05)

has no effect, but

  bf=product(num_pages,0.06)

returns a bunch of long documents which don't seem to have any highlighted fields, plus the short document with only the query in the title. That is progress, in that it's almost exactly the opposite of what I want.

Any suggestions? Am I going to need to reindex and add the length in bytes or characters of the document?

Simon
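The sharp change between 0.05 and 0.06 is what one would expect from an additive boost: bf adds the function value to the relevance score, so the ranking flips as soon as the added amount exceeds the relevance gap between two documents. The sketch below illustrates this with invented numbers; the relevance scores and page counts are assumptions chosen for illustration, and real Solr scores also pass through query normalization, so the actual threshold will differ.

```python
def additive_score(relevance: float, num_pages: int, weight: float) -> float:
    """Models bf=product(num_pages,weight): the boost is added to the relevance score."""
    return relevance + weight * num_pages


# Hypothetical documents: a short title-only match and a long, weak match.
short_doc = {"relevance": 3.0, "num_pages": 1}
long_doc = {"relevance": 0.5, "num_pages": 45}


def ranks_long_first(weight: float) -> bool:
    """True if the long document outscores the short one at this boost weight."""
    return additive_score(**long_doc, weight=weight) > additive_score(**short_doc, weight=weight)


if __name__ == "__main__":
    for w in (0.05, 0.06):
        print(w, ranks_long_first(w))
```

With these made-up values the short document still wins at a weight of 0.05 and loses at 0.06, which mirrors the all-or-nothing behaviour described above: below the threshold the boost is invisible, and above it page count dominates relevance.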
Re: 'Down' boosting shorter docs
A multiplicative boost may work better than one added in:

http://lucene.apache.org/solr/api/org/apache/solr/search/BoostQParserPlugin.html

-Yonik
http://www.lucidimagination.com
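A rough way to see why multiplying may behave better: a multiplied boost scales the relevance score, so a document that barely matches stays near the bottom however long it is, whereas an added boost can lift it to the top. The sketch below compares the two on three hypothetical documents. All scores and page counts are invented, and log10(num_pages + 1) is just one plausible boost function, not something prescribed by Solr.

```python
import math

# Hypothetical documents: (name, relevance score, number of pages).
DOCS = [
    ("short_title_only", 3.0, 1),
    ("long_relevant", 2.0, 45),
    ("long_irrelevant", 0.1, 200),
]


def additive(relevance: float, num_pages: int, weight: float = 0.06) -> float:
    """Boost added to the score, as with dismax bf."""
    return relevance + weight * num_pages


def multiplicative(relevance: float, num_pages: int) -> float:
    """Boost multiplied into the score; the log damps very long documents."""
    return relevance * math.log10(num_pages + 1)


def rank(score_fn) -> list:
    """Return document names ordered from highest to lowest score."""
    return [name for name, rel, pages in
            sorted(DOCS, key=lambda d: score_fn(d[1], d[2]), reverse=True)]


if __name__ == "__main__":
    print("additive:      ", rank(additive))
    print("multiplicative:", rank(multiplicative))
```

With these numbers the additive boost puts long_irrelevant first, while the multiplicative boost ranks long_relevant first, then short_title_only, then long_irrelevant. In Solr the boost parser is invoked with local-params syntax, something along the lines of q={!boost b=log(num_pages)}foo; that exact query is untested against this schema and should be treated as a starting point.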