I've seen references to score filtering in the list archives with frange
being the suggested solution, but I have a slightly different problem that I
don't think frange will solve. I basically want to drop a portion of the
results based on their score in relation to the other scores in the result
set. I've found that some queries produce poor results because they are
matching solely based on a field with a very low boost (a product
description in my case). Looking at the scores it's very obvious when the
result set transitions from good matches to just those pulled in by the
description.

I've come up with a solution on the client side, but I need to move it
into Solr because it doesn't play well with facets (facet data is still
returned for products that I'm stripping out). The basic approach is to keep
a running average of the highest scores and, when a document's score falls
an order of magnitude below that average, drop that document and everything
after it (assuming results are sorted by score desc). This approach seems to work
well because in some cases when users just enter 'long tail' terms I want
results to still be returned, which a static lower bound in frange won't
accommodate.
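
To make the cutoff concrete, here is a minimal sketch of the idea in Python.
It is not my production code: the function name `cutoff_index`, the `ratio`
parameter, and the sample scores are made up for illustration, and it reads
"off by an order of magnitude" as "less than one tenth of the running
average".

```python
from typing import Sequence


def cutoff_index(scores: Sequence[float], ratio: float = 0.1) -> int:
    """Return how many leading results to keep.

    Assumes `scores` is sorted in descending order. `ratio` is an
    illustrative threshold: a score below `ratio` times the running
    average of the scores before it marks the cutoff.
    """
    total = 0.0
    for i, score in enumerate(scores):
        if i > 0:
            running_avg = total / i
            if score < running_avg * ratio:
                # This document and everything after it is dropped.
                return i
        total += score
    return len(scores)


# Hypothetical scores: four strong matches, then description-only matches.
scores = [12.4, 11.9, 10.7, 9.8, 0.6, 0.5, 0.4]
kept = scores[:cutoff_index(scores)]
print(kept)
```

Because the threshold is relative to the scores in the result set, a query
that only produces low scores (say 0.6, 0.5, 0.4) keeps all of its results,
which is the behavior I want for long tail terms.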

Does anyone have any suggestions for an approach to this? It doesn't look
like a filter has access to the scores. It doesn't look like I can subclass
SolrIndexSearcher as a number of its methods are private and can't be
overridden. It doesn't look like I can modify the ResponseBuilder's results
docset after the query but before faceting is applied because I don't have
access to the scorer (at least in a SearchComponent). I'm out of ideas for
now.

Thanks for any assistance,
      Bryan
