On Fri, Nov 23, 2012 at 8:00 AM, Mikhail Khludnev <mkhlud...@griddynamics.com> wrote:

> Robert, am I right that establishing the perf test is the first necessary
> step, rather than the implementation itself?

Right, the best way to do this is to extend luceneutil (http://code.google.com/a/apache-extras.org/p/luceneutil) to test this case.

Keep in mind that I'd also be interested to see how BooleanScorer compares to BooleanScorer2 for this situation. I already mentioned on the solr list (nobody replied) that Solr *never* gets BooleanScorer, but from time to time I hear Solr users complaining about BooleanScorer2's performance for min-should-match.

So when trying to improve the performance of min-should-match, I think a very early step should be to see if we already have a better-performing alternative that is just not being used: if that's the case, then the best solution is to fix Solr's collectors to be able to cope with BooleanScorer.

Intuitively I think it's going to be like everything else: BS1 is better in some situations, BS2 in others.

> Also (not really important, but let me mention it), what I'm really looking
> for is the disjunction query with a user-supplied verification strategy,
> where minShouldMatch is just one of the ways to verify a match.

I don't think our concrete scorers should have such a hook: they should be as dead simple as possible. If you want to do this, I recommend just extending the abstract DisjunctionScorer. (Currently DisjunctionSum and DisjunctionMax extend this. As I suggested, we should think about splitting out a MinShouldMatchScorer as well: it's confusing that pure disjunctions are all mixed up with min-should-match, and the algorithms should actually work differently.)
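
To make the "extend the abstract disjunction" suggestion concrete, here is a rough, self-contained sketch. It is a toy model over sorted doc-id arrays, not Lucene's actual DisjunctionScorer API: the names ToyDisjunction, ToyMinShouldMatch, accept() and matches() are all made up for illustration. It only shows where a custom verification strategy could live in a subclass; it reuses one naive heap merge for every strategy, so it says nothing about the specialized algorithm a real MinShouldMatchScorer would want.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Toy stand-in for an abstract disjunction (hypothetical, not a Lucene class). */
abstract class ToyDisjunction {
    private final int[][] postings; // one sorted doc-id array per optional clause

    ToyDisjunction(int[][] postings) {
        this.postings = postings;
    }

    /** Verification strategy: is a doc matched by matchCount clauses a hit? */
    protected abstract boolean accept(int doc, int matchCount);

    /** Merges all clauses in doc-id order and returns the accepted docs. */
    List<Integer> matches() {
        // Heap entries are {docId, clauseIndex, positionInClause}, ordered by docId.
        PriorityQueue<int[]> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int i = 0; i < postings.length; i++) {
            if (postings[i].length > 0) {
                heap.add(new int[] {postings[i][0], i, 0});
            }
        }
        List<Integer> hits = new ArrayList<>();
        while (!heap.isEmpty()) {
            int doc = heap.peek()[0];
            int count = 0;
            // Pop every clause positioned on this doc, then advance each one.
            while (!heap.isEmpty() && heap.peek()[0] == doc) {
                int[] top = heap.poll();
                count++;
                int next = top[2] + 1;
                if (next < postings[top[1]].length) {
                    heap.add(new int[] {postings[top[1]][next], top[1], next});
                }
            }
            if (accept(doc, count)) {
                hits.add(doc);
            }
        }
        return hits;
    }
}

/** One possible strategy: at least minShouldMatch clauses must match. */
final class ToyMinShouldMatch extends ToyDisjunction {
    private final int minShouldMatch;

    ToyMinShouldMatch(int[][] postings, int minShouldMatch) {
        super(postings);
        this.minShouldMatch = minShouldMatch;
    }

    @Override
    protected boolean accept(int doc, int matchCount) {
        return matchCount >= minShouldMatch;
    }
}
```

With clauses {1,3,5}, {3,5,7} and {5,9}, new ToyMinShouldMatch(postings, 2).matches() yields docs 3 and 5. Any other verification rule would be another small subclass overriding accept(), which keeps the concrete scorers free of hooks.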