[jira] [Commented] (LUCENE-7957) ConjunctionScorer.getChildren does not return all children

Adrien Grand (JIRA) Tue, 05 Sep 2017 08:39:48 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16153853#comment-16153853
 ]


Adrien Grand commented on LUCENE-7957:
--------------------------------------

Thanks for the details.

bq. I think I hope we don't remove this API.

I see how it could be useful, like in your case. But at the same time I see 
multiple issues with this API:
 - With some scorers, being able to track the matching sub scorers would be 
additional overhead, eg. BooleanScorer today can't tell you which clauses 
matched a given document.
 - How do you check "did clause X or Y match?" Do you have to iterate over all 
scorers and see whether the one you are interested in is there?
 - Is it ok to perform heavy rewrites that make the scorer tree look very 
different from the query tree, or even make the clause you are interested in 
impossible to find? eg. by inlining nested disjunctions/conjunctions, rewriting 
a TermInSetQuery to a BooleanQuery, splitting a bbox query that crosses the 
dateline into 2 non-crossing bbox queries, rewriting a TermQuery to a 
MatchAllDocsQuery because scores are not needed and docFreq == maxDoc, etc.
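
To make the second point concrete, here is a minimal sketch of the lookup a 
caller ends up writing. It does not use Lucene's actual API: `Node`, `label` 
and `contains` are hypothetical names, and the tree is assumed to hold only the 
sub scorers that matched the current document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a scorer that exposes its sub scorers.
final class Node {
    final String label;          // identifies the clause this node was built from
    final List<Node> children;   // sub scorers, empty for leaves

    Node(String label, Node... children) {
        this.label = label;
        this.children = new ArrayList<>(Arrays.asList(children));
    }

    // Depth-first search over the whole tree. The cost grows with the number
    // of scorers, and the search comes back empty if a rewrite replaced or
    // removed the clause the caller is looking for.
    static boolean contains(Node root, String wanted) {
        if (root.label.equals(wanted)) {
            return true;
        }
        for (Node child : root.children) {
            if (contains(child, wanted)) {
                return true;
            }
        }
        return false;
    }
}
```

The lookup has to run for every collected document, and its answer depends on 
the shape of the scorer tree rather than on the query the user wrote.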

I tried to think about how to address these issues but I don't have a good 
solution, especially about the last point: I think it would be a pity that an 
improvement to query execution be seen as a regression because it makes it 
harder to identify a matching clause.

I would rather see this use-case addressed by consuming queries twice: once 
by Lucene so that it can build an efficient iterator, and once from a 
FilterScorer so that the score can be customized depending on whether a 
particular query matches.
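
A rough sketch of that idea, under stated assumptions: `DocIds` is a 
hypothetical sorted doc-ID iterator loosely modelled on a doc-ID set iterator, 
and `ClauseAwareScoring`, `BOOST` and `score` are made-up names. It is not 
Lucene code and not a FilterScorer implementation; it only shows the main 
iterator driving the match while a second, independent iterator over the clause 
of interest is advanced alongside it.

```java
import java.util.Arrays;

// Hypothetical sorted doc-ID iterator (not Lucene's API).
final class DocIds {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    private final int[] docs;   // sorted doc IDs
    private int pos = -1;       // -1 means "not positioned yet"

    DocIds(int... docs) {
        this.docs = docs.clone();
        Arrays.sort(this.docs);
    }

    int docID() {
        if (pos < 0) return -1;
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    // Moves to the first doc ID >= target and returns it.
    int advance(int target) {
        if (pos < 0) pos = 0;
        while (pos < docs.length && docs[pos] < target) pos++;
        return docID();
    }

    int nextDoc() {
        return advance(docID() + 1);
    }
}

final class ClauseAwareScoring {
    // Hypothetical boost applied when the clause of interest matches.
    static final float BOOST = 2.0f;

    // `main` iterates over matches of the whole query, however it was rewritten.
    // `clause` is a separate iterator over the clause of interest.
    static float[] score(DocIds main, DocIds clause, float baseScore, int maxDoc) {
        float[] scores = new float[maxDoc];
        for (int doc = main.nextDoc(); doc != DocIds.NO_MORE_DOCS; doc = main.nextDoc()) {
            if (clause.docID() < doc) {
                clause.advance(doc);   // catch up with the main iterator
            }
            boolean clauseMatches = clause.docID() == doc;
            scores[doc] = clauseMatches ? baseScore * BOOST : baseScore;
        }
        return scores;
    }
}
```

Because the second iterator is built directly from the clause, the check keeps 
working regardless of how the main query is rewritten or executed.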

I haven't removed it because there was disagreement every time I suggested that 
it should be removed, but I don't see any way that we could realistically 
support it as-is. To me it's also interesting that this is the first time this 
bug has been reported, even though it has existed for almost two entire major 
versions (since 5.1) and affects one of our main scorers.

However we still expose this API, so +1 to fixing.

> ConjunctionScorer.getChildren does not return all children
> ----------------------------------------------------------
>
>                 Key: LUCENE-7957
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7957
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: master (8.0), 7.1
>
>
> Today it returns all scoring children and misses the `FILTER` clauses; I 
> think we just need to save the incoming `required` parameter to the ctor and 
> iterate over those in `getChildren` since `scorers` is a subset of `required`?
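
The fix suggested in the description can be sketched as follows. This is a 
simplified model, not the actual ConjunctionScorer code: `SubScorer`, 
`ConjunctionSketch` and `getScoringChildren` are hypothetical names, and only 
the `required` and `scorers` parameters come from the description.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a sub scorer; only its identity matters here.
final class SubScorer {
    final String name;

    SubScorer(String name) {
        this.name = name;
    }
}

final class ConjunctionSketch {
    private final List<SubScorer> required; // every required clause, FILTER included
    private final List<SubScorer> scorers;  // subset of `required` that contributes to the score

    ConjunctionSketch(List<SubScorer> required, List<SubScorer> scorers) {
        if (!required.containsAll(scorers)) {
            throw new IllegalArgumentException("scorers must be a subset of required");
        }
        // Keep the full `required` list instead of discarding it after construction.
        this.required = new ArrayList<>(required);
        this.scorers = new ArrayList<>(scorers);
    }

    // Reports all required clauses, so FILTER clauses are no longer missed.
    List<SubScorer> getChildren() {
        return Collections.unmodifiableList(required);
    }

    // What the reported behavior amounts to: scoring clauses only.
    List<SubScorer> getScoringChildren() {
        return Collections.unmodifiableList(scorers);
    }
}
```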



