[
https://issues.apache.org/jira/browse/LUCENE-8204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558370#comment-16558370
]
Jim Ferenczi commented on LUCENE-8204:
--------------------------------------
{quote}
Could we somehow merge optIsRequiredBlock and optIsRequiredSegment to have
fewer variables to take care of? For instance could we somehow set
upTo=NO_MORE_DOCS so that optIsRequiredBlock=true's effect lasts until the end of
the segment instead of optIsRequiredSegment?
{quote}
I did that in my first attempt, but the benchmark showed no improvement for
the HighHigh case. The current patch can skip blocks even when the disjunction
is required on the entire segment, so setting upTo to NO_MORE_DOCS would disable
this optimization.
{quote}
advanceTarget does target = reqApproximation.advance(upTo + 1) and then
moveToNextBlock(target). Should we just do target = upTo+1 to avoid reading
postings? There might not be any matches in the next block and calling
advance() forces the postings reader to decompress the block, while I would
expect advanceTarget() to only advance the target based on impacts?
{quote}
I didn't know what to do here, so I chose to use advance, but I agree that
advanceTarget should only use impacts. I tested this change and it improves the
benchmark by a nice margin (nice call ;) ):
{noformat}
Task      QPS lucene_baseline  StdDev  QPS lucene_candidate  StdDev  Pct diff
HighMed                 48.81  (0.0%)                 52.29  (0.0%)    7.1% (   7% -    7%)
HighHigh                14.47  (0.0%)                 23.82  (0.0%)   64.6% (  64% -   64%)
HighLow                132.44  (0.0%)                312.50  (0.0%)  135.9% ( 135% -  135%)
{noformat}
I'll modify the patch with this change.
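
To illustrate why the impacts-only variant is cheaper, here is a toy sketch. It is not the patch itself: the class BlockSkipSketch, its fields and both method bodies are hypothetical, and it assumes each block is described only by its last doc ID and a max score. The point it shows is that moving the target to upTo + 1 never touches postings, while going through advance() decodes every block that gets skipped.

```java
// Toy model, not Lucene code: all names here are hypothetical.
final class BlockSkipSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    final int[] blockUpTo;       // last doc ID of each block, ascending
    final float[] blockMaxScore; // upper bound of the score inside each block
    int decodedBlocks = 0;       // how often postings were "decompressed"

    BlockSkipSketch(int[] blockUpTo, float[] blockMaxScore) {
        this.blockUpTo = blockUpTo;
        this.blockMaxScore = blockMaxScore;
    }

    // Index of the block containing target, or -1 if target is past the last block.
    int blockOf(int target) {
        for (int i = 0; i < blockUpTo.length; i++) {
            if (target <= blockUpTo[i]) {
                return i;
            }
        }
        return -1;
    }

    // Stand-in for advance(): positioning on a doc requires decoding its block.
    int advance(int target) {
        if (blockOf(target) < 0) {
            return NO_MORE_DOCS;
        }
        decodedBlocks++;
        return target; // toy assumption: every doc ID in a block matches
    }

    // Impacts-only variant: hop over non-competitive blocks using metadata alone.
    int advanceTarget(int target, float minCompetitiveScore) {
        int b = blockOf(target);
        while (b >= 0 && blockMaxScore[b] < minCompetitiveScore) {
            if (blockUpTo[b] == NO_MORE_DOCS) {
                return NO_MORE_DOCS; // avoid overflow on upTo + 1
            }
            target = blockUpTo[b] + 1;
            b = blockOf(target);
        }
        return b < 0 ? NO_MORE_DOCS : target;
    }

    // Variant that goes through advance(upTo + 1) and so decodes as it skips.
    int advanceTargetDecoding(int target, float minCompetitiveScore) {
        int b = blockOf(target);
        while (b >= 0 && blockMaxScore[b] < minCompetitiveScore) {
            if (blockUpTo[b] == NO_MORE_DOCS) {
                return NO_MORE_DOCS;
            }
            target = advance(blockUpTo[b] + 1);
            b = blockOf(target);
        }
        return b < 0 ? NO_MORE_DOCS : target;
    }

    public static void main(String[] args) {
        BlockSkipSketch s = new BlockSkipSketch(
            new int[] {99, 199, 299}, new float[] {1.0f, 0.5f, 3.0f});
        System.out.println("impacts-only target: " + s.advanceTarget(0, 2.0f)
            + ", decoded blocks: " + s.decodedBlocks);
        System.out.println("decoding target: " + s.advanceTargetDecoding(0, 2.0f)
            + ", decoded blocks: " + s.decodedBlocks);
    }
}
```

Both variants land on the same target; they differ only in how many blocks are decoded along the way, which is consistent with the larger gains on HighLow and HighHigh above.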
{quote}
advanceShallow should check that optScorer.docID() is less than or equal to
target before calling advanceShallow on it?
{quote}
I didn't touch this part but I agree that it looks buggy. I'll add some tests
to stress the case where this scorer is shallow advanced (inside an inner
clause).
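
For reference, a minimal sketch of the guard being discussed. This is an illustration under assumptions, not the final patch: ToyScorer is a hypothetical stand-in with fixed-size blocks that rejects shallow advances behind its current doc, and the else branch (capping upTo just before the optional scorer's current doc) is one possible way to handle the case, chosen here because the optional clause cannot match before that doc.

```java
// Toy model, not Lucene code: all names here are hypothetical.
final class ShallowAdvanceSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Minimal scorer with fixed-size blocks that refuses to shallow-advance backwards. */
    static final class ToyScorer {
        final int doc;
        final int blockSize;

        ToyScorer(int doc, int blockSize) {
            this.doc = doc;
            this.blockSize = blockSize;
        }

        int docID() {
            return doc;
        }

        // Returns the last doc ID of the block that contains target.
        int advanceShallow(int target) {
            if (target < doc) {
                throw new IllegalArgumentException(
                    "target " + target + " is behind current doc " + doc);
            }
            return (target / blockSize) * blockSize + blockSize - 1;
        }
    }

    // Guarded shallow advance over a required and an optional scorer.
    static int advanceShallow(ToyScorer req, ToyScorer opt, int target) {
        int upTo = req.advanceShallow(target);
        if (opt.docID() <= target) {
            // Safe: the optional scorer is not ahead of the target.
            upTo = Math.min(upTo, opt.advanceShallow(target));
        } else if (opt.docID() != NO_MORE_DOCS) {
            // Assumption: the optional scorer is already past target, so it
            // contributes nothing before its current doc; end the block there.
            upTo = Math.min(upTo, opt.docID() - 1);
        }
        return upTo;
    }

    public static void main(String[] args) {
        ToyScorer req = new ToyScorer(10, 128);
        System.out.println(advanceShallow(req, new ToyScorer(50, 64), 10));
        System.out.println(advanceShallow(req, new ToyScorer(5, 64), 10));
    }
}
```

Without the docID() check, the first call in main would pass a target of 10 to a scorer already positioned on doc 50, which is the situation a test with this scorer inside an inner clause would exercise.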
> ReqOptSumScorer should leverage sub scorers' per-block max scores
> -----------------------------------------------------------------
>
> Key: LUCENE-8204
> URL: https://issues.apache.org/jira/browse/LUCENE-8204
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-8204.patch
>
>
> Currently it only looks at max scores on the entire segment. Given that
> per-block max scores usually give lower upper bounds of the score, this
> should help.
> This is especially important for LUCENE-8197 to work well since the main
> query would typically be added as a MUST clause of a boolean query while the
> query that scores on features would be a SHOULD clause.