[
https://issues.apache.org/jira/browse/LUCENE-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886944#comment-16886944
]
Adrien Grand commented on LUCENE-8922:
--------------------------------------
Here is a patch. It leads iteration of impacts with the first clause whose
score is greater than or equal to the minimum competitive score, and it
propagates the minimum competitive score to sub clauses as-is when the tie
break multiplier is 0.
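
To illustrate why a tie break multiplier of 0 allows this propagation, here is
a minimal sketch in plain Java. It is not the patch and uses no Lucene classes;
DisMaxSketch, score and allClausesBelow are hypothetical names. It assumes the
usual DisjunctionMaxQuery formula: the maximum sub-clause score plus the tie
break multiplier times the sum of the other sub-clause scores.

```java
/** Toy model of DisjunctionMaxQuery scoring; hypothetical names, not Lucene code. */
final class DisMaxSketch {

  /** Max sub-clause score plus tieBreak times the sum of the other sub-clause scores. */
  static double score(double[] subScores, double tieBreak) {
    double max = 0;
    double sum = 0;
    for (double s : subScores) {
      max = Math.max(max, s);
      sum += s;
    }
    return max + tieBreak * (sum - max);
  }

  /** True if every clause, looking only at its own score, is below the threshold. */
  static boolean allClausesBelow(double[] subScores, double minCompetitive) {
    for (double s : subScores) {
      if (s >= minCompetitive) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    double[] doc = {0.9, 0.8};   // per-clause scores of one document
    double minCompetitive = 0.95;

    // Every clause is individually below the threshold.
    System.out.println("all clauses below: " + allClausesBelow(doc, minCompetitive));
    // Tie break 0: the document score is just the max, so it is not competitive.
    System.out.println("competitive at 0:   " + (score(doc, 0.0) >= minCompetitive));
    // Tie break 0.1: the other clause contributes and can lift the document over.
    System.out.println("competitive at 0.1: " + (score(doc, 0.1) >= minCompetitive));
  }
}
```

With a multiplier of 0 the document score equals the best clause score, so a
document that no clause scores at or above the minimum competitive score cannot
be competitive, and each clause may skip it on its own. With 0.1 the same
document can still reach the threshold through the other clauses, so the
threshold cannot be passed down unchanged.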
I ran wikibigall with the wikinightly tasks where I added 4 new tasks:
- DisMaxHighMed: same as OrHighMed but with a DisjunctionMaxQuery and a
  tie break multiplier of 0.1
- DisMaxHighHigh: same as OrHighHigh but with a DisjunctionMaxQuery and a
  tie break multiplier of 0.1
- DisMax0HighMed: same as OrHighMed but with a DisjunctionMaxQuery and a
  tie break multiplier of 0
- DisMax0HighHigh: same as OrHighHigh but with a DisjunctionMaxQuery and a
  tie break multiplier of 0
{noformat}
Task              QPS baseline   StdDev  QPS patch   StdDev Pct diff
Fuzzy1                  177.71  (11.7%)     174.01  (11.2%)    -2.1%  ( -22% - 23%)
SloppyPhrase              6.26   (6.1%)       6.23   (6.2%)    -0.4%  ( -12% - 12%)
SpanNear                  2.32   (3.0%)       2.32   (3.4%)    -0.0%  ( -6% - 6%)
IntervalsOrdered          0.85   (1.7%)       0.85   (1.8%)     0.0%  ( -3% - 3%)
Prefix3                  47.79  (12.6%)      47.85  (12.7%)     0.1%  ( -22% - 29%)
OrHighHigh                9.87   (2.8%)       9.89   (2.8%)     0.2%  ( -5% - 5%)
Phrase                   70.88   (3.2%)      71.04   (3.1%)     0.2%  ( -5% - 6%)
Wildcard                128.13   (8.6%)     128.43   (9.0%)     0.2%  ( -16% - 19%)
AndHighMed               65.61   (3.5%)      65.85   (2.9%)     0.4%  ( -5% - 6%)
AndHighHigh              36.41   (3.4%)      36.60   (3.1%)     0.5%  ( -5% - 7%)
AndHighOrMedMed          25.99   (2.0%)      26.13   (1.8%)     0.5%  ( -3% - 4%)
OrHighMed                36.42   (2.7%)      36.61   (2.6%)     0.5%  ( -4% - 5%)
Fuzzy2                   92.96  (16.1%)      93.59  (13.7%)     0.7%  ( -25% - 36%)
IntNRQ                  132.08  (37.3%)     133.02  (38.0%)     0.7%  ( -54% - 121%)
AndMedOrHighHigh         26.80   (2.0%)      27.07   (2.1%)     1.0%  ( -3% - 5%)
Term                   1308.93   (3.6%)    1331.58   (3.7%)     1.7%  ( -5% - 9%)
DisMaxHighMed            83.40   (3.1%)     111.26   (3.0%)    33.4%  ( 26% - 40%)
DisMaxHighHigh           54.28   (4.8%)      81.35   (4.1%)    49.9%  ( 39% - 61%)
DisMax0HighHigh          45.39   (5.7%)     217.70  (20.1%)   379.6%  ( 334% - 430%)
DisMax0HighMed          129.09   (3.9%)     905.16  (16.5%)   601.2%  ( 558% - 646%)
{noformat}
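
As a sanity check on the Pct diff column, the sketch below recomputes it for
the four new tasks as the relative change in mean QPS. That reading of the
column is an assumption, and the range in parentheses is not reproduced.
PctDiffSketch and pctDiff are hypothetical names.

```java
/** Recomputes the "Pct diff" column as a plain relative change (an assumption). */
final class PctDiffSketch {

  /** Relative change of patch QPS over baseline QPS, in percent. */
  static double pctDiff(double baselineQps, double patchQps) {
    return 100.0 * (patchQps - baselineQps) / baselineQps;
  }

  public static void main(String[] args) {
    // QPS values copied from the table above: {baseline, patch}.
    String[] tasks = {"DisMaxHighMed", "DisMaxHighHigh", "DisMax0HighHigh", "DisMax0HighMed"};
    double[][] qps = {{83.40, 111.26}, {54.28, 81.35}, {45.39, 217.70}, {129.09, 905.16}};
    for (int i = 0; i < tasks.length; i++) {
      System.out.printf("%-16s %.1f%%%n", tasks[i], pctDiff(qps[i][0], qps[i][1]));
    }
  }
}
```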
> Speed up retrieval of top hits of DisjunctionMaxQuery
> -----------------------------------------------------
>
> Key: LUCENE-8922
> URL: https://issues.apache.org/jira/browse/LUCENE-8922
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> There is a simple optimization that we are not doing in the case that
> tieBreakMultiplier is 0: we could propagate the min competitive score to sub
> clauses as-is.
> Even in the general case, we currently compute the block boundary of the
> DisjunctionMaxQuery as the minimum of the block boundaries of its sub
> clauses. This generates blocks that have very low score upper bounds but
> unfortunately they are also very small, which means that we might sometimes
> not make progress quickly enough.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)