[ 
https://issues.apache.org/jira/browse/LUCENE-6201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-6201:
---------------------------------
    Attachment: LUCENE-6201.patch

New patch: BooleanScorer can now also handle minShouldMatch. It scores every
window of 2048 documents in which at least minShouldMatch clauses have a match.
However, nothing guarantees that those matches fall on the same document, so
BooleanScorer is only used for minShouldMatch > 1 when matches are likely to be
dense.
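
To make the windowing concrete, here is a minimal sketch of the idea. It is not the code from the attached patch. The class name, the method signature, and the representation of each clause as a sorted int[] of doc IDs are assumptions made for illustration; the real BooleanScorer works on scorers and buckets and also accumulates scores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; names and input shape are hypothetical. */
final class WindowedMinShouldMatch {
    // Window size mentioned in the comment above.
    static final int WINDOW_SIZE = 2048;

    /**
     * @param clauses one sorted array of matching doc IDs (all < maxDoc) per optional clause
     * @return doc IDs matched by at least max(1, minShouldMatch) clauses, in increasing order
     */
    static List<Integer> match(int[][] clauses, int minShouldMatch, int maxDoc) {
        List<Integer> hits = new ArrayList<>();
        int[] pos = new int[clauses.length];   // cursor into each clause's doc IDs
        int[] counts = new int[WINDOW_SIZE];   // per-document match counts for one window
        int required = Math.max(1, minShouldMatch);

        for (int base = 0; base < maxDoc; base += WINDOW_SIZE) {
            int end = Math.min(base + WINDOW_SIZE, maxDoc);

            // Coarse check: how many clauses have any match inside this window?
            int clausesInWindow = 0;
            for (int c = 0; c < clauses.length; c++) {
                if (pos[c] < clauses[c].length && clauses[c][pos[c]] < end) {
                    clausesInWindow++;
                }
            }

            if (clausesInWindow < minShouldMatch) {
                // The window cannot contain a hit: move the cursors past it and skip it.
                for (int c = 0; c < clauses.length; c++) {
                    while (pos[c] < clauses[c].length && clauses[c][pos[c]] < end) {
                        pos[c]++;
                    }
                }
                continue;
            }

            // The coarse check passed, but the matches may sit on different documents,
            // so count per document and keep only the documents that reach the threshold.
            Arrays.fill(counts, 0);
            for (int c = 0; c < clauses.length; c++) {
                while (pos[c] < clauses[c].length && clauses[c][pos[c]] < end) {
                    counts[clauses[c][pos[c]] - base]++;
                    pos[c]++;
                }
            }
            for (int i = 0; i < end - base; i++) {
                if (counts[i] >= required) {
                    hits.add(base + i);
                }
            }
        }
        return hits;
    }
}
```

With sparse matches, many windows pass the coarse check yet yield no hit, so the per-window counting is wasted work. That is the reason for restricting this path to cases where matches are likely dense.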

Here are results from the luceneutil benchmark:

{noformat}
                    Task  QPS baseline      StdDev   QPS patch      StdDev                Pct diff
     Low4MinShouldMatch4       1349.50      (6.4%)     1064.22      (3.7%)  -21.1% ( -29% -  -11%)
     Low3MinShouldMatch4       1225.14     (11.9%)      977.61      (4.6%)  -20.2% ( -32% -   -4%)
     Low4MinShouldMatch3       1040.26      (5.4%)      859.33      (3.0%)  -17.4% ( -24% -   -9%)
     Low4MinShouldMatch2        316.21      (4.6%)      281.75      (2.5%)  -10.9% ( -17% -   -3%)
     Low2MinShouldMatch4        349.07      (7.8%)      316.85      (4.8%)   -9.2% ( -20% -    3%)
     Low3MinShouldMatch3        308.45      (5.4%)      280.00      (2.2%)   -9.2% ( -15% -   -1%)
     Low4MinShouldMatch0         72.57      (2.9%)       74.43     (11.3%)    2.6% ( -11% -   17%)
     Low2MinShouldMatch3         38.11     (10.5%)       39.30     (12.6%)    3.1% ( -18% -   29%)
     Low3MinShouldMatch0         47.95      (2.4%)       49.45     (12.5%)    3.1% ( -11% -   18%)
     Low1MinShouldMatch4         39.78      (9.7%)       41.05      (2.5%)    3.2% (  -8% -   16%)
                PKLookup        316.64      (2.5%)      327.40      (3.6%)    3.4% (  -2% -    9%)
     Low1MinShouldMatch0         30.13      (1.6%)       31.15     (12.8%)    3.4% ( -10% -   18%)
     Low2MinShouldMatch0         35.75      (1.8%)       37.01     (12.6%)    3.5% ( -10% -   18%)
     HighMinShouldMatch0         25.94      (1.4%)       26.90     (13.0%)    3.7% ( -10% -   18%)
     Low3MinShouldMatch2         39.56     (10.3%)       47.62     (13.0%)   20.4% (  -2% -   48%)
     HighMinShouldMatch4         22.28     (10.0%)       27.59     (15.5%)   23.8% (  -1% -   54%)
     Low1MinShouldMatch3         22.25     (10.5%)       31.02     (16.4%)   39.4% (  11% -   74%)
     Low2MinShouldMatch2         23.24     (10.3%)       35.63     (17.5%)   53.3% (  23% -   90%)
     HighMinShouldMatch3         16.31     (10.0%)       26.31     (19.6%)   61.3% (  28% -  101%)
     Low1MinShouldMatch2         17.24      (9.7%)       30.30     (21.2%)   75.8% (  40% -  118%)
     HighMinShouldMatch2         13.98      (9.0%)       26.28     (23.2%)   88.0% (  51% -  132%)
{noformat}

This time the slow queries become faster, but the fast queries also become
slower.

 * Queries with minShouldMatch=0 seem to be faster only because BooleanScorer 
is used for more queries, which seems to make the JVM happy (if I modify the 
patch to stop using BooleanScorer when minShouldMatch > 0, this is no longer 
the case).
 * On the other hand, queries like Low4MinShouldMatch4 are slower. I tried 
reverting MinShouldMatchSumScorer to the previous implementation and got 
similar results. It seems that MinShouldMatchSumScorer no longer being used 
for most queries in this benchmark makes the JVM unhappy.

> MinShouldMatchSumScorer should advance less and score lazily
> ------------------------------------------------------------
>
>                 Key: LUCENE-6201
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6201
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: Trunk, 5.1
>
>         Attachments: LUCENE-6201.patch, LUCENE-6201.patch
>
>
> MinShouldMatchSumScorer currently computes the score eagerly, even on 
> documents that do not eventually match if it cannot find {{minShouldMatch}} 
> matches on the same document.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
