[ 
https://issues.apache.org/jira/browse/LUCENE-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15860920#comment-15860920
 ] 

ASF subversion and git services commented on LUCENE-7643:
---------------------------------------------------------

Commit a36ebaa90c95d8be6411464c237593a1ff825af0 in lucene-solr's branch 
refs/heads/branch_6x from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a36ebaa ]

LUCENE-7643,SOLR-10013: Reenable the single-value optimization for sorted dv 
too.


> Move IndexOrDocValuesQuery to queries (or core?)
> ------------------------------------------------
>
>                 Key: LUCENE-7643
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7643
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Priority: Minor
>             Fix For: master (7.0), 6.5
>
>         Attachments: LUCENE-7643.patch
>
>
> I was just doing some benchmarking to check that {{IndexOrDocValuesQuery}} 
> actually makes things faster when it is supposed to:
> {noformat}
>                     Task    QPS baseline      StdDev   QPS patch      StdDev                Pct diff
>                  Range25       30.27      (0.6%)       29.22      (4.7%)   -3.5% (  -8% -    1%)
>                  Range10       66.74      (0.9%)       64.52      (4.2%)   -3.3% (  -8% -    1%)
>                   Term35       18.59      (1.6%)       18.16      (1.9%)   -2.3% (  -5% -    1%)
>                   Term02      274.98      (1.1%)      269.47      (1.9%)   -2.0% (  -4% -    1%)
>         AndTerm35Range10       26.82      (2.5%)       26.50      (2.8%)   -1.2% (  -6% -    4%)
>         AndTerm02Range25       56.27      (1.3%)       99.04      (7.9%)   76.0% (  65% -   86%)
> {noformat}
> In the above results, the number after the query type indicates the 
> percentage of docs in the index that it matches. With the baseline, range 
> queries are simple point range queries, while the patch is an 
> {{IndexOrDocValuesQuery}} that wraps both a point range query and a doc 
> values query that matches the same documents. As expected, 
> {{AndTerm35Range10}} performs the same in both cases since the range is 
> supposed to lead the iteration, so the {{IndexOrDocValuesQuery}} is rewritten 
> to the wrapped point range query. However, with {{AndTerm02Range25}} the 
> range cost is higher than the term cost, so the range is only used for 
> verifying matches and the {{IndexOrDocValuesQuery}} rewrites to the wrapped 
> doc values query, yielding a speedup since we do not have to evaluate the 
> range against the whole index.
> I think the -2/-3% difference we are seeing for everything other than 
> {{AndTerm02Range25}} is noise, since term queries execute exactly the same 
> way in both cases, yet they show this slight slowdown too.
> I would like to make this easier to use by moving {{IndexOrDocValuesQuery}} 
> and {{DocValuesRangeQuery}} to a different module than sandbox, and by giving 
> the doc values range query an API that is closer to point ranges: the bounds 
> become required (null disallowed) and the {{includeLower}} and 
> {{includeUpper}} parameters are removed. I initially wanted to move them to 
> {{queries}}, but maybe {{core}} is better. That way we could link from the 
> point API to {{IndexOrDocValuesQuery}} as a way to make queries on fields 
> that have both points and doc values more efficient.
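
To make the trade-off in the quoted description concrete, here is a minimal, self-contained sketch in plain Java. It is a toy model and not Lucene code: the class IndexOrDocValuesSketch, its Strategy enum, and the choose() and search() methods are all hypothetical names, and the "use points when the range is no more costly than the term" rule is a simplification of whatever heuristic Lucene actually applies. It only counts how many documents the range clause has to touch under each strategy.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the points-vs-doc-values choice. Hypothetical; not Lucene code. */
public class IndexOrDocValuesSketch {

    /** Which of the two equivalent range implementations is used. */
    public enum Strategy { POINTS, DOC_VALUES }

    /** Matching docs plus the number of docs the range clause had to touch. */
    public static final class Result {
        public final List<Integer> hits = new ArrayList<>();
        public long rangeWork;
    }

    /**
     * Assumed, simplified rule: if the range matches no more docs than the
     * term, let the range lead via points; otherwise verify via doc values.
     */
    public static Strategy choose(long rangeCost, long termCost) {
        return rangeCost <= termCost ? Strategy.POINTS : Strategy.DOC_VALUES;
    }

    /** Runs "term AND value in [lower, upper]" with the given strategy. */
    public static Result search(long[] values, boolean[] hasTerm,
                                long lower, long upper, Strategy strategy) {
        Result r = new Result();
        for (int doc = 0; doc < values.length; doc++) {
            boolean inRange = values[doc] >= lower && values[doc] <= upper;
            if (strategy == Strategy.POINTS) {
                // Range leads: every doc in the range is visited, then the
                // term is checked. (A real points index would enumerate these
                // docs directly instead of scanning; the scan is a stand-in.)
                if (inRange) {
                    r.rangeWork++;
                    if (hasTerm[doc]) r.hits.add(doc);
                }
            } else {
                // Term leads: the value is only looked up for docs that
                // already match the term, as a per-doc verification.
                if (hasTerm[doc]) {
                    r.rangeWork++;
                    if (inRange) r.hits.add(doc);
                }
            }
        }
        return r;
    }

    public static void main(String[] args) {
        int n = 1000;
        long[] values = new long[n];
        boolean[] hasTerm = new boolean[n];
        for (int doc = 0; doc < n; doc++) {
            values[doc] = doc % 100;       // range [0, 24] matches 25% of docs
            hasTerm[doc] = doc % 50 == 0;  // term matches 2% of docs
        }
        Result points = search(values, hasTerm, 0, 24, Strategy.POINTS);
        Result docValues = search(values, hasTerm, 0, 24, Strategy.DOC_VALUES);
        System.out.println("chosen=" + choose(250, 20)
            + " pointsWork=" + points.rangeWork
            + " docValuesWork=" + docValues.rangeWork
            + " hits=" + docValues.hits.size());
    }
}
```

With this data, shaped like the AndTerm02Range25 case, both strategies return the same 10 hits, but the range clause touches 250 documents when it leads and only 20 when it merely verifies the term's matches. That gap is the intuition behind the speedup reported in the table.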


