[jira] [Updated] (LUCENE-6198) two phase intersection

Adrien Grand (JIRA) Thu, 12 Feb 2015 03:22:50 -0800

     [ https://issues.apache.org/jira/browse/LUCENE-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Adrien Grand updated LUCENE-6198:
---------------------------------
    Attachment: phrase_intersections.tasks

I built some tasks for intersections of phrases with terms and ran luceneutil 
on them to validate that the patch does indeed speed up such queries:

{noformat}
                    Task  QPS baseline  StdDev  QPS patch  StdDev  Pct diff
                PKLookup        247.13  (2.0%)     248.14  (1.9%)    0.4% (  -3% -    4%)
     AndMedPhraseLowTerm         13.74  (0.7%)      14.67  (2.8%)    6.7% (   3% -   10%)
   AndHighPhraseHighTerm          6.03  (0.9%)       6.45  (0.8%)    7.0% (   5% -    8%)
    AndMedPhraseHighTerm         45.62  (2.6%)      49.62  (1.7%)    8.8% (   4% -   13%)
     AndMedPhraseMedTerm         49.14  (2.8%)      58.40  (5.7%)   18.8% (  10% -   28%)
    AndHighPhraseMedTerm         11.81  (1.5%)      15.02  (2.2%)   27.1% (  23% -   31%)
    AndHighPhraseLowTerm         31.43  (3.5%)      41.39  (6.2%)   31.7% (  21% -   42%)
{noformat}

> two phase intersection
> ----------------------
>
>                 Key: LUCENE-6198
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6198
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>         Attachments: LUCENE-6198.patch, LUCENE-6198.patch, LUCENE-6198.patch, 
> LUCENE-6198.patch, phrase_intersections.tasks
>
>
> Currently some scorers have to do a lot of per-document work to determine if 
> a document is a match. The simplest example is a phrase scorer, but there are 
> others (spans, sloppy phrase, geospatial, etc).
> Imagine a conjunction with two MUST clauses, one that is a term that matches 
> all odd documents, another that is a phrase matching all even documents. 
> Today this conjunction will be very expensive, because the zig-zag 
> intersection is reading a ton of useless positions.
> The same problem happens with FilteredQuery and anything else that acts like 
> a conjunction.
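
To make the scenario in the description concrete, here is a minimal sketch in plain Java. It does not use Lucene's API or reproduce the attached patch; the class name, the sorted int arrays standing in for postings, and the IntPredicate standing in for the positions check are all assumptions for illustration. It counts how often the expensive check runs when the phrase side confirms every candidate itself, compared with intersecting cheap approximations first and confirming only documents that both clauses agree on.

```java
import java.util.function.IntPredicate;
import java.util.stream.IntStream;

// Hypothetical toy model, not Lucene code: sorted int arrays stand in for postings.
class TwoPhaseSketch {
    /** Number of expensive "read positions" checks performed so far. */
    static int verifications = 0;

    /** Returns the first index i >= from with docs[i] >= target, or docs.length. */
    static int seek(int[] docs, int from, int target) {
        int i = from;
        while (i < docs.length && docs[i] < target) i++;
        return i;
    }

    /** The costly per-document check (position reading, in the phrase case). */
    static boolean verify(IntPredicate phraseMatches, int doc) {
        verifications++;
        return phraseMatches.test(doc);
    }

    /** Zig-zag where the phrase side only ever returns fully confirmed matches. */
    static int countNaive(int[] termDocs, int[] candidates, IntPredicate phraseMatches) {
        int hits = 0, t = 0, c = 0;
        while (t < termDocs.length) {
            // phrase.advance(target): skip ahead, then verify until a real match.
            c = seek(candidates, c, termDocs[t]);
            while (c < candidates.length && !verify(phraseMatches, candidates[c])) c++;
            if (c == candidates.length) break;
            int phraseDoc = candidates[c];
            t = seek(termDocs, t, phraseDoc);
            if (t < termDocs.length && termDocs[t] == phraseDoc) {
                hits++;
                t++;
            }
            c++; // move past the confirmed document
        }
        return hits;
    }

    /** Intersect the cheap approximations first; verify only on agreement. */
    static int countTwoPhase(int[] termDocs, int[] candidates, IntPredicate phraseMatches) {
        int hits = 0, t = 0, c = 0;
        while (t < termDocs.length && c < candidates.length) {
            if (termDocs[t] < candidates[c]) {
                t = seek(termDocs, t, candidates[c]);
            } else if (candidates[c] < termDocs[t]) {
                c = seek(candidates, c, termDocs[t]);
            } else {
                if (verify(phraseMatches, termDocs[t])) hits++;
                t++;
                c++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int n = 1000;
        // Term matches all odd documents; every document holds all phrase terms,
        // but the phrase itself only matches even documents.
        int[] term = IntStream.range(0, n).filter(d -> d % 2 == 1).toArray();
        int[] candidates = IntStream.range(0, n).toArray();
        IntPredicate phrase = d -> d % 2 == 0;

        verifications = 0;
        int naiveHits = countNaive(term, candidates, phrase);
        System.out.println("naive: hits=" + naiveHits + " checks=" + verifications);

        verifications = 0;
        int twoPhaseHits = countTwoPhase(term, candidates, phrase);
        System.out.println("two-phase: hits=" + twoPhaseHits + " checks=" + verifications);
    }
}
```

In this toy setup both loops find no hits, but the naive loop runs the expensive check on 999 documents and the two-phase loop on 500. The gap widens as the term clause becomes more selective, since only documents present in both approximations are ever verified.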



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

