[ 
https://issues.apache.org/jira/browse/LUCENE-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-6198:
--------------------------------
    Attachment: LUCENE-6198.patch

Here is a hack patch as a prototype (I am really not happy with many things 
about it).

DISI (DocIdSetIterator) gets an optional method that returns an approximation. 
The approximation may return false positives during intersection, so you must 
verify that each candidate is really a match via a separate method.
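
A minimal sketch of the two-phase idea described above, not the code in the 
patch. The names Approximation, TwoPhase, matches, collect and example are 
hypothetical, chosen only for illustration. The cheap iterator proposes 
candidates and the expensive check runs only on those candidates.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPhaseSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical cheap iterator; it may return false positives. */
    interface Approximation {
        /** Returns the first candidate doc >= target, or NO_MORE_DOCS. */
        int advance(int target);
    }

    /** Hypothetical two-phase view: an approximation plus a verification step. */
    interface TwoPhase {
        Approximation approximation();

        /** Expensive check, called only on docs the approximation returned. */
        boolean matches(int doc);
    }

    /** Phase 1 proposes candidates, phase 2 confirms them. */
    static List<Integer> collect(TwoPhase it) {
        List<Integer> hits = new ArrayList<>();
        Approximation approx = it.approximation();
        for (int doc = approx.advance(0); doc != NO_MORE_DOCS; doc = approx.advance(doc + 1)) {
            if (it.matches(doc)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    /** Toy clause: every even doc is a candidate, only multiples of 4 truly match. */
    static TwoPhase example(int maxDoc) {
        return new TwoPhase() {
            @Override
            public Approximation approximation() {
                return target -> {
                    int doc = (target % 2 == 0) ? target : target + 1;
                    return doc < maxDoc ? doc : NO_MORE_DOCS;
                };
            }

            @Override
            public boolean matches(int doc) {
                return doc % 4 == 0;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(collect(example(10))); // prints [0, 4, 8]
    }
}
```

In this toy, docs 2 and 6 are false positives: the approximation returns them 
and matches() rejects them.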

I only implemented this for conjunction and phrase queries initially. It 
works across nested conjunctions as well, so agreement is global, and it will 
work with filters too if we fix FilteredQuery.
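
To illustrate how agreement can span nested conjunctions, here is a sketch 
under stated assumptions: TwoPhase, Leaf, Conjunction and collect are 
hypothetical names, and the leaves scan small in-memory arrays instead of 
postings. A conjunction exposes the intersection of its clauses' 
approximations as its own approximation, so verification only runs on docs 
that every clause, at every nesting level, has proposed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

public class ConjunctionSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical two-phase clause; not the patch's actual API. */
    interface TwoPhase {
        int advanceApprox(int target); // cheap: first candidate >= target
        boolean matches(int doc);      // expensive: confirm the candidate
    }

    /** Leaf over a sorted candidate array; counts how often verification runs. */
    static final class Leaf implements TwoPhase {
        final int[] candidates;
        final IntPredicate verify;
        int verifyCalls = 0;

        Leaf(int[] candidates, IntPredicate verify) {
            this.candidates = candidates;
            this.verify = verify;
        }

        public int advanceApprox(int target) {
            for (int d : candidates) {
                if (d >= target) return d;
            }
            return NO_MORE_DOCS;
        }

        public boolean matches(int doc) {
            verifyCalls++;
            return verify.test(doc);
        }
    }

    /** A conjunction is itself two-phase, so it nests. */
    static final class Conjunction implements TwoPhase {
        final List<TwoPhase> clauses;

        Conjunction(List<TwoPhase> clauses) {
            this.clauses = clauses;
        }

        public int advanceApprox(int target) {
            int doc = target;
            while (doc != NO_MORE_DOCS) {
                int max = doc;
                for (TwoPhase c : clauses) {
                    max = Math.max(max, c.advanceApprox(doc));
                }
                if (max == doc) return doc; // every clause proposes this doc
                doc = max;                  // leapfrog to the furthest candidate
            }
            return NO_MORE_DOCS;
        }

        public boolean matches(int doc) {
            for (TwoPhase c : clauses) {
                if (!c.matches(doc)) return false;
            }
            return true;
        }
    }

    static List<Integer> collect(TwoPhase top) {
        List<Integer> hits = new ArrayList<>();
        for (int doc = top.advanceApprox(0); doc != NO_MORE_DOCS; doc = top.advanceApprox(doc + 1)) {
            if (top.matches(doc)) hits.add(doc);
        }
        return hits;
    }

    public static void main(String[] args) {
        Leaf term = new Leaf(new int[] {1, 3, 5, 7, 9}, d -> true);
        // The "phrase" has six candidates; doc 7 is a false positive.
        Leaf phrase = new Leaf(new int[] {2, 3, 4, 6, 7, 8}, d -> d != 7);
        Leaf other = new Leaf(new int[] {3, 4, 7, 8}, d -> true);
        TwoPhase nested = new Conjunction(List.of(phrase, other));
        TwoPhase top = new Conjunction(List.of(term, nested));
        // prints [3], phrase checks: 2
        System.out.println(collect(top) + ", phrase checks: " + phrase.verifyCalls);
    }
}
```

The phrase leaf has six candidates, but its expensive check runs only for docs 
3 and 7, the two docs on which all three approximations agree.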

I think we can remove FilteredQuery's QUERY_FIRST execution mode (for slow 
filters like geo) if we go with this. Instead, slow filters such as geo ones 
should implement this API and return a bounding box or whatever as the 
approximation. If they are so slow that they have no approximation, they can 
return MatchAllDocs as the approximation, and it's still better, because it 
will work within nested clauses, etc.
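
As a rough illustration of the slow-filter case, the sketch below uses a 
made-up point-in-circle filter. TwoPhaseFilter, boundingBox, matchAll and the 
sample points are all hypothetical, and the per-doc predicate stands in for a 
real iterator. Both approximations give the same hits; the bounding box just 
avoids some of the exact checks.

```java
import java.util.ArrayList;
import java.util.List;

public class SlowFilterSketch {
    /** Hypothetical filter API: a cheap per-doc test plus an exact one. */
    interface TwoPhaseFilter {
        boolean approximate(int doc); // cheap, may accept false positives
        boolean matches(int doc);     // exact and expensive
    }

    // Made-up (x, y) point per doc id.
    static final double[][] POINTS = {{0, 0}, {0.9, 0.9}, {3, 3}, {0.5, 0}, {1, 0}};
    static int exactChecks = 0;

    /** The "slow" exact test: is the point inside the unit circle at the origin? */
    static boolean withinUnitCircle(int doc) {
        exactChecks++;
        double x = POINTS[doc][0];
        double y = POINTS[doc][1];
        return x * x + y * y <= 1.0;
    }

    /** Approximation by the circle's bounding box [-1, 1] x [-1, 1]. */
    static TwoPhaseFilter boundingBox() {
        return new TwoPhaseFilter() {
            public boolean approximate(int doc) {
                return Math.abs(POINTS[doc][0]) <= 1.0 && Math.abs(POINTS[doc][1]) <= 1.0;
            }

            public boolean matches(int doc) {
                return withinUnitCircle(doc);
            }
        };
    }

    /** Fallback when no cheap approximation exists: every doc is a candidate. */
    static TwoPhaseFilter matchAll() {
        return new TwoPhaseFilter() {
            public boolean approximate(int doc) {
                return true;
            }

            public boolean matches(int doc) {
                return withinUnitCircle(doc);
            }
        };
    }

    static List<Integer> collect(TwoPhaseFilter f) {
        List<Integer> hits = new ArrayList<>();
        for (int doc = 0; doc < POINTS.length; doc++) {
            if (f.approximate(doc) && f.matches(doc)) hits.add(doc);
        }
        return hits;
    }

    public static void main(String[] args) {
        exactChecks = 0;
        System.out.println(collect(boundingBox()) + " exact checks: " + exactChecks);
        exactChecks = 0;
        System.out.println(collect(matchAll()) + " exact checks: " + exactChecks);
    }
}
```

With these five points the bounding box rejects doc 2 cheaply, so the exact 
test runs four times instead of five. In a conjunction, either variant would 
additionally skip the exact test for docs the other clauses never propose.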

> two phase intersection
> ----------------------
>
>                 Key: LUCENE-6198
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6198
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>         Attachments: LUCENE-6198.patch
>
>
> Currently some scorers have to do a lot of per-document work to determine if 
> a document is a match. The simplest example is a phrase scorer, but there are 
> others (spans, sloppy phrase, geospatial, etc).
> Imagine a conjunction with two MUST clauses, one that is a term that matches 
> all odd documents, another that is a phrase matching all even documents. 
> Today this conjunction will be very expensive, because the zig-zag 
> intersection is reading a ton of useless positions.
> The same problem happens with FilteredQuery and anything else that acts like 
> a conjunction.



