[
https://issues.apache.org/jira/browse/LUCENE-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir updated LUCENE-6198:
--------------------------------
Attachment: LUCENE-6198.patch
Here is a hack patch as a prototype (I'm really not happy with many things
about it).
DISI (DocIdSetIterator) gets an optional method to return an approximation,
meaning it can return false positive matches for intersection, and you must
verify that each one is really a match via a separate method.
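
To make the approximation/verification split concrete, here is a minimal,
self-contained sketch. It does not use the API from the patch: the names
Approximation, TwoPhase, fromSortedArray and collect are hypothetical and exist
only for illustration. The cheap iterator may return false positives, and a
separate check confirms each candidate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical sketch; none of these names come from the patch. */
final class TwoPhaseSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Cheap iterator over candidate doc IDs; may return false positives. */
    interface Approximation {
        int nextDoc();
    }

    /** An approximation paired with the (possibly expensive) verification step. */
    static final class TwoPhase {
        final Approximation approximation;
        final IntPredicate matches;

        TwoPhase(Approximation approximation, IntPredicate matches) {
            this.approximation = approximation;
            this.matches = matches;
        }
    }

    /** Builds an approximation from a sorted array of candidate doc IDs. */
    static Approximation fromSortedArray(final int[] docs) {
        return new Approximation() {
            private int i = 0;

            @Override
            public int nextDoc() {
                return i < docs.length ? docs[i++] : NO_MORE_DOCS;
            }
        };
    }

    /** Walks the candidates and keeps only those that pass verification. */
    static List<Integer> collect(TwoPhase it) {
        List<Integer> hits = new ArrayList<>();
        int doc = it.approximation.nextDoc();
        while (doc != NO_MORE_DOCS) {
            if (it.matches.test(doc)) {
                hits.add(doc);
            }
            doc = it.approximation.nextDoc();
        }
        return hits;
    }
}
```

A caller that intersects several such iterators would advance only the
approximations and call the verification step once all of them agree on a
document.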
I only implemented this for conjunction queries and phrase queries initially.
It works across nested conjunctions as well, so agreement is global, and it
will work with filters too if we fix FilteredQuery.
I think we can remove FilteredQuery's QUERY_FIRST execution mode (for slow
filters like geo) if we go with this. Instead, slow filters such as the geo
ones should implement this API and return a bounding box or whatever as the
approximation. If they are so slow that they have no approximation, they can
return MatchAllDocs as the approximation, and it's still better, because it
will work within nested clauses, etc.
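
As a rough illustration of the bounding-box idea, the sketch below uses a
hypothetical GeoFilterSketch class (not part of Lucene or the patch). It
assumes a circular filter on a flat plane with plain Euclidean distance
standing in for a real geo distance computation. The box test is the cheap
approximation; the distance test is the verification and is counted so the
saving is visible.

```java
/** Hypothetical sketch; a flat-plane circle stands in for a real geo filter. */
final class GeoFilterSketch {
    private final double cx;
    private final double cy;
    private final double radius;

    /** Number of times the expensive exact check was run. */
    int exactChecks = 0;

    GeoFilterSketch(double cx, double cy, double radius) {
        this.cx = cx;
        this.cy = cy;
        this.radius = radius;
    }

    /** Cheap approximation: is the point inside the circle's bounding box? */
    boolean approximate(double x, double y) {
        return Math.abs(x - cx) <= radius && Math.abs(y - cy) <= radius;
    }

    /** Exact check (assumed to be the slow part): is the point inside the circle? */
    boolean matches(double x, double y) {
        exactChecks++;
        double dx = x - cx;
        double dy = y - cy;
        return dx * dx + dy * dy <= radius * radius;
    }

    /** Counts matching points, running the exact check only on box candidates. */
    int count(double[][] points) {
        int hits = 0;
        for (double[] p : points) {
            if (approximate(p[0], p[1]) && matches(p[0], p[1])) {
                hits++;
            }
        }
        return hits;
    }
}
```

A filter with no useful approximation would correspond to approximate()
always returning true, which is the MatchAllDocs case described above: every
document is a candidate, but verification still runs only after the other
clauses have agreed.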
> two phase intersection
> ----------------------
>
> Key: LUCENE-6198
> URL: https://issues.apache.org/jira/browse/LUCENE-6198
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Robert Muir
> Attachments: LUCENE-6198.patch
>
>
> Currently some scorers have to do a lot of per-document work to determine if
> a document is a match. The simplest example is a phrase scorer, but there are
> others (spans, sloppy phrase, geospatial, etc).
> Imagine a conjunction with two MUST clauses, one that is a term that matches
> all odd documents, another that is a phrase matching all even documents.
> Today this conjunction will be very expensive, because the zig-zag
> intersection is reading a ton of useless positions.
> The same problem happens with FilteredQuery and anything else that acts like
> a conjunction.
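
The odd/even conjunction described above can be simulated with a small sketch.
Everything here is hypothetical (ParityClause, onePhase, twoPhase are
illustrative names, not Lucene classes): a clause's candidates are the docs
with a given parity, and an "expensive" clause counts how often it has to
verify a candidate, which stands in for reading positions.

```java
/** Hypothetical sketch; class and method names are illustrative only. */
final class ConjunctionCostSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Clause whose candidates are the docs in [0, maxDoc) with doc % 2 == remainder. */
    static final class ParityClause {
        private final int remainder;
        private final int maxDoc;
        private final boolean expensive;
        /** How many times the expensive verification ran. */
        int verifications = 0;

        ParityClause(int remainder, int maxDoc, boolean expensive) {
            this.remainder = remainder;
            this.maxDoc = maxDoc;
            this.expensive = expensive;
        }

        /** Cheap: smallest candidate >= target, without any verification. */
        int advanceApproximation(int target) {
            int doc = (target % 2 == remainder) ? target : target + 1;
            return doc < maxDoc ? doc : NO_MORE_DOCS;
        }

        /** Verification; every candidate is a real match here, only the cost is tracked. */
        boolean matches(int doc) {
            if (expensive) {
                verifications++;
            }
            return true;
        }

        /** One-phase advance: must verify each candidate before returning it. */
        int advanceVerified(int target) {
            int doc = advanceApproximation(target);
            while (doc != NO_MORE_DOCS && !matches(doc)) {
                doc = advanceApproximation(doc + 1);
            }
            return doc;
        }
    }

    /** Zig-zag intersection where each clause verifies inside advance. */
    static int onePhase(ParityClause a, ParityClause b) {
        int hits = 0;
        int doc = a.advanceVerified(0);
        while (doc != NO_MORE_DOCS) {
            int other = b.advanceVerified(doc);
            if (other == NO_MORE_DOCS) {
                break;
            }
            if (other == doc) {
                hits++;
                doc = a.advanceVerified(doc + 1);
            } else {
                doc = a.advanceVerified(other);
            }
        }
        return hits;
    }

    /** Zig-zag on approximations only; verify once both clauses agree on a doc. */
    static int twoPhase(ParityClause a, ParityClause b) {
        int hits = 0;
        int doc = a.advanceApproximation(0);
        while (doc != NO_MORE_DOCS) {
            int other = b.advanceApproximation(doc);
            if (other == NO_MORE_DOCS) {
                break;
            }
            if (other == doc) {
                if (a.matches(doc) && b.matches(doc)) {
                    hits++;
                }
                doc = a.advanceApproximation(doc + 1);
            } else {
                doc = a.advanceApproximation(other);
            }
        }
        return hits;
    }
}
```

With a cheap clause on odd docs and an expensive clause on even docs, the
one-phase loop verifies roughly every even document it lands on even though
nothing matches, while the two-phase loop never verifies anything because the
approximations never agree.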
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)