[
https://issues.apache.org/jira/browse/LUCENE-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290041#comment-14290041
]
Yonik Seeley commented on LUCENE-6198:
--------------------------------------
bq. Otherwise, if we just return "zoo" termsenum, it might save a little cpu
for the approximation intersection, but in many cases can result in more wasted
usages of positions
Oh, right... it's two-phase. I guess I had some sort of half-baked multi-phase
scheme in my head, where ("a boy" AND "the zoo") would first look at the
conjunction of "zoo" and "boy", only then look at "a" and "the", and only after
all that look at positions. I guess that could be accomplished with something
like
{code}
DocIdSetIterator[] getApproximation()
{code}
And then a phrase query could return one approximation per term, and the top
level could sort them all by cost. This could even allow all approximations to
be bubbled up to a higher level to handle nested approximations more
efficiently? Not sure if it's worth the complexity though... it's just random
brainstorming.
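
To make the brainstorm a bit more concrete, here is a minimal, self-contained
sketch of the idea. It does not use any Lucene classes: PostingsIterator is a
hypothetical stand-in for DocIdSetIterator backed by a sorted int array, and
MultiApproximationSketch and intersect() are made-up names. Each clause hands
back one approximation per term, the top level flattens them, sorts by cost,
and leapfrogs with the cheapest iterator in the lead. Positions would only be
read for the documents that survive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MultiApproximationSketch {

    public static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical stand-in for DocIdSetIterator over a sorted array of doc IDs. */
    public static final class PostingsIterator {
        final String term;
        private final int[] docs;
        private int idx = -1;

        public PostingsIterator(String term, int... docs) {
            this.term = term;
            this.docs = docs;
        }

        /** Assumption: cost is simply the number of postings. */
        long cost() {
            return docs.length;
        }

        int docID() {
            if (idx < 0) return -1;
            return idx >= docs.length ? NO_MORE_DOCS : docs[idx];
        }

        /** Moves forward to the first doc >= target (always advances at least once). */
        int advance(int target) {
            do {
                idx++;
            } while (idx < docs.length && docs[idx] < target);
            return docID();
        }
    }

    /**
     * Each array plays the role of one clause's getApproximation() result.
     * Assumes at least one iterator overall. Returns the docs containing every
     * term, i.e. the candidates that still need a position check.
     */
    public static List<Integer> intersect(List<PostingsIterator[]> clauses) {
        List<PostingsIterator> all = new ArrayList<>();
        for (PostingsIterator[] approximations : clauses) {
            all.addAll(Arrays.asList(approximations));
        }
        // Cheapest first, regardless of which clause an iterator came from.
        all.sort(Comparator.comparingLong(PostingsIterator::cost));

        List<Integer> candidates = new ArrayList<>();
        PostingsIterator lead = all.get(0);
        int doc = lead.advance(0);
        while (doc != NO_MORE_DOCS) {
            boolean aligned = true;
            for (int i = 1; i < all.size(); i++) {
                PostingsIterator other = all.get(i);
                int otherDoc = other.docID();
                if (otherDoc < doc) {
                    otherDoc = other.advance(doc);
                }
                if (otherDoc > doc) {
                    // Mismatch: skip the lead ahead and restart the check.
                    doc = lead.advance(otherDoc);
                    aligned = false;
                    break;
                }
            }
            if (aligned) {
                candidates.add(doc);
                doc = lead.advance(doc + 1);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        // ("a boy" AND "the zoo"): one approximation per term of each phrase.
        PostingsIterator[] aBoy = {
            new PostingsIterator("a", 1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
            new PostingsIterator("boy", 2, 4, 9)
        };
        PostingsIterator[] theZoo = {
            new PostingsIterator("the", 1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
            new PostingsIterator("zoo", 4, 7, 9, 10)
        };
        System.out.println("Candidates needing a position check: "
            + intersect(Arrays.asList(aBoy, theZoo)));
        // prints: Candidates needing a position check: [4, 9]
    }
}
```

With these made-up postings, "boy" and "zoo" end up driving the intersection,
and "a" and "the" are only advanced to documents the rare terms already agree
on. Whether that ordering pays off in practice is exactly the open question
above.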
> two phase intersection
> ----------------------
>
> Key: LUCENE-6198
> URL: https://issues.apache.org/jira/browse/LUCENE-6198
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Robert Muir
> Attachments: LUCENE-6198.patch
>
>
> Currently some scorers have to do a lot of per-document work to determine if
> a document is a match. The simplest example is a phrase scorer, but there are
> others (spans, sloppy phrase, geospatial, etc).
> Imagine a conjunction with two MUST clauses, one that is a term that matches
> all odd documents, another that is a phrase matching all even documents.
> Today this conjunction will be very expensive, because the zig-zag
> intersection is reading a ton of useless positions.
> The same problem happens with filteredQuery and anything else that acts like
> a conjunction.