[
https://issues.apache.org/jira/browse/LUCENE-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660122#comment-14660122
]
Michael McCandless commented on LUCENE-6717:
--------------------------------------------
Thanks for the feedback, I'll fold it in.
bq. I'm curious about the assertion at the beginning of the doNext() method,
it's been both changed and commented out, should we just remove it if
invariants are hard to verify?
Oh yeah, the invariant is wrong now, because for advance I push the old docID
back into the queue and then advance from within the queue ... I'll remove it.
bq. The one suggestion I have is to see in the future if it can always be
two-phased:
This is a good idea! I'll put a TODO ... it shouldn't be so hard: since I
already know all required terms, I can just take "the rest" and make them into
the disjunction.
> TermAutomatonQuery should be two-phased
> ---------------------------------------
>
> Key: LUCENE-6717
> URL: https://issues.apache.org/jira/browse/LUCENE-6717
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Attachments: LUCENE-6717.patch
>
>
> {{TermAutomatonQuery}} (still in sandbox) is a simple way to get accurate
> query-time multi-token synonyms using the new {{SynonymGraphFilter}} from
> LUCENE-6664. It already has a utility class to directly translate an
> incoming {{TokenStream}} into a corresponding query.
> However the query is likely quite slow because it always iterates positions
> for all terms in the automaton.
> I think one simple approach is to walk the automaton and find the subset of
> terms (if any) that appear in common to all paths, and then approximate with
> {{ConjunctionDISI}} like {{PhraseQuery}} does. Such a subset doesn't always
> exist for an automaton (i.e. it could be empty), so the logic would have to
> be conditional...
> And I think there are more complex approximations we could make, but using
> {{ConjunctionDISI}} seems like a simple start.
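
As a rough illustration of the "terms common to all paths" idea in the description above, here is a minimal sketch in plain Java. It does not use Lucene's Automaton API; the class CommonTermsSketch, the Transition type, and both method names are hypothetical and exist only for this example. It also assumes state 0 is the start state. The approach shown (a term is required if deleting its edges leaves no accept state reachable) is one simple way to compute the subset, not necessarily what the attached patch does.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for a term automaton; this is not Lucene's Automaton class. */
class CommonTermsSketch {

  /** One labeled edge: source state, destination state, and the term on the edge. */
  static final class Transition {
    final int source;
    final int dest;
    final String term;

    Transition(int source, int dest, String term) {
      this.source = source;
      this.dest = dest;
      this.term = term;
    }
  }

  /**
   * Returns true if some accept state is reachable from state 0 (assumed to be
   * the start state) without following any edge labeled with the skipped term.
   */
  static boolean acceptsWithout(
      int numStates, List<Transition> transitions, Set<Integer> acceptStates, String skipped) {
    boolean[] seen = new boolean[numStates];
    Deque<Integer> queue = new ArrayDeque<>();
    seen[0] = true;
    queue.add(0);
    while (!queue.isEmpty()) {
      int state = queue.poll();
      if (acceptStates.contains(state)) {
        return true;
      }
      for (Transition t : transitions) {
        if (t.source == state && !t.term.equals(skipped) && !seen[t.dest]) {
          seen[t.dest] = true;
          queue.add(t.dest);
        }
      }
    }
    return false;
  }

  /**
   * A term is on every accepting path exactly when removing its edges makes all
   * accept states unreachable. The result may be empty.
   */
  static Set<String> requiredTerms(
      int numStates, List<Transition> transitions, Set<Integer> acceptStates) {
    Set<String> required = new TreeSet<>();
    for (Transition t : transitions) {
      if (!required.contains(t.term)
          && !acceptsWithout(numStates, transitions, acceptStates, t.term)) {
        required.add(t.term);
      }
    }
    return required;
  }

  public static void main(String[] args) {
    // Accepts "wifi network" and "wi fi network".
    List<Transition> transitions = Arrays.asList(
        new Transition(0, 1, "wifi"),
        new Transition(0, 2, "wi"),
        new Transition(2, 1, "fi"),
        new Transition(1, 3, "network"));
    Set<Integer> accept = new HashSet<>(Arrays.asList(3));
    System.out.println(requiredTerms(4, transitions, accept)); // prints [network]
  }
}
```

In this example only "network" appears on both paths, so it would be the single term available for a conjunction-style approximation; "wifi", "wi" and "fi" would be "the rest". When the returned set is empty, no such approximation exists and the caller would have to fall back to iterating positions as before.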