[ https://issues.apache.org/jira/browse/LUCENE-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014278#comment-14014278 ]
Robert Muir commented on LUCENE-5718:
-------------------------------------
For the actual multi-term queries themselves, we could consider a method to get
an automaton representation of what they do. On one hand, it's specific to
highlighting; on the other, I don't know a better way that avoids a very
expensive rewrite against the entire index.
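
To make the idea above concrete, here is a minimal sketch of a query that exposes a matcher for the terms it accepts, so a highlighter can test the tokens of a single document without rewriting the query against the index. Everything in it is an assumption for illustration: `TermMatcherSource`, `PrefixSketchQuery`, `WildcardSketchQuery` and `MatcherHighlightSketch` are hypothetical names, not Lucene classes, and a plain `Predicate<String>` stands in for the automaton that a real implementation would presumably return.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical interface, NOT part of Lucene. The Predicate<String> is a
// stand-in for an automaton describing which terms the query accepts.
interface TermMatcherSource {
    Predicate<String> termMatcher();
}

// Stand-in for a prefix-style multi-term query.
final class PrefixSketchQuery implements TermMatcherSource {
    private final String prefix;

    PrefixSketchQuery(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Predicate<String> termMatcher() {
        return term -> term.startsWith(prefix);
    }
}

// Stand-in for a wildcard-style multi-term query:
// '*' matches any run of characters, '?' matches exactly one character.
final class WildcardSketchQuery implements TermMatcherSource {
    private final Pattern pattern;

    WildcardSketchQuery(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        this.pattern = Pattern.compile(regex.toString());
    }

    @Override
    public Predicate<String> termMatcher() {
        return term -> pattern.matcher(term).matches();
    }
}

final class MatcherHighlightSketch {
    // Returns the tokens of one document that the query would match.
    // Only the document's own tokens are inspected; no index-wide term
    // enumeration (the expensive rewrite) is involved.
    static List<String> matchingTokens(TermMatcherSource query, List<String> docTokens) {
        Predicate<String> matcher = query.termMatcher();
        List<String> hits = new ArrayList<>();
        for (String token : docTokens) {
            if (matcher.test(token)) {
                hits.add(token);
            }
        }
        return hits;
    }
}
```

The point of the sketch is only that the matcher is derived from the query itself, so the cost depends on the document being highlighted rather than on the size of the index.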
As far as punching through the query structure goes, we have a similar thing
(extractTerms) geared at highlighting-type things. It avoids MUST_NOT clauses,
for example. We could consider an extractQueries... maybe there is a cleaner
solution.
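
As a rough illustration of what an extractQueries-style traversal might look like, the sketch below walks a nested boolean query and collects its leaf queries, skipping anything under a MUST_NOT clause (mirroring the behaviour described for extractTerms). All types here (`OccurSketch`, `QuerySketch`, `LeafSketch`, `ClauseSketch`, `BooleanSketch`, `ExtractQueriesSketch`) are hypothetical stand-ins written for this example; none of them is Lucene's API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the clause occurrence types.
enum OccurSketch { MUST, SHOULD, MUST_NOT }

// Hypothetical marker type standing in for a query.
interface QuerySketch {}

// A leaf query, identified only by a description for this example.
final class LeafSketch implements QuerySketch {
    final String description;

    LeafSketch(String description) {
        this.description = description;
    }
}

// One clause of a boolean query: a sub-query plus how it must occur.
final class ClauseSketch {
    final QuerySketch query;
    final OccurSketch occur;

    ClauseSketch(QuerySketch query, OccurSketch occur) {
        this.query = query;
        this.occur = occur;
    }
}

// Stand-in for a boolean compound query.
final class BooleanSketch implements QuerySketch {
    final List<ClauseSketch> clauses = new ArrayList<>();

    BooleanSketch add(QuerySketch query, OccurSketch occur) {
        clauses.add(new ClauseSketch(query, occur));
        return this;
    }
}

final class ExtractQueriesSketch {
    // Collects leaf queries in clause order, ignoring MUST_NOT subtrees,
    // since prohibited clauses never contribute text worth highlighting.
    static List<QuerySketch> extractQueries(QuerySketch query) {
        List<QuerySketch> out = new ArrayList<>();
        collect(query, out);
        return out;
    }

    private static void collect(QuerySketch query, List<QuerySketch> out) {
        if (query instanceof BooleanSketch) {
            for (ClauseSketch clause : ((BooleanSketch) query).clauses) {
                if (clause.occur != OccurSketch.MUST_NOT) {
                    collect(clause.query, out);
                }
            }
        } else {
            out.add(query);
        }
    }
}
```

Note that this version still hard-codes knowledge of the one compound type it understands, so by itself it does not address custom compound queries.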
> More flexible compound queries (containing mtq) support in postings
> highlighter
> -------------------------------------------------------------------------------
>
> Key: LUCENE-5718
> URL: https://issues.apache.org/jira/browse/LUCENE-5718
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/highlighter
> Affects Versions: 4.8.1
> Reporter: Luca Cavanna
>
> The postings highlighter currently pulls the automata from multi-term queries
> and doesn't require calling rewrite to make highlighting work. To do so, it
> also needs to check whether the query is a compound one and, if so, extract
> its subqueries. This is currently done in the MultiTermHighlighting class and
> works well, but has two potential problems:
> 1) not all possible compound queries are necessarily supported, as we need
> to go over each of them one by one (see LUCENE-5717), and this requires
> keeping the "switch" up to date whenever new queries get added to Lucene
> 2) it doesn't support custom compound queries, only the set of queries
> available out of the box
> I've been thinking about how this can be improved, and one of the ideas I came
> up with is to introduce a generic way to retrieve the subqueries from
> compound queries: for instance, a new abstract base class with a
> getLeaves or getSubQueries method that all the compound queries extend.
> This method would return a flat array of all the leaf queries
> that the compound query is made of.
> Not sure whether this would be needed in other places in Lucene, but it
> doesn't seem like a small change, and it would definitely affect (or benefit?)
> more than just the postings highlighter's support for multi-term queries.
> In particular, the second problem (custom queries) seems hard to solve without
> a way to expose this info directly from the query, unless we want to
> make the MultiTermHighlighting#extractAutomata method extensible in some way.
> I would like to hear what people think, and to work on this as soon as we
> have identified which direction we want to take.
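
The base-class proposal in the issue description could be sketched roughly as follows. This is only one possible reading of it, under stated assumptions: `BaseQuerySketch`, `TermLeafSketch`, `CompoundQuerySketch` and `MyCustomCompound` are hypothetical names invented here, the leaves are returned as a `List` rather than the flat array the description mentions, and both suggested method names are used (getSubQueries for direct children, getLeaves for the flattened result), which the description does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the common query supertype.
abstract class BaseQuerySketch {}

// A leaf query, identified only by its text for this example.
final class TermLeafSketch extends BaseQuerySketch {
    final String text;

    TermLeafSketch(String text) {
        this.text = text;
    }
}

// Hypothetical abstract base class that every compound query would extend.
abstract class CompoundQuerySketch extends BaseQuerySketch {
    // Each compound query reports only its direct children.
    protected abstract List<BaseQuerySketch> getSubQueries();

    // Flattens nested compound queries into their leaves, in order.
    public final List<BaseQuerySketch> getLeaves() {
        List<BaseQuerySketch> leaves = new ArrayList<>();
        for (BaseQuerySketch sub : getSubQueries()) {
            if (sub instanceof CompoundQuerySketch) {
                leaves.addAll(((CompoundQuerySketch) sub).getLeaves());
            } else {
                leaves.add(sub);
            }
        }
        return leaves;
    }
}

// A custom compound query that the highlighter has never heard of.
// It becomes traversable just by extending the base class.
final class MyCustomCompound extends CompoundQuerySketch {
    private final List<BaseQuerySketch> children;

    MyCustomCompound(BaseQuerySketch... children) {
        this.children = Arrays.asList(children);
    }

    @Override
    protected List<BaseQuerySketch> getSubQueries() {
        return children;
    }
}
```

With a shape like this, the highlighter would call getLeaves on any compound query it receives instead of switching over known query classes, which is what would let custom compound queries participate.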