[
https://issues.apache.org/jira/browse/LUCENE-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luca Cavanna updated LUCENE-5718:
---------------------------------
Attachment: LUCENE-5718.patch
I like your extractQueries idea. I gave it a shot, patch attached.
The main difference compared to extractTerms is that it adds the query itself
to the list by default instead of throwing UnsupportedOperationException. Also,
I think this one doesn't necessarily require calling rewrite (not totally sure
though). I overrode the extractQueries method for all the queries that contain
one or more sub-queries. Let's see if that's too many or if I missed any... you
tell me ;)
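
To make the idea concrete, here is a minimal, self-contained sketch of the extractQueries behaviour described above. It is not the attached patch: Query, LeafQuery and CompoundQuery are simplified stand-ins invented for illustration (they are not Lucene's classes), and the method signature is an assumption modelled on extractTerms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExtractQueriesSketch {

    // Hypothetical stand-in for a query base class (not Lucene's Query).
    static abstract class Query {
        // Default: a query without sub-queries simply adds itself,
        // instead of throwing UnsupportedOperationException.
        void extractQueries(List<Query> queries) {
            queries.add(this);
        }
    }

    // Hypothetical leaf query, e.g. a term or multi-term query.
    static class LeafQuery extends Query {
        final String name;

        LeafQuery(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    // Hypothetical compound query: overrides the default and delegates
    // to each of its sub-queries, so nested compounds are flattened.
    static class CompoundQuery extends Query {
        final List<Query> clauses;

        CompoundQuery(Query... clauses) {
            this.clauses = Arrays.asList(clauses);
        }

        @Override
        void extractQueries(List<Query> queries) {
            for (Query clause : clauses) {
                clause.extractQueries(queries);
            }
        }
    }

    // Builds a nested compound query and returns the names of the extracted leaves.
    static List<String> demo() {
        Query query = new CompoundQuery(
                new LeafQuery("term"),
                new CompoundQuery(new LeafQuery("wildcard"), new LeafQuery("prefix")));
        List<Query> extracted = new ArrayList<>();
        query.extractQueries(extracted);
        List<String> names = new ArrayList<>();
        for (Query q : extracted) {
            names.add(q.toString());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With this shape a caller never needs to know which compound types exist. It only calls extractQueries and inspects the leaves it gets back.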
> More flexible compound queries (containing mtq) support in postings
> highlighter
> -------------------------------------------------------------------------------
>
> Key: LUCENE-5718
> URL: https://issues.apache.org/jira/browse/LUCENE-5718
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/highlighter
> Affects Versions: 4.8.1
> Reporter: Luca Cavanna
> Attachments: LUCENE-5718.patch
>
>
> The postings highlighter currently pulls the automata from multi term queries
> and doesn't require calling rewrite to make highlighting work. To do so, it
> also needs to check whether the query is a compound one and, if so, extract
> its subqueries. This is currently done in the MultiTermHighlighting
> class and works well but has two potential problems:
> 1) not all the possible compound queries are necessarily supported as we need
> to go over each of them one by one (see LUCENE-5717) and this requires
> keeping the "switch" up-to-date if new queries get added to Lucene
> 2) it doesn't support custom compound queries but only the set of queries
> available out-of-the-box
> I've been thinking about how this can be improved and one of the ideas I came
> up with is to introduce a generic way to retrieve the subqueries from
> compound queries, like for instance have a new abstract base class with a
> getLeaves or getSubQueries method and have all the compound queries extend
> it. What this method would do is return a flat array of all the leaf queries
> that the compound query is made of.
> Not sure whether this would be needed in other places in Lucene, but it
> doesn't seem like a small change and it would definitely affect (or benefit?)
> more than just the postings highlighter support for multi term queries.
> In particular the second problem (custom queries) seems hard to solve without
> a way to expose this info directly from the query though, unless we want to
> make the MultiTermHighlighting#extractAutomata method extensible in some way.
> I would like to hear what people think and work on this as soon as we have
> identified which direction we want to take.
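
For comparison, the abstract base class idea from the description could look roughly like the sketch below. All names here (CompoundQuery, getSubQueries, getLeaves, MyCustomQuery, leavesOf) are hypothetical and only illustrate the proposal. The sketch returns a List rather than an array for brevity. The point it tries to show is that the highlighter side needs a single type check instead of a per-query "switch", so a custom compound query is handled without any change to the highlighter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SubQueriesSketch {

    // Hypothetical stand-in for a query (not Lucene's Query).
    static class Query {
        final String name;

        Query(String name) {
            this.name = name;
        }
    }

    // Hypothetical abstract base class that every compound query would extend.
    static abstract class CompoundQuery extends Query {
        CompoundQuery(String name) {
            super(name);
        }

        // Direct children of this compound query.
        abstract List<Query> getSubQueries();

        // Flat list of all leaf queries, descending through nested compounds.
        final List<Query> getLeaves() {
            List<Query> leaves = new ArrayList<>();
            for (Query sub : getSubQueries()) {
                if (sub instanceof CompoundQuery) {
                    leaves.addAll(((CompoundQuery) sub).getLeaves());
                } else {
                    leaves.add(sub);
                }
            }
            return leaves;
        }
    }

    // A custom compound query that the highlighter has never heard of.
    static class MyCustomQuery extends CompoundQuery {
        final List<Query> children;

        MyCustomQuery(Query... children) {
            super("custom");
            this.children = Arrays.asList(children);
        }

        @Override
        List<Query> getSubQueries() {
            return children;
        }
    }

    // Highlighter side: one check replaces the per-query-type switch.
    static List<Query> leavesOf(Query query) {
        if (query instanceof CompoundQuery) {
            return ((CompoundQuery) query).getLeaves();
        }
        return Collections.singletonList(query);
    }

    public static void main(String[] args) {
        Query query = new MyCustomQuery(
                new Query("fuzzy"),
                new MyCustomQuery(new Query("regexp")));
        for (Query leaf : leavesOf(query)) {
            System.out.println(leaf.name);
        }
    }
}
```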