Hi Thomas,

This is a constant source of maintenance in Lucene: updating all of our highlighters to be aware of new queries. Some of them require more maintenance than others; by far WeightedSpanTermExtractor (WSTE) is the hotspot. WSTE avoids calling query.rewrite(IndexReader) because for some queries it can be quite expensive. Faced with a custom query, it can't know for sure whether skipping the rewrite is okay.

I recommend extending WSTE and adding some instanceof checks for your custom queries so that you do whatever the right thing is for your query. If you use the UnifiedHighlighter, new in Lucene as of 6.3, there are several callback hooks provided for this sort of thing without exposing its guts, which use WSTE.
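
To make the "extend WSTE and add instanceof checks" suggestion concrete, here is a rough, self-contained sketch of the dispatch pattern. Every class in it (SketchQuery, SketchMultiPhraseQuery, CustomReaderDependentQuery, SketchExtractor, CustomAwareExtractor) is a hypothetical stand-in, not a Lucene class, and a plain String stands in for the IndexReader. In real code the override would go on WeightedSpanTermExtractor's extract() method, whose exact signature varies between Lucene versions, so please check it against the version you are on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a query type; NOT Lucene's Query class.
class SketchQuery {
    // Models the idea of Query.rewrite(IndexReader); a String plays the reader.
    SketchQuery rewrite(String reader) {
        return this; // by default the query is already in its final form
    }
}

// Hypothetical stand-in for MultiPhraseQuery: it simply carries its terms.
class SketchMultiPhraseQuery extends SketchQuery {
    final List<String> terms = new ArrayList<>();

    SketchMultiPhraseQuery(String... initialTerms) {
        for (String t : initialTerms) {
            terms.add(t);
        }
    }
}

// Hypothetical custom query: its terms are only known after consulting the reader.
class CustomReaderDependentQuery extends SketchMultiPhraseQuery {
    CustomReaderDependentQuery() {
        super(); // no terms until rewritten
    }

    @Override
    SketchQuery rewrite(String reader) {
        return new SketchMultiPhraseQuery("resolved-against-" + reader);
    }
}

// Models the stock extractor: the multi-phrase branch skips rewrite().
class SketchExtractor {
    protected final String reader;

    SketchExtractor(String reader) {
        this.reader = reader;
    }

    void extract(SketchQuery query, List<String> out) {
        if (query instanceof SketchMultiPhraseQuery) {
            // Special case: terms are read directly, no rewrite.
            out.addAll(((SketchMultiPhraseQuery) query).terms);
        } else {
            // Fallback for unknown queries: rewrite, then try again.
            SketchQuery rewritten = query.rewrite(reader);
            if (rewritten != query) {
                extract(rewritten, out);
            }
        }
    }
}

// The suggested fix: check for the custom query first and rewrite it explicitly.
class CustomAwareExtractor extends SketchExtractor {
    CustomAwareExtractor(String reader) {
        super(reader);
    }

    @Override
    void extract(SketchQuery query, List<String> out) {
        if (query instanceof CustomReaderDependentQuery) {
            // Rewrite before the parent's multi-phrase branch can swallow it.
            super.extract(query.rewrite(reader), out);
        } else {
            super.extract(query, out);
        }
    }
}
```

The point of the subclass is only ordering: the check for the custom type has to run before the parent's multi-phrase branch, because the custom query is itself an instance of the multi-phrase type and would otherwise never reach a rewrite.
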
BTW it's obvious to me that Query needs some sort of visitor API. It's very much related to maintaining the highlighters with respect to new/different queries.
https://issues.apache.org/jira/browse/LUCENE-3041

Good luck,
~ David

> On Dec 21, 2016, at 1:27 PM, Thomas Kappler <[email protected]> wrote:
>
> Hi,
>
> We have implemented a custom query that extends MultiPhraseQuery (MPQ)
> because it uses MPQ’s getTermArrays() and getPositions(). We’d like to use
> this query for highlighting, but we’re facing the following issue.
>
> In highlighter/WeightedSpanTermExtractor, the extract() method does a series
> of instanceof checks. There is a special case for MPQ. This branch does not
> call rewrite(IndexReader) on the query, but our custom query needs rewriting
> to work properly.
>
> As a test, I commented the MPQ branch in WeightedSpanTermExtractor so the
> code takes the last else branch, where rewrite(IndexReader) is called, and
> our tests pass.
>
> My questions are
> Are we doing it wrong when our query *needs* rewriting? Our query logic needs
> the IndexReader that’s passed in in rewrite(IndexReader). Where else would we
> put such code?
> If we aren’t doing it wrong, how can we use the highlighter? Extend Query
> instead of MPQ and copy the tracking of term arrays and positions from MPQ?
>
> Thanks,
> Thomas
