Re: [5.5.x] Highlighter and rewriting MultiPhraseQuery

David Smiley Wed, 21 Dec 2016 10:39:59 -0800

Hi Thomas,

This is a constant source of maintenance in Lucene -- updating all of our 
highlighters to be aware of new queries.  Some of them require more maintenance 
than others; by far WSTE (WeightedSpanTermExtractor) is the hotspot.  WSTE 
avoids calling query.rewrite(IndexReader) because for some queries it can be 
quite expensive.  In the face of custom queries, it can't know for sure whether 
skipping the rewrite is okay.  I recommend extending WSTE and adding some 
instanceof checks for your custom queries so that you do whatever the right 
thing is for your query.  If you use the UnifiedHighlighter, new in Lucene as 
of 6.3, there are several callback hooks provided for this sort of thing, so 
you don't have to touch its internals (which use WSTE).
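
To make the instanceof suggestion concrete, here is a rough sketch of the 
pattern. It deliberately uses small stand-in classes rather than the real 
Lucene types, so Reader, Query, PhraseLikeQuery, CustomQuery, Extractor and 
CustomAwareExtractor are all hypothetical names; the real WSTE method 
signatures differ between versions and should be checked against the javadocs 
for your release.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch only: every type here is a stand-in, NOT the real Lucene API.
final class RewriteBeforeExtractSketch {

    /** Stand-in for IndexReader; carries no data in this sketch. */
    static class Reader {}

    /** Stand-in for Query; by default rewriting returns the query unchanged. */
    static class Query {
        Query rewrite(Reader reader) { return this; }
    }

    /** Stand-in for MultiPhraseQuery: exposes its terms directly. */
    static class PhraseLikeQuery extends Query {
        final List<String> terms;
        PhraseLikeQuery(List<String> terms) { this.terms = terms; }
    }

    /** Hypothetical custom query whose terms are only known after rewrite. */
    static class CustomQuery extends PhraseLikeQuery {
        CustomQuery() { super(Collections.<String>emptyList()); }

        @Override
        Query rewrite(Reader reader) {
            // A real query would consult the reader here.
            return new PhraseLikeQuery(Arrays.asList("lucene", "highlighter"));
        }
    }

    /** Stand-in for WSTE: phrase-like queries are handled without a rewrite. */
    static class Extractor {
        final Reader reader = new Reader();

        void extract(Query query, List<String> out) {
            if (query instanceof PhraseLikeQuery) {
                // Special case: terms are read directly, rewrite is skipped.
                out.addAll(((PhraseLikeQuery) query).terms);
            } else {
                // Fallback: rewrite and re-dispatch if the query changed.
                Query rewritten = query.rewrite(reader);
                if (rewritten != query) {
                    extract(rewritten, out);
                }
            }
        }
    }

    /** Subclass that rewrites the custom query before the default dispatch. */
    static class CustomAwareExtractor extends Extractor {
        @Override
        void extract(Query query, List<String> out) {
            if (query instanceof CustomQuery) {
                super.extract(query.rewrite(reader), out);
            } else {
                super.extract(query, out);
            }
        }
    }

    static List<String> termsFrom(Extractor extractor, Query query) {
        List<String> out = new ArrayList<>();
        extractor.extract(query, out);
        return out;
    }

    public static void main(String[] args) {
        // Default extractor skips the rewrite, so no terms are found: []
        System.out.println(termsFrom(new Extractor(), new CustomQuery()));
        // Subclass rewrites first: [lucene, highlighter]
        System.out.println(termsFrom(new CustomAwareExtractor(), new CustomQuery()));
    }
}
```

The point of the override is that only your own query type pays the cost of 
rewrite(); every other query still goes through the stock dispatch unchanged.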


BTW it's obvious to me that Query needs some sort of visitor API.  It's very 
much related to maintaining the highlighters with respect to new/different 
queries.  https://issues.apache.org/jira/browse/LUCENE-3041

Good luck,

~ David

> On Dec 21, 2016, at 1:27 PM, Thomas Kappler <[email protected]> 
> wrote:
> 
> Hi,
>  
> We have implemented a custom query that extends MultiPhraseQuery (MPQ) 
> because it uses MPQ’s getTermArrays() and getPositions(). We’d like to use 
> this query for highlighting, but we’re facing the following issue.
>  
> In highlighter/WeightedSpanTermExtractor, the extract() method does a series 
> of instanceof checks. There is a special case for MPQ. This branch does not 
> call rewrite(IndexReader) on the query, but our custom query needs rewriting 
> to work properly.
>  
> As a test, I commented out the MPQ branch in WeightedSpanTermExtractor so 
> the code takes the last else branch, where rewrite(IndexReader) is called, 
> and our tests pass.
>  
> My questions are:
> 1. Are we doing it wrong when our query *needs* rewriting? Our query logic 
> needs the IndexReader that’s passed to rewrite(IndexReader). Where else would 
> we put such code?
> 2. If we aren’t doing it wrong, how can we use the highlighter? Extend Query 
> instead of MPQ and copy the tracking of term arrays and positions from MPQ?
>  
> Thanks,
> Thomas
