Marc Morissette created LUCENE-8365:
---------------------------------------

             Summary: ArrayIndexOutOfBoundsException in UnifiedHighlighter
                 Key: LUCENE-8365
                 URL: https://issues.apache.org/jira/browse/LUCENE-8365
             Project: Lucene - Core
          Issue Type: Bug
          Components: modules/highlighter
    Affects Versions: 7.3.1
            Reporter: Marc Morissette


We see an ArrayIndexOutOfBoundsException coming out of the UnifiedHighlighter 
in our production logs from time to time:

{code}
java.lang.ArrayIndexOutOfBoundsException
        at java.base/java.lang.System.arraycopy(Native Method)
        at org.apache.lucene.search.uhighlight.PhraseHelper$SpanCollectedOffsetsEnum.add(PhraseHelper.java:386)
        at org.apache.lucene.search.uhighlight.PhraseHelper$OffsetSpanCollector.collectLeaf(PhraseHelper.java:341)
        at org.apache.lucene.search.spans.TermSpans.collect(TermSpans.java:121)
        at org.apache.lucene.search.spans.NearSpansOrdered.collect(NearSpansOrdered.java:149)
        at org.apache.lucene.search.spans.NearSpansUnordered.collect(NearSpansUnordered.java:171)
        at org.apache.lucene.search.spans.FilterSpans.collect(FilterSpans.java:120)
        at org.apache.lucene.search.uhighlight.PhraseHelper.createOffsetsEnumsForSpans(PhraseHelper.java:261)
...
{code}

It turns out that there is an off-by-one error in the UnifiedHighlighter code 
that, as far as I can tell, is currently only triggered when two nested 
SpanNearQueries contain the same term.

The behaviour depends on the highlighted document. In most cases, some terms 
fail to be highlighted. In others, the exception above is thrown.
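
To illustrate how a single off-by-one error could plausibly produce both 
symptoms, here is a minimal sketch in plain Java. It is not the PhraseHelper 
code: the class SortedOffsetsSketch, its fields, and the offByOne flag are 
hypothetical names made up for this example. It assumes a sorted insert into 
a fixed-capacity int array that shifts later entries with System.arraycopy, 
which is only a guess at the general shape suggested by the stack trace.

```java
import java.util.Arrays;

// Hypothetical illustration only; NOT the actual Lucene PhraseHelper code.
final class SortedOffsetsSketch {
    private int[] offsets = new int[4]; // small capacity so the failure is easy to reach
    private int size = 0;
    private final boolean offByOne;     // true = simulate a mis-indexed shift

    SortedOffsetsSketch(boolean offByOne) {
        this.offByOne = offByOne;
    }

    // Inserts offset in ascending order, ignoring exact duplicates.
    void add(int offset) {
        // Scan backwards: new entries are expected at or near the end.
        int idx = size - 1;
        for (; idx >= 0; idx--) {
            if (offsets[idx] == offset) {
                return; // already present
            }
            if (offsets[idx] < offset) {
                break; // insert to the right of idx
            }
        }
        int insertAt = idx + 1;
        if (size == offsets.length) {
            offsets = Arrays.copyOf(offsets, size * 2); // grow only when full
        }
        int shiftLen = size - insertAt;
        if (shiftLen > 0) {
            // The correct shift starts at insertAt; the simulated bug starts one slot late.
            int from = offByOne ? insertAt + 1 : insertAt;
            System.arraycopy(offsets, from, offsets, from + 1, shiftLen);
        }
        offsets[insertAt] = offset;
        size++;
    }

    // Returns a copy of the stored entries.
    int[] toArray() {
        return Arrays.copyOf(offsets, size);
    }
}
```

With the flag set, adding 10, 30, 20 overwrites the 30 instead of shifting 
it, so an entry is lost without any error. Adding 10, 20, 30 and then 5 
makes the shifted range run past the end of the array, so System.arraycopy 
throws ArrayIndexOutOfBoundsException. With the flag cleared, both sequences 
end up sorted and complete.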
