Marc Morissette created LUCENE-8365:
---------------------------------------
Summary: ArrayIndexOutOfBoundsException in UnifiedHighlighter
Key: LUCENE-8365
URL: https://issues.apache.org/jira/browse/LUCENE-8365
Project: Lucene - Core
Issue Type: Bug
Components: modules/highlighter
Affects Versions: 7.3.1
Reporter: Marc Morissette
We see an ArrayIndexOutOfBoundsException coming out of the UnifiedHighlighter in
our production logs from time to time:
{code}
java.lang.ArrayIndexOutOfBoundsException
at java.base/java.lang.System.arraycopy(Native Method)
at org.apache.lucene.search.uhighlight.PhraseHelper$SpanCollectedOffsetsEnum.add(PhraseHelper.java:386)
at org.apache.lucene.search.uhighlight.PhraseHelper$OffsetSpanCollector.collectLeaf(PhraseHelper.java:341)
at org.apache.lucene.search.spans.TermSpans.collect(TermSpans.java:121)
at org.apache.lucene.search.spans.NearSpansOrdered.collect(NearSpansOrdered.java:149)
at org.apache.lucene.search.spans.NearSpansUnordered.collect(NearSpansUnordered.java:171)
at org.apache.lucene.search.spans.FilterSpans.collect(FilterSpans.java:120)
at org.apache.lucene.search.uhighlight.PhraseHelper.createOffsetsEnumsForSpans(PhraseHelper.java:261)
...
{code}
It turns out that there is an off-by-one error in the UnifiedHighlighter code
that, as far as I can tell, is currently only triggered when two nested
SpanNearQueries contain the same term.
The behaviour depends on the document being highlighted. In most cases, some
terms fail to be highlighted. In others, an exception is thrown.
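
To illustrate how a single off-by-one in a System.arraycopy call can produce both symptoms (silently lost entries on some inputs, an exception on others), here is a minimal, self-contained sketch. It is not the PhraseHelper code: the class and method names are hypothetical, and the actual bug in Lucene may have a different shape. It only assumes a sorted int array into which an out-of-order value is inserted.

```java
import java.util.Arrays;

public class OffByOneInsertSketch {

    // Hypothetical helper (not Lucene code): inserts value at idx into the
    // first `size` slots of arr, shifting the tail one slot to the right.
    // The copy length is one too small, so the last element is not moved.
    static void insertBuggy(int[] arr, int size, int idx, int value) {
        System.arraycopy(arr, idx, arr, idx + 1, size - idx - 1); // bug
        arr[idx] = value;
    }

    // Same helper with the correct copy length.
    static void insertFixed(int[] arr, int size, int idx, int value) {
        System.arraycopy(arr, idx, arr, idx + 1, size - idx);
        arr[idx] = value;
    }

    public static void main(String[] args) {
        // Insert in the middle: the buggy version overwrites 30 without any error.
        int[] a = {10, 20, 30, 0};
        insertBuggy(a, 3, 1, 15);
        System.out.println(Arrays.toString(a)); // [10, 15, 20, 0]

        int[] b = {10, 20, 30, 0};
        insertFixed(b, 3, 1, 15);
        System.out.println(Arrays.toString(b)); // [10, 15, 20, 30]

        // Insert at the end: the buggy copy length becomes -1 and arraycopy throws.
        int[] c = {10, 20, 30, 0};
        try {
            insertBuggy(c, 3, 3, 40);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("exception: " + e.getClass().getSimpleName());
        }
    }
}
```

In this sketch, whether the failure is silent or loud depends only on where the new value lands, which is consistent with the behaviour varying from document to document.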