romseygeek commented on a change in pull request #2127: URL: https://github.com/apache/lucene-solr/pull/2127#discussion_r538157532
########## File path: lucene/highlighter/src/test/org/apache/lucene/search/matchhighlight/TestMatchRegionRetriever.java ########## @@ -361,6 +374,41 @@ public void testIntervalQueries() throws IOException { ); } + @Test + public void testDegenerateIntervalsWithPositions() throws IOException { + testDegenerateIntervals(FLD_TEXT_POS); + } + + @Test @AwaitsFix(bugUrl = "https://issues.apache.org/jira/browse/LUCENE-9634: " + Review comment: So `extend` will widen the bounds of an interval's positions, but leave its offsets untouched (because it has no way of knowing what the offsets actually are). I sort of think that just highlighting the original term is the correct behaviour? But there will be a discrepancy when we generate offsets directly from the token stream by comparing to positions. I see that ExtendedIntervalIterator's javadoc is incorrect regarding prefixes. It says ``` An interval with prefix bounds extended by n will skip over matches that appear in positions lower than n ``` but it actually just readjusts these matches to start at position 0. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org