[GitHub] [lucene-solr] romseygeek commented on a change in pull request #2127: LUCENE-9633: Improve match highlighter behavior for degenerate intervals

GitBox Tue, 08 Dec 2020 01:04:42 -0800


romseygeek commented on a change in pull request #2127:
URL: https://github.com/apache/lucene-solr/pull/2127#discussion_r538157532




##########
File path: 
lucene/highlighter/src/test/org/apache/lucene/search/matchhighlight/TestMatchRegionRetriever.java
##########
@@ -361,6 +374,41 @@ public void testIntervalQueries() throws IOException {
     );
   }
 
+  @Test
+  public void testDegenerateIntervalsWithPositions() throws IOException {
+    testDegenerateIntervals(FLD_TEXT_POS);
+  }
+
+  @Test @AwaitsFix(bugUrl = 
"https://issues.apache.org/jira/browse/LUCENE-9634: " +

Review comment:
       So `extend` will widen the bounds of an interval's positions, but leave 
its offsets untouched (because it has no way of knowing what the offsets 
actually are).  I sort of think that just highlighting the original term is the 
correct behaviour?  But there will be a discrepancy when we generate offsets 
directly from the token stream by comparing to positions.
   
   I see that ExtendedIntervalIterator's javadoc is incorrect regarding 
prefixes.  It says 
   ```
   An interval with prefix bounds extended by n will skip over matches that 
appear in positions lower than n
   ```
   but it actually just readjusts these matches to start at position 0.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene-solr] romseygeek commented on a change in pull request #2127: LUCENE-9633: Improve match highlighter behavior for degenerate intervals

Reply via email to