[ http://issues.apache.org/jira/browse/LUCENE-627?page=comments#action_12421332 ]

Yonik Seeley commented on LUCENE-627:
-------------------------------------

So Mark, does this patch look OK? Without it, even if I order the tokens by startOffset, I get things like

  HiHi-Speed <em>USB</em>

WordDelimiterFilter (which is what produces these types of tokens) is widely used in Solr-land, so I'm eager to get this fixed.

> highlighter problems with overlapping tokens
> --------------------------------------------
>
>                 Key: LUCENE-627
>                 URL: http://issues.apache.org/jira/browse/LUCENE-627
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Other
>    Affects Versions: 2.0.1
>            Reporter: Yonik Seeley
>         Attachments: highlight_overlap.diff
>
>
> The Lucene highlighter has problems when tokens that overlap are generated.
> For example, if analysis of iPod generates the tokens "i", "pod", "ipod"
> (with pod and ipod in the same position),
> then the highlighter will output this as iipod, regardless of whether any of those
> tokens are highlighted.
> Discovered via http://issues.apache.org/jira/browse/SOLR-24
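
For anyone following along, here is a rough sketch of one way to avoid the duplicated text described above. It is not the attached highlight_overlap.diff and does not use the Lucene highlighter API; the class OverlapHighlightSketch, the Tok type, and the highlight method are all hypothetical names made up for illustration. The idea, under those assumptions, is to merge tokens whose character offsets overlap into a single span and copy each span of the original text exactly once, so "i", "pod", and "ipod" together contribute "iPod" a single time.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

// Illustrative only: a standalone sketch, not Lucene's highlighter.
class OverlapHighlightSketch {

    /** Hypothetical stand-in for an analyzer token: term plus [start, end) offsets. */
    static final class Tok {
        final String term;
        final int start;
        final int end;

        Tok(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Rebuilds the text, wrapping a span in <em> when any token in it is a hit.
     * Tokens whose offsets overlap are merged into one span first, so the
     * underlying text is never emitted twice.
     */
    static String highlight(String text, List<Tok> tokens, Set<String> hits) {
        List<Tok> sorted = new ArrayList<>(tokens);
        sorted.sort(Comparator.comparingInt((Tok t) -> t.start));

        StringBuilder out = new StringBuilder();
        int lastEnd = 0;          // end offset of text already copied to out
        int gStart = -1;          // current merged span, -1 means "none yet"
        int gEnd = -1;
        boolean gHit = false;

        for (Tok t : sorted) {
            if (gStart >= 0 && t.start < gEnd) {
                // Overlaps the current span: extend it instead of emitting text again.
                gEnd = Math.max(gEnd, t.end);
                gHit |= hits.contains(t.term);
            } else {
                if (gStart >= 0) {
                    lastEnd = flush(text, out, lastEnd, gStart, gEnd, gHit);
                }
                gStart = t.start;
                gEnd = t.end;
                gHit = hits.contains(t.term);
            }
        }
        if (gStart >= 0) {
            lastEnd = flush(text, out, lastEnd, gStart, gEnd, gHit);
        }
        out.append(text, lastEnd, text.length()); // trailing text after the last token
        return out.toString();
    }

    /** Copies the gap before the span, then the span itself; returns the new lastEnd. */
    private static int flush(String text, StringBuilder out, int lastEnd,
                             int gStart, int gEnd, boolean hit) {
        out.append(text, lastEnd, gStart);
        if (hit) out.append("<em>");
        out.append(text, gStart, gEnd);
        if (hit) out.append("</em>");
        return gEnd;
    }
}
```

With tokens i(0,1), pod(1,4), ipod(0,4) over "iPod" and no hits, this returns "iPod" rather than "iipod". With tokens hi(0,2), speed(3,8), hispeed(0,8), usb(9,12) over "Hi-Speed USB" and "usb" as the only hit, it returns "Hi-Speed <em>USB</em>". One trade-off in this sketch: a hit on any token in a merged span highlights the whole span, so a match on "pod" alone would mark up all of "iPod".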