[ http://issues.apache.org/jira/browse/LUCENE-627?page=comments#action_12421024 ]
Mark Harwood commented on LUCENE-627:
-------------------------------------

>>It seems like maybe the only way to handle some of this stuff is two passes

The highlighter does not expect token positions to "rewind" in this manner. I'm not sure where this ends. Imagine an analyzer which, having considered and emitted tokens for a whole document, chooses to append some tokens whose offsets reference much earlier sections of the document. (Why, I'm not sure, but there's nothing to say this couldn't happen.)

>>It seems like maybe the only way to handle some of this stuff is two passes

Maybe a special "OrderFixer" TokenStream could be used to wrap "rewinding" token streams such as yours, accumulating all tokens in a buffer and then sorting and outputting them in ascending start offset order. As long as the Highlighter ignores position increments and uses only offsets (as it does currently), I suspect all would be OK.

> highlighter problems with overlapping tokens
> --------------------------------------------
>
>          Key: LUCENE-627
>          URL: http://issues.apache.org/jira/browse/LUCENE-627
>      Project: Lucene - Java
>         Type: Bug
>   Components: Other
>     Versions: 2.0.1
>     Reporter: Yonik Seeley
>
> The lucene highlighter has problems when tokens that overlap are generated.
> For example, if analysis of iPod generates the tokens "i", "pod", "ipod"
> (with pod and ipod in the same position),
> then the highlighter will output this as iipod, regardless of whether any of those
> tokens are highlighted.
> Discovered via http://issues.apache.org/jira/browse/SOLR-24
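
For illustration, the "OrderFixer" idea from Mark Harwood's comment above (buffer every token, then replay them in ascending start offset order) might look roughly like the sketch below. It is not code from the issue or from Lucene: `Tok` and `TokSource` are hypothetical stand-ins for Lucene's token and token stream types, chosen so the example compiles without the Lucene jar, and the real API differs.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a Lucene token: term text plus character offsets.
final class Tok {
    final String text;
    final int start; // start offset in the original text
    final int end;   // end offset in the original text

    Tok(String text, int start, int end) {
        this.text = text;
        this.start = start;
        this.end = end;
    }
}

// Hypothetical stand-in for a token stream: next() returns null when exhausted.
interface TokSource {
    Tok next();
}

// Wraps a stream whose offsets may "rewind" and replays its tokens
// in ascending start offset order.
final class OrderFixer implements TokSource {
    private final TokSource input;
    private Iterator<Tok> sorted; // populated lazily on the first call to next()

    OrderFixer(TokSource input) {
        this.input = input;
    }

    @Override
    public Tok next() {
        if (sorted == null) {
            // First pass: drain the wrapped stream into a buffer.
            List<Tok> buffer = new ArrayList<>();
            for (Tok t = input.next(); t != null; t = input.next()) {
                buffer.add(t);
            }
            // List.sort is stable, so tokens that share a start offset
            // keep the relative order in which the analyzer emitted them.
            buffer.sort(Comparator.comparingInt((Tok tok) -> tok.start));
            sorted = buffer.iterator();
        }
        // Second pass: hand the tokens back one at a time.
        return sorted.hasNext() ? sorted.next() : null;
    }
}
```

Taking the iPod example from the issue and assuming offsets of "i" = 0-1, "pod" = 1-4 and "ipod" = 0-4 (the issue does not state them), a source that emits i, pod, ipod would come out of the wrapper as i, ipod, pod. The cost is that the whole token stream is held in memory before anything is returned, which is the "two passes" trade-off raised in the quoted comment.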