[ 
http://issues.apache.org/jira/browse/LUCENE-627?page=comments#action_12421024 ] 

Mark Harwood commented on LUCENE-627:
-------------------------------------

>>It seems like maybe the only way to handle some of this stuff is two passes

The highlighter does not expect token positions to "rewind" in this manner. I'm 
not sure where this ends. Imagine an analyzer which, having considered and 
emitted tokens for a whole document, chooses to append some tokens whose 
offsets reference much earlier sections of the document. (Why, I'm not sure, 
but there's nothing to say this couldn't happen.)

>>It seems like maybe the only way to handle some of this stuff is two passes

Maybe a special "OrderFixer" TokenStream could be used to wrap "rewinding" 
token streams such as yours, accumulating all tokens in a buffer before 
sorting and outputting them in ascending start offset order. If the 
Highlighter ignored position increment and just used offsets (as it does 
currently), I suspect all would be OK.
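
To make the idea concrete, here is a rough sketch of what such an "OrderFixer" 
might look like. It is not written against the real Lucene classes: Tok and 
TokSource are hypothetical stand-ins for Token and TokenStream, introduced only 
so the example is self-contained, and the null-at-end convention is an 
assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a token: term text plus character offsets. */
final class Tok {
    final String text;
    final int startOffset;
    final int endOffset;

    Tok(String text, int startOffset, int endOffset) {
        this.text = text;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
    }
}

/** Hypothetical stand-in for a token stream; next() returns null when exhausted. */
interface TokSource {
    Tok next();
}

/** Buffers every token of the wrapped source, then replays them by ascending start offset. */
final class OrderFixer implements TokSource {
    private final TokSource input;
    private Iterator<Tok> sorted; // filled lazily on the first call to next()

    OrderFixer(TokSource input) {
        this.input = input;
    }

    @Override
    public Tok next() {
        if (sorted == null) {
            // First pass: drain the wrapped ("rewinding") source completely.
            List<Tok> buffer = new ArrayList<>();
            for (Tok t = input.next(); t != null; t = input.next()) {
                buffer.add(t);
            }
            // List.sort is stable, so tokens sharing a start offset keep
            // the relative order in which the source emitted them.
            buffer.sort(Comparator.comparingInt((Tok t) -> t.startOffset));
            sorted = buffer.iterator();
        }
        // Second pass: hand the tokens out in offset order.
        return sorted.hasNext() ? sorted.next() : null;
    }
}
```

The obvious cost is that the whole stream is held in memory before the first 
token comes out, so this is effectively the "two passes" approach moved into a 
wrapper.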



> highlighter problems with overlapping tokens
> --------------------------------------------
>
>          Key: LUCENE-627
>          URL: http://issues.apache.org/jira/browse/LUCENE-627
>      Project: Lucene - Java
>         Type: Bug

>   Components: Other
>     Versions: 2.0.1
>     Reporter: Yonik Seeley

>
> The Lucene highlighter has problems when overlapping tokens are generated.
> For example, if analysis of iPod generates the tokens "i", "pod", "ipod" 
> (with pod and ipod in the same position),
> then the highlighter will output this as iipod, regardless of whether any of 
> those tokens are highlighted.
> Discovered via http://issues.apache.org/jira/browse/SOLR-24

