[ 
https://issues.apache.org/jira/browse/LUCENE-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13652885#comment-13652885
 ] 

Jack Krupansky commented on LUCENE-3907:
----------------------------------------

Look, the "fix" for the position bugs here is to keep the position the same 
for all tokens, right? That logic can simply be applied to "back" as well, for 
the same reasons and with the same effect. So how could "back", which should 
apply that same position logic, be a separate cause of "highlighting bugs"?

"previous behavior" (incremented position) is simply NOT linked to front vs. 
back. I'm not sure why you are claiming that it is!

The Jira record simply shows that some people "want" to eliminate a feature... 
not that the feature (if fixed in the same manner as the rest of the fix) 
"could trigger highlighting bugs" - unless I'm missing something, and if I'm 
missing something it is because you are not stating it clearly! So, please do 
so.
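
To make the argument concrete, here is a minimal sketch in plain Java (no 
Lucene classes) of edge n-grams that all share one position. The names 
EdgeNGramPositions, Gram, Side and edgeNGrams are hypothetical and are not 
taken from the attached patch. It assumes "same position" means the first gram 
gets a position increment of 1 and every later gram gets 0. It also works on 
Java chars, so it ignores the surrogate-pair problem mentioned in the issue.

```java
import java.util.ArrayList;
import java.util.List;

public class EdgeNGramPositions {
    // Which edge of the token the grams are taken from.
    enum Side { FRONT, BACK }

    // Hypothetical token record: term text plus its position increment.
    static final class Gram {
        final String term;
        final int posInc;
        Gram(String term, int posInc) { this.term = term; this.posInc = posInc; }
        @Override public String toString() { return term + "(+" + posInc + ")"; }
    }

    // Emits edge n-grams of one token. The position logic is identical for
    // FRONT and BACK: only the substring bounds differ.
    static List<Gram> edgeNGrams(String token, int minGram, int maxGram, Side side) {
        List<Gram> out = new ArrayList<>();
        int max = Math.min(maxGram, token.length());
        for (int n = minGram; n <= max; n++) {
            String term = (side == Side.FRONT)
                ? token.substring(0, n)
                : token.substring(token.length() - n);
            // Assumption: the first gram advances the position by 1, the
            // rest are stacked on it with an increment of 0.
            out.add(new Gram(term, out.isEmpty() ? 1 : 0));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("lucene", 1, 3, Side.FRONT));
        System.out.println(edgeNGrams("lucene", 1, 3, Side.BACK));
    }
}
```

In this sketch the increments come out as 1, 0, 0 for both sides; nothing in 
the position handling depends on front vs. back.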
                
> Improve the Edge/NGramTokenizer/Filters
> ---------------------------------------
>
>                 Key: LUCENE-3907
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3907
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Adrien Grand
>              Labels: gsoc2013
>             Fix For: 4.3
>
>         Attachments: LUCENE-3907.patch
>
>
> Our ngram tokenizers/filters could use some love.  EG, they output ngrams in 
> multiple passes, instead of "stacked", which messes up offsets/positions and 
> requires too much buffering (can hit OOME for long tokens).  They clip at 
> 1024 chars (tokenizers) but don't (token filters).  They split up surrogate 
> pairs incorrectly.

