[ 
https://issues.apache.org/jira/browse/LUCENE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842206#comment-13842206
 ] 

Steve Rowe commented on LUCENE-5357:
------------------------------------

No problem Robert, thanks for taking a look.

About back-compat: none of the JFlex-based tokenizers on trunk currently have 
version-based behavior, in contrast to branch_4x.  One could argue that this is 
because all previous back-compat versions were for 3.X, but this issue 
introduced a 4.0 version, which falls within the version X-1 window for 
trunk/5.0.  Should I forward-port the 4.0 back-compat stuff from branch_4x for 
StandardTokenizer and UAX29URLEmailTokenizer?  Other analysis components on 
trunk still behave differently based on version, so the practice has clearly 
not been abandoned there.
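
To make "version-based behavior" concrete, here is a rough sketch of the 
pattern in plain Java.  It is not Lucene code: MatchVersion, Splitter, 
LegacySplitter, CurrentSplitter and forVersion are all hypothetical names, and 
the two splitting rules are invented stand-ins for an old and a new grammar, 
not the actual StandardTokenizer rules.  The only point is that the caller's 
requested version selects which implementation runs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: none of these types are Lucene APIs.
final class VersionGatedSketch {

    // Stand-in for a release-version constant; declaration order is release order.
    enum MatchVersion { V_3_1, V_4_0, V_4_7 }

    interface Splitter {
        List<String> split(String text);
    }

    // Invented "old" rule: break on whitespace only.
    static final class LegacySplitter implements Splitter {
        public List<String> split(String text) {
            return nonEmpty(text.split("\\s+"));
        }
    }

    // Invented "new" rule: break on anything that is not a letter or digit.
    static final class CurrentSplitter implements Splitter {
        public List<String> split(String text) {
            return nonEmpty(text.split("[^\\p{L}\\p{N}]+"));
        }
    }

    // Version gate: requests older than 4.7 keep the legacy behavior.
    static Splitter forVersion(MatchVersion version) {
        return version.compareTo(MatchVersion.V_4_7) >= 0
                ? new CurrentSplitter()
                : new LegacySplitter();
    }

    // Drops the empty strings that String.split can produce at the start.
    private static List<String> nonEmpty(String[] parts) {
        List<String> out = new ArrayList<>();
        for (String part : parts) {
            if (!part.isEmpty()) {
                out.add(part);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(forVersion(MatchVersion.V_4_0).split("foo-bar baz")); // [foo-bar, baz]
        System.out.println(forVersion(MatchVersion.V_4_7).split("foo-bar baz")); // [foo, bar, baz]
    }
}
```

Forward-porting the back-compat code would amount to keeping something like 
the legacy branch alive on trunk; dropping it would leave only the current 
implementation regardless of the version requested.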

> Upgrade StandardTokenizer & co to latest unicode rules
> ------------------------------------------------------
>
>                 Key: LUCENE-5357
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5357
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/analysis
>            Reporter: Robert Muir
>            Assignee: Steve Rowe
>             Fix For: 5.0, 4.7
>
>         Attachments: LUCENE-5357.patch
>
>
> besides any change in data, the rules have also changed (regional indicators, 
> better handling for hebrew, etc)


