[ 
https://issues.apache.org/jira/browse/LUCENE-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12718795#action_12718795
 ] 

Michael McCandless commented on LUCENE-1545:
--------------------------------------------

bq. but if you want, i'm willing to come up with some minor grammar changes for 
StandardAnalyzer that could help things like this.

Is it possible to conditionalize, at runtime, certain parts of a JFlex grammar? 
I.e., with matchVersion (LUCENE-1684) we could preserve back-compat on this 
issue, but I'm not sure how to cleanly push that matchVersion (provided at 
runtime to StandardAnalyzer's ctor) "down" into the grammar so that, e.g., we're 
not forced to make a new full copy of the grammar for each fix.  (Though perhaps 
that's an OK solution, since it'd make it easy to strongly guarantee back 
compat...)
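
To make the "one grammar copy per fix" option concrete, here is a minimal 
sketch, assuming the version is resolved once at construction time. Every name 
in it (VersionedTokenizerSketch, MatchVersion, Scanner, scannerFor) is 
hypothetical and not an existing Lucene or JFlex API, and the two hand-written 
scanners only stand in for what would really be two JFlex-generated classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical sketch: none of these names are Lucene or JFlex APIs.
final class VersionedTokenizerSketch {

    /** Stand-in for the matchVersion argument proposed in LUCENE-1684. */
    enum MatchVersion { V_2_4, V_2_9 }

    /** Common interface that each generated grammar copy would implement. */
    interface Scanner {
        List<String> tokenize(String text);
    }

    /** Mimics the reported 2.4 behavior: a combining mark ends the token and is dropped. */
    static final Scanner LEGACY =
        text -> split(text, cp -> Character.isLetterOrDigit(cp));

    /** Fixed behavior: a combining mark continues the current token. */
    static final Scanner FIXED =
        text -> split(text, cp -> Character.isLetterOrDigit(cp) || isCombiningMark(cp));

    /** The version is consulted once, here; neither scanner branches on it afterwards. */
    static Scanner scannerFor(MatchVersion version) {
        return version.compareTo(MatchVersion.V_2_9) >= 0 ? FIXED : LEGACY;
    }

    static boolean isCombiningMark(int cp) {
        int type = Character.getType(cp);
        return type == Character.NON_SPACING_MARK
            || type == Character.COMBINING_SPACING_MARK
            || type == Character.ENCLOSING_MARK;
    }

    /** Splits text into maximal runs of code points accepted by isTokenChar. */
    static List<String> split(String text, IntPredicate isTokenChar) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isTokenChar.test(cp)) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String word = "mo\u0364chte";
        // Old behavior is preserved for old versions: prints [mo, chte]
        System.out.println(scannerFor(MatchVersion.V_2_4).tokenize(word));
        // New versions get the fix: prints a list holding the single, unsplit token
        System.out.println(scannerFor(MatchVersion.V_2_9).tokenize(word));
    }
}
```

The cost is the duplication described above; the benefit is that the old 
scanner is never touched again, so its output cannot drift.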

> Standard analyzer does not correctly tokenize combining character U+0364 
> COMBINING LATIN SMALL LETTER E
> -------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1545
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1545
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Analysis
>    Affects Versions: 2.4
>         Environment: Linux x86_64, Sun Java 1.6
>            Reporter: Andreas Hauser
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: AnalyzerTest.java
>
>
> Standard analyzer does not correctly tokenize combining character U+0364 
> COMBINING LATIN SMALL LETTER E.
> The word "moͤchte" is incorrectly tokenized into "mo" "chte"; the combining 
> character is lost.
> Expected result is only one token, "moͤchte".
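
For reference, a small check of how the JDK classifies U+0364. It is a 
non-spacing mark rather than a letter, so any rule that matches only on letters 
will break the word at that point. The class name CombiningMarkCheck is 
hypothetical; only java.lang.Character methods are used:

```java
// Sketch only: the class name is hypothetical; it calls java.lang.Character and nothing else.
final class CombiningMarkCheck {

    /** True if the code point is in a Unicode letter category. */
    static boolean isLetter(int codePoint) {
        return Character.isLetter(codePoint);
    }

    /** True if the code point is in general category Mn (non-spacing mark). */
    static boolean isNonSpacingMark(int codePoint) {
        return Character.getType(codePoint) == Character.NON_SPACING_MARK;
    }

    public static void main(String[] args) {
        int cp = 0x0364;
        System.out.println("isLetter:         " + isLetter(cp));         // false
        System.out.println("isNonSpacingMark: " + isNonSpacingMark(cp)); // true
    }
}
```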

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

