[ 
https://issues.apache.org/jira/browse/LUCENE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778396#action_12778396
 ] 

Robert Muir commented on LUCENE-2068:
-------------------------------------

bq. Is this an improvement or a bug? The summary sounds kind of buggish ...

I'll try to restrain myself, but I think we should have fixed Unicode 4 support 
in Lucene 3.0, because that would match the Unicode version of Java 1.5.
The problem is that we could not do any of this in 2.9, because you need Java 1.5 to 
actually implement most of the support, so it was a chicken-and-egg problem.

IMHO it's all bugs, but I'll list these issues as improvements :)

> fix reverseStringFilter for unicode 4.0
> ---------------------------------------
>
>                 Key: LUCENE-2068
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2068
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>            Reporter: Robert Muir
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: LUCENE-2068.patch, LUCENE_2068.patch
>
>
> ReverseStringFilter is not aware of supplementary characters: when it 
> reverses, it creates unpaired surrogates, which the indexer replaces with 
> U+FFFD (but this does not happen at query time).
> The wrong words will conflate with each other and the right words won't match; 
> basically the whole thing falls apart.
> This patch implements in-place reverse with the algorithm from Apache Harmony's 
> AbstractStringBuilder.reverse0().
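
For anyone following along, here is a rough sketch of what a surrogate-aware in-place reverse can look like. It is not the code from the attached patch and not the Harmony reverse0() routine; it is a simpler two-pass variant (reverse every char, then swap back any surrogate pair that ended up in low/high order). The class and method names are made up for illustration, and it only uses Character methods available since Java 1.5.

```java
// Hypothetical helper for illustration only; not the LUCENE-2068 patch.
final class SurrogateAwareReverse {

    /** Reverses buffer[start, start + len) in place, keeping surrogate pairs in high/low order. */
    static void reverse(char[] buffer, int start, int len) {
        if (len < 2) {
            return;
        }
        int end = start + len - 1;

        // Pass 1: naive char-by-char reversal. This alone turns every valid
        // (high, low) surrogate pair into an invalid (low, high) sequence.
        for (int i = start, j = end; i < j; i++, j--) {
            char tmp = buffer[i];
            buffer[i] = buffer[j];
            buffer[j] = tmp;
        }

        // Pass 2: repair. Any (low, high) sequence found now was a valid
        // pair before pass 1, so swap it back into (high, low) order.
        for (int i = start; i < end; i++) {
            if (Character.isLowSurrogate(buffer[i])
                    && Character.isHighSurrogate(buffer[i + 1])) {
                char tmp = buffer[i];
                buffer[i] = buffer[i + 1];
                buffer[i + 1] = tmp;
                i++; // skip the second half of the repaired pair
            }
        }
    }

    public static void main(String[] args) {
        // 'a', U+1D50A (one supplementary character, two chars), 'b'
        char[] buf = "a\uD835\uDD0Ab".toCharArray();
        reverse(buf, 0, buf.length);
        System.out.println(new String(buf).equals("b\uD835\uDD0Aa")); // prints true
    }
}
```

One caveat with this approach: surrogates that were already unpaired in the input stay unpaired or may pair up after reversal, which is the same behaviour StringBuilder.reverse() documents.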

