[
https://issues.apache.org/jira/browse/LANG-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cédrik LIME updated LANG-285:
-----------------------------
Attachment: StringUtilsAccents.patch
This patch fixes the "stripAccents" performance issues and extends the logic to
fall back to sun.text.Normalizer (Java <= 1.5) or ICU4J when the Java 6
java.text.Normalizer is unavailable.
Please note that Unicode decomposition does not remove every "interesting"
character; notably, ligatures and curly quotes remain as is, which may
ultimately not be what the bug reporter wanted. See my previous comment for
details about ASCII folding.
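For illustration only, here is a minimal sketch of the decomposition approach described above, assuming Java 6's java.text.Normalizer is available. It is not the attached patch: the class name StripAccentsSketch and the method body are hypothetical, and the sun.text.Normalizer / ICU4J fallbacks are left out.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

// Hypothetical sketch, not the code from StringUtilsAccents.patch.
class StripAccentsSketch {

    // Compiled once; recompiling the regex on every call is a typical
    // source of avoidable overhead in this kind of method.
    private static final Pattern COMBINING_MARKS =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}+");

    static String stripAccents(String input) {
        if (input == null) {
            return null;
        }
        // NFD splits e.g. U+00E9 into 'e' followed by U+0301 (combining acute).
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Drop the combining marks, keeping only the base characters.
        return COMBINING_MARKS.matcher(decomposed).replaceAll("");
    }

    public static void main(String[] args) {
        // prints: L'ete ou j'ai du aller a l'ile
        System.out.println(stripAccents("L'\u00e9t\u00e9 o\u00f9 j'ai d\u00fb aller \u00e0 l'\u00eele"));
        // The oe ligature (U+0153) and curly quotes (U+201C, U+201D) have no
        // canonical decomposition, so this string comes back unchanged.
        System.out.println(stripAccents("\u0153uvre \u201cquoted\u201d"));
    }
}
```

The second call shows the limitation mentioned above: handling ligatures and curly quotes would need an explicit mapping table (ASCII folding) on top of the decomposition.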
> Wish : method unaccent
> ----------------------
>
> Key: LANG-285
> URL: https://issues.apache.org/jira/browse/LANG-285
> Project: Commons Lang
> Issue Type: New Feature
> Components: lang.*
> Reporter: Guillaume Coté
> Priority: Minor
> Fix For: 3.0
>
> Attachments: LANG-285-unaccent-using-Collator.patch, LANG-285.patch,
> MapBuilder.java, StringUtilsAccents.patch, unaccent.patch, UnnacentMap.java
>
>
> I would like to add a method that replaces accented characters with unaccented
> ones. For example, given the input String "L'été où j'ai dû aller à l'île
> d'Anticosti commenca tôt", the method would return "L'ete ou j'ai du aller a
> l'ile d'Anticosti commenca tot".
> I suggest calling that method unaccent and adding it to StringUtils.
> If we cannot cover all cases, the first version could cover only ISO-8859-1.
> If you are willing to go forward with that idea, I am willing to contribute a
> patch.