[ 
http://issues.apache.org/jira/browse/LANG-285?page=comments#action_12442372 ] 
            
Aldrin Leal commented on LANG-285:
----------------------------------

A while ago, I did something for ISO-8859-1, but methinks UTF-8 could handle it 
as well.

The comments were written in Brazilian Portuguese and are given here in English. Overall, it works! :)

        /**
         * Rewrites the string, removing accents, in order to optimize searching.
         * 
         * @param str
         *            String to be normalized
         * @return The string without the accents
         */
        public static String normalizar(String str) {
                // normalize0 is a helper that is not shown in this message
                char[] chArr = normalize0(str);

                // Pairs of (accented characters, replacement):
                // uppercase pairs first, then lowercase pairs
                String[] xlatTab = new String[] {
                                "áâãà".toUpperCase(), "a".toUpperCase(),
                                "éêè".toUpperCase(), "e".toUpperCase(),
                                "íîì".toUpperCase(), "i".toUpperCase(),
                                "óôòõ".toUpperCase(), "o".toUpperCase(),
                                "úûù".toUpperCase(), "u".toUpperCase(),
                                "ç".toUpperCase(), "c".toUpperCase(),
                                "áâãà", "a", "éêè", "e", "íîì", "i",
                                "óôòõ", "o", "úûù", "u", "ç", "c" };

                // Replace each character found in an accented set
                // with the unaccented character paired with that set
                for (int k = 0; k < chArr.length; k++) {
                        for (int i0 = 0; i0 < xlatTab.length; i0 += 2) {
                                if (-1 != xlatTab[i0].indexOf(chArr[k])) {
                                        chArr[k] = xlatTab[i0 + 1].charAt(0);
                                }
                        }
                }

                String retval = new String(chArr);

                // log is a logger field that is not shown in this message
                log.debug("data0=" + str + "; data=" + retval);

                return retval;
        }
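
For comparison, here is a minimal sketch of a different technique from the table lookup above: Unicode decomposition via java.text.Normalizer, which assumes Java 6 or later. The class name UnaccentSketch and the method unaccent are hypothetical and are not part of Commons Lang. The idea is to decompose each accented letter into a base letter plus combining marks, then drop the marks, so no per-language table is needed.

```java
import java.text.Normalizer;

public class UnaccentSketch {

    // Hypothetical helper for illustration only; not a Commons Lang API.
    public static String unaccent(String str) {
        if (str == null) {
            return null;
        }
        // NFD splits each accented letter into a base letter
        // followed by one or more combining marks.
        String decomposed = Normalizer.normalize(str, Normalizer.Form.NFD);
        // \p{M} matches the combining marks, which are removed here.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // Unicode escapes spell "L'été où j'ai dû aller à l'île" and keep
        // the source file independent of the compiler's encoding.
        String input = "L'\u00e9t\u00e9 o\u00f9 j'ai d\u00fb aller \u00e0 l'\u00eele";
        System.out.println(unaccent(input));
    }
}
```

Letters that have no canonical decomposition, such as "ø" or "ß", pass through this sketch unchanged, so a small lookup table like the one above may still be wanted for those.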


> Wish : method unaccent
> ----------------------
>
>                 Key: LANG-285
>                 URL: http://issues.apache.org/jira/browse/LANG-285
>             Project: Commons Lang
>          Issue Type: New Feature
>            Reporter: Guillaume Coté
>            Priority: Minor
>
> I would like to add a method that replaces accented characters with their 
> unaccented equivalents.  For example, given the input String "L'été où j'ai dû 
> aller à l'île d'Anticosti commenca tôt", the method would return "L'ete ou 
> j'ai du aller a l'ile d'Anticosti commenca tot".
> I suggest calling that method unaccent and adding it to StringUtils.
> If we cannot cover all cases, the first version could cover only ISO-8859-1.
> If you are willing to go forward with that idea, I am willing to contribute a 
> patch.



