[ 
https://issues.apache.org/jira/browse/LANG-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary D. Gregory closed LANG-607.
--------------------------------
    Fix Version/s: 3.14.0
                       (was: Patch Needed)
       Resolution: Fixed

> StringUtils methods do not handle Unicode 2.0+ supplementary characters 
> correctly.
> ----------------------------------------------------------------------------------
>
>                 Key: LANG-607
>                 URL: https://issues.apache.org/jira/browse/LANG-607
>             Project: Commons Lang
>          Issue Type: Bug
>          Components: lang.*
>    Affects Versions: 2.5
>         Environment: java version "1.6.0_16"
> Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
> Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)
> Microsoft Windows [Version 6.0.6002]
> Apache Maven 2.2.1 (r801777; 2009-08-06 12:16:01-0700)
> Java version: 1.6.0_16
> Java home: C:\Program Files\Java\jdk1.6.0_16\jre
> Default locale: en_US, platform encoding: Cp1252
> OS name: "windows vista" version: "6.0" arch: "amd64" Family: "windows"
>            Reporter: Gary Gregory
>            Assignee: Gary D. Gregory
>            Priority: Minor
>             Fix For: 3.14.0
>
>         Attachments: LANG-607.diff
>
>
> The StringUtils.containsAny methods incorrectly match Unicode 2.0+ 
> supplementary characters.
> For example, define test fixtures for the Unicode characters U+20000 and 
> U+20001, written in Java source as "\uD840\uDC00" and "\uD840\uDC01":
>       private static final String CharU20000 = "\uD840\uDC00";
>       private static final String CharU20001 = "\uD840\uDC01";
> The JRE handles Unicode supplementary characters correctly in this call:
>       assertEquals(-1, CharU20000.indexOf(CharU20001));
> But these assertions fail, because containsAny reports a match:
>       assertEquals(false, StringUtils.containsAny(CharU20000, CharU20001));
>       assertEquals(false, StringUtils.containsAny(CharU20001, CharU20000));
> This is fine:
>       assertEquals(true, StringUtils.contains(CharU20000 + CharU20001, CharU20000));
>       assertEquals(true, StringUtils.contains(CharU20000 + CharU20001, CharU20001));
>       assertEquals(true, StringUtils.contains(CharU20000, CharU20000));
>       assertEquals(false, StringUtils.contains(CharU20000, CharU20001));
> because the method calls the JRE to perform the match.
> More than you want to know:
> - http://java.sun.com/developer/technicalArticles/Intl/Supplementary/
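
The false positive described above can be reproduced without Commons Lang. U+20000 and U+20001 are encoded as surrogate pairs that share the high surrogate 0xD840, so any matcher that compares one `char` at a time will see a common element. The sketch below is illustrative only: the class and both method names are hypothetical, the char-based method is an assumption about how such a bug can arise rather than the actual 2.5 source, and the code-point-based method is one possible approach, not necessarily the fix applied in LANG-607.diff.

```java
final class SupplementaryContainsAny {

    // U+20000 and U+20001: both start with the high surrogate 0xD840.
    static final String CHAR_U20000 = "\uD840\uDC00";
    static final String CHAR_U20001 = "\uD840\uDC01";

    /** Hypothetical char-by-char matcher; it matches on the shared high surrogate. */
    static boolean containsAnyByChar(String str, String searchChars) {
        if (str == null || searchChars == null) {
            return false;
        }
        for (int i = 0; i < str.length(); i++) {
            // charAt returns a single UTF-16 code unit, possibly half of a pair.
            if (searchChars.indexOf(str.charAt(i)) >= 0) {
                return true;
            }
        }
        return false;
    }

    /** Hypothetical code-point-aware matcher; it compares whole code points. */
    static boolean containsAnyByCodePoint(String str, String searchChars) {
        if (str == null || searchChars == null) {
            return false;
        }
        for (int i = 0; i < str.length(); ) {
            int cp = str.codePointAt(i);
            // String.indexOf(int) searches for a full surrogate pair
            // when given a supplementary code point.
            if (searchChars.indexOf(cp) >= 0) {
                return true;
            }
            i += Character.charCount(cp); // advance by 1 or 2 chars
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsAnyByChar(CHAR_U20000, CHAR_U20001));      // true (false positive)
        System.out.println(containsAnyByCodePoint(CHAR_U20000, CHAR_U20001)); // false
        System.out.println(containsAnyByCodePoint(CHAR_U20000 + CHAR_U20001, CHAR_U20001)); // true
    }
}
```

This sketch does not treat unpaired surrogates specially: `codePointAt` returns a lone surrogate as-is, so malformed input can still match on a single code unit.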



