[ https://issues.apache.org/jira/browse/LANG-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502022#comment-13502022 ]

Kazuki Hamasaki commented on LANG-857:
--------------------------------------

I created additional test cases.
The tests for {{escapeJava}} and {{escapeEcmaScript}} currently fail because of [LANG-858].

{code:java}
@Test
public void testEscapeSurrogatePairs() throws Exception {
    assertEquals("\uD83D\uDE30", StringEscapeUtils.escapeCsv("\uD83D\uDE30"));
    // Examples from https://en.wikipedia.org/wiki/UTF-16
    assertEquals("\uD800\uDC00", StringEscapeUtils.escapeCsv("\uD800\uDC00"));
    assertEquals("\uD834\uDD1E", StringEscapeUtils.escapeCsv("\uD834\uDD1E"));
    assertEquals("\uDBFF\uDFFD", StringEscapeUtils.escapeCsv("\uDBFF\uDFFD"));
    assertEquals("\uDBFF\uDFFD", StringEscapeUtils.escapeHtml3("\uDBFF\uDFFD"));
    assertEquals("\uDBFF\uDFFD", StringEscapeUtils.escapeHtml4("\uDBFF\uDFFD"));
    assertEquals("\\uDBFF\\uDFFD", StringEscapeUtils.escapeJava("\uDBFF\uDFFD"));       // fails at this time
    assertEquals("\\uDBFF\\uDFFD", StringEscapeUtils.escapeEcmaScript("\uDBFF\uDFFD")); // fails at this time
    assertEquals("\uDBFF\uDFFD", StringEscapeUtils.escapeXml("\uDBFF\uDFFD"));
}

@Test
public void testUnEscapeSurrogatePairs() throws Exception {
    assertEquals("\uD83D\uDE30", StringEscapeUtils.unescapeCsv("\uD83D\uDE30"));
    // Examples from https://en.wikipedia.org/wiki/UTF-16
    assertEquals("\uD800\uDC00", StringEscapeUtils.unescapeCsv("\uD800\uDC00"));
    assertEquals("\uD834\uDD1E", StringEscapeUtils.unescapeCsv("\uD834\uDD1E"));
    assertEquals("\uDBFF\uDFFD", StringEscapeUtils.unescapeCsv("\uDBFF\uDFFD"));
    assertEquals("\uDBFF\uDFFD", StringEscapeUtils.unescapeHtml3("\uDBFF\uDFFD"));
    assertEquals("\uDBFF\uDFFD", StringEscapeUtils.unescapeHtml4("\uDBFF\uDFFD"));
    assertEquals("\uDBFF\uDFFD", StringEscapeUtils.unescapeJava("\\uDBFF\\uDFFD"));
    assertEquals("\uDBFF\uDFFD", StringEscapeUtils.unescapeEcmaScript("\\uDBFF\\uDFFD"));
    assertEquals("\uDBFF\uDFFD", StringEscapeUtils.unescapeXml("\uDBFF\uDFFD"));
}
{code}
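
For context on why surrogate pairs are easy to get wrong in a translation loop, here is a minimal sketch of walking a CharSequence by code point. It is not the attached patch and not the Commons Lang implementation; the class {{SurrogateLoopSketch}} and the method {{copyByCodePoint}} are hypothetical names used only for illustration. The point it tries to show is that the loop bound is the length in chars and the index advances by the number of chars each code point occupies, so {{Character.codePointAt}} is never asked for an index past the end of the input.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

public class SurrogateLoopSketch {

    // Hypothetical helper (an assumption, not part of Commons Lang):
    // copies input to out one code point at a time.
    public static void copyByCodePoint(CharSequence input, Writer out) throws IOException {
        int pos = 0;
        final int len = input.length(); // length in chars (UTF-16 code units), not code points
        while (pos < len) {
            int codePoint = Character.codePointAt(input, pos);
            char[] chars = Character.toChars(codePoint);
            out.write(chars);
            pos += chars.length; // 1 for a BMP character, 2 for a surrogate pair
        }
    }

    public static void main(String[] args) throws IOException {
        String s = "a\uD83D\uDE30b"; // 'a', one surrogate pair, 'b'
        StringWriter w = new StringWriter();
        copyByCodePoint(s, w);
        System.out.println(w.toString().equals(s)); // prints "true"
        // prints "4 chars, 3 code points"
        System.out.println(s.length() + " chars, " + s.codePointCount(0, s.length()) + " code points");
    }
}
```

Mixing the two units (for example, counting in code points but indexing in chars, or advancing the index twice for one pair) is one way such a loop can either stop early or read past the end of the input.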
                
> StringIndexOutOfBoundsException in CharSequenceTranslator
> ---------------------------------------------------------
>
>                 Key: LANG-857
>                 URL: https://issues.apache.org/jira/browse/LANG-857
>             Project: Commons Lang
>          Issue Type: Bug
>          Components: lang.text.translate.*
>    Affects Versions: 3.x
>            Reporter: Kazuki Hamasaki
>            Priority: Minor
>              Labels: patch
>             Fix For: 3.2
>
>         Attachments: CharSequenceTranslator_translate.patch
>
>
> I found that CharSequenceTranslator handles surrogate pairs incorrectly.
> The following simple test case reproduces the problem.
> \uD83D\uDE30 is a surrogate pair.
> {code:java}
> @Test
> public void testEscapeSurrogatePairs() throws Exception {
>     assertEquals("\uD83D\uDE30", StringEscapeUtils.escapeCsv("\uD83D\uDE30"));
> }
> {code}
> It throws the exception shown below.
> {code}
> java.lang.StringIndexOutOfBoundsException: String index out of range: 2
>       at java.lang.String.charAt(String.java:658)
>       at java.lang.Character.codePointAt(Character.java:4668)
>       at org.apache.commons.lang3.text.translate.CharSequenceTranslator.translate(CharSequenceTranslator.java:95)
>       at org.apache.commons.lang3.text.translate.CharSequenceTranslator.translate(CharSequenceTranslator.java:59)
>       at org.apache.commons.lang3.StringEscapeUtils.escapeCsv(StringEscapeUtils.java:556)
> {code}
> Patch attached. The affected method is:
> # public final void translate(CharSequence input, Writer out) throws IOException

