[ 
https://issues.apache.org/jira/browse/LANG-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Benson resolved LANG-708.
------------------------------

    Resolution: Duplicate

With LANG-720 fixed, lang3 trunk no longer cuts off the end of the string.

> StringEscapeUtils.escapeEcmaScript from lang3 cuts off long unicode string
> --------------------------------------------------------------------------
>
>                 Key: LANG-708
>                 URL: https://issues.apache.org/jira/browse/LANG-708
>             Project: Commons Lang
>          Issue Type: Bug
>          Components: lang.text.translate.*
>            Reporter: anton
>             Fix For: 3.x
>
>         Attachments: Test.java, input.txt
>
>
> Hello, I have a really big JSON string (generated from a database) with Unicode 
> characters, and I need to pass it through 
> StringEscapeUtils.escapeEcmaScript(value). The string is generated, and in most 
> cases this works fine, but I have hit a specific string (attached) that is not 
> converted correctly: a few characters (about 10) at the end of the string are 
> cut off (and those characters are not even Unicode characters).
> The original string ends with:
>  "geonameId":6544329,"valueCode":""}]
> and the produced string ends with:
>  \"geonameId\":6544329,\"value
> So the Code":""}] part is missing, which makes it impossible to parse the 
> result as JSON on the client side.
> I tried to debug the StringEscapeUtils.escapeEcmaScript source code a bit, and 
> it seems that the problem is somewhere around here:
> CharSequenceTranslator.translate(...){
> ...
>         int sz = Character.codePointCount(input, 0, input.length());
>         for (int i = 0; i < sz; i++) {
>             // consumed is the number of codepoints consumed
>             int consumed = translate(input, i, out);
>             if(consumed == 0) { 
>                 out.write( Character.toChars( Character.codePointAt(input, i) ) );
>             }
> ...
> }
> If I set a conditional breakpoint to stop in the loop when i==(sz-5), I can see 
> that the last characters of the "valueCode" literal are being appended to the 
> "out" stream, but the loop condition fails too early to reach the end of the 
> original input String.
> So it seems that, for the provided string, either the sz value is calculated 
> incorrectly or the loop adjusted the counter wrongly at some point.
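
One plausible reading of the quoted loop is that sz is a code point count while i is used as a char (UTF-16 unit) index into the input. If so, every supplementary character (a surrogate pair) makes sz smaller than input.length(), and the loop stops that many chars short. The sketch below is only an illustration of that mismatch under this assumption; it is not the Commons Lang source and not the LANG-720 patch, and the class and method names are hypothetical.

```java
// Hypothetical sketch: names are illustrative and are not Commons Lang API.
class CodePointLoopSketch {

    // Assumed shape of the problem: the bound is a code point count,
    // but i advances as a char (UTF-16 unit) index.
    static String copyWithCodePointBound(CharSequence input) {
        StringBuilder out = new StringBuilder();
        int sz = Character.codePointCount(input, 0, input.length());
        int i = 0;
        while (i < sz) {
            int cp = Character.codePointAt(input, i);
            out.appendCodePoint(cp);
            i += Character.charCount(cp); // 2 for a surrogate pair, else 1
        }
        return out.toString();
    }

    // Same loop, but bounded by the char length, so index and bound
    // use the same unit.
    static String copyWithCharBound(CharSequence input) {
        StringBuilder out = new StringBuilder();
        int len = input.length();
        int i = 0;
        while (i < len) {
            int cp = Character.codePointAt(input, i);
            out.appendCodePoint(cp);
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (one code point, two chars), then "bcd":
        // 5 code points, 6 chars.
        String s = "a\uD83D\uDE00bcd";
        System.out.println(copyWithCodePointBound(s).length()); // prints 5
        System.out.println(copyWithCharBound(s).equals(s));     // prints true
    }
}
```

With one surrogate pair in the input, the first method drops the final char; under the same assumption, an input with about ten supplementary characters would lose about ten trailing chars, which would match the symptom described above.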

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
