[ 
https://issues.apache.org/jira/browse/PROTON-834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363325#comment-14363325
 ] 

ASF GitHub Bot commented on PROTON-834:
---------------------------------------

GitHub user dnwe opened a pull request:

    https://github.com/apache/qpid-proton/pull/13

    PROTON-834: further UTF-8 encoder fixes

    After commit c65e897 there were still some issues with strings
    containing a code point above 0xDBFF, which was being incorrectly
    treated as part of a surrogate pair in the calculateUTF8Length method.
    
    Fixed this up and added some more test coverage.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dnwe/qpid-proton fix-proton-834

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/qpid-proton/pull/13.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13
    
----
commit dc52650e7de53ef5fe294b9066620b4698c30a94
Author: Dominic Evans <dominic.ev...@uk.ibm.com>
Date:   2015-03-16T12:18:20Z

    PROTON-834: further UTF-8 encoder fixes
    
    After commit c65e897 there were still some issues with strings
    containing a code point above 0xDBFF, which was being incorrectly
    treated as part of a surrogate pair in the calculateUTF8Length method.
    
    Fixed this up and added some more test coverage.

----


> proton-j: UTF-8 encoder reporting some three byte characters as invalid 
> surrogates
> ----------------------------------------------------------------------------------
>
>                 Key: PROTON-834
>                 URL: https://issues.apache.org/jira/browse/PROTON-834
>             Project: Qpid Proton
>          Issue Type: Bug
>          Components: proton-j
>    Affects Versions: 0.8
>            Reporter: Dominic Evans
>            Assignee: Dominic Evans
>
> Following on from the fixes made under PROTON-576, some characters with 
> valid 3-byte UTF-8 encodings were being incorrectly reported as invalid 
> surrogates.
> e.g.,
> !!!
> (╯°□°)╯︵ ┻━┻
> etc.
> This is an issue when streaming variable content such as Twitter messages, 
> which often contain such characters.
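
To illustrate the distinction the issue turns on, here is a minimal Java sketch of a UTF-8 length calculation that treats only UTF-16 code units in the range 0xD800..0xDFFF as surrogates, so that characters such as U+FE35 are counted as ordinary 3-byte encodings. This is not the proton-j implementation or the patch from the pull request; the class name Utf8LengthSketch and the method body are assumptions made purely for illustration.

```java
final class Utf8LengthSketch {

    // Hypothetical helper: returns the number of bytes needed to encode s as UTF-8.
    // Throws IllegalArgumentException if s contains an unpaired surrogate.
    static int calculateUtf8Length(CharSequence s) {
        int length = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                length += 1;                      // U+0000..U+007F
            } else if (c < 0x800) {
                length += 2;                      // U+0080..U+07FF
            } else if (c < 0xD800 || c > 0xDFFF) {
                length += 3;                      // rest of the BMP, including U+E000..U+FFFF
            } else if (c <= 0xDBFF
                    && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                length += 4;                      // high + low surrogate = one supplementary code point
                i++;                              // skip the low surrogate just consumed
            } else {
                throw new IllegalArgumentException("Unpaired surrogate at index " + i);
            }
        }
        return length;
    }

    public static void main(String[] args) {
        // U+FE35 lies above the surrogate range and needs three bytes, not a surrogate check.
        System.out.println(calculateUtf8Length("\uFE35"));
        // A real surrogate pair (U+1F600) needs four bytes.
        System.out.println(calculateUtf8Length("\uD83D\uDE00"));
    }
}
```

The key design choice in this sketch is testing both ends of the surrogate range: a check that only compares against the lower bound would misclassify every character from U+E000 to U+FFFF.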



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
