XOPAwareStAXOMBuilder / MTOMStAXSOAPModelBuilder should use UTF-8 to decode 
cid: URIs
-------------------------------------------------------------------------------------

                 Key: WSCOMMONS-429
                 URL: https://issues.apache.org/jira/browse/WSCOMMONS-429
             Project: WS-Commons
          Issue Type: Bug
            Reporter: Andreas Veithen
            Assignee: Andreas Veithen
            Priority: Minor


XOPAwareStAXOMBuilder and MTOMStAXSOAPModelBuilder decode cid: URIs using the 
document's charset encoding (see the call to URLDecoder.decode in 
ElementHelper#getContentID). However, as explained in [1] (referenced by the 
definition of the anyURI type), %HH escapes should always be decoded as UTF-8.

Since non-ASCII characters are not allowed in content IDs, this is only an 
issue if the document uses a charset encoding that is not a superset of ASCII 
(e.g. UTF-16). In addition, most of the characters that require %HH encoding 
are either not allowed or unusual in content IDs. This is therefore a minor 
issue.
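
The difference can be illustrated with a small standalone sketch. It is not 
the Axiom code: the class CidDecodeSketch and the helper contentIdFromCidUri 
are hypothetical names introduced here, and only java.net.URLDecoder is a real 
API. It assumes a cid: URI whose content ID contains an escaped "@".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class CidDecodeSketch {
    // Thin wrapper so callers do not have to handle the checked exception.
    static String decode(String escaped, String charsetName) {
        try {
            return URLDecoder.decode(escaped, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: strip the "cid:" scheme and always decode
    // %HH escapes as UTF-8, regardless of the document charset.
    static String contentIdFromCidUri(String uri) {
        String escaped = uri.startsWith("cid:") ? uri.substring(4) : uri;
        return decode(escaped, "UTF-8");
    }

    public static void main(String[] args) {
        // Decoding as UTF-8 yields the expected content ID.
        System.out.println(contentIdFromCidUri("cid:part1%40example.org"));
        // prints "part1@example.org"

        // With UTF-16, the bytes 0x40 0x41 are read as one 16-bit code unit,
        // so the result is not the ASCII string "@A".
        System.out.println(decode("%40%41", "UTF-16").equals("@A"));
    }
}
```

Note that URLDecoder also converts "+" to a space, so it may not be an ideal 
fit for cid: URIs in any case; the sketch uses it only because it is the call 
mentioned above.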

Note that the unit test 
MTOMStAXSOAPModelBuilderTest#testUTF16MTOMMessage specifically tests this 
incorrect behavior. The test should therefore be corrected or removed entirely.

[1] http://www.w3.org/TR/2001/WD-charmod-20010126/#sec-URIs

