[
https://issues.apache.org/jira/browse/XERCESC-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16322507#comment-16322507
]
Andreas Krantz commented on XERCESC-2130:
-----------------------------------------
Some example data:
{{Invalid
u"\xD800"
u"\xDC00"
u"\xD800 \xDC00"
u"\xD800\xD800\xDC00"
u"\xD800\xDC00\xDC00"
u" \xD800 \xDC00 "
u" \xD800\xD800\xDC00 "
u" \xD800\xDC00\xDC00 "
u"\xFFFE"
u"\xFFFF"
u"\x0001"
u"\x0002"
u"\x0003"
u"\x0004"
u"\x0005"
u"\x0006"
u"\x0007"
u"\x0008"
u"\x000B"
u"\x000C"
u"\x000E"
u"\x000F"
u"\x0010"
u"\x0011"
u"\x0012"
u"\x0013"
u"\x0014"
u"\x0015"
u"\x0016"
u"\x0017"
u"\x0018"
u"\x0019"
u"\x001A"
u"\x001B"
u"\x001C"
u"\x001D"
u"\x001E"
u"\x001F"
Valid
u"\xD800\xDC00"
u""
u"\U0010FFFF"
u"\U00100000"
u"\x0009"
u"\x000A"
u"\x000D"
u"\x0020"}}
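The classification above can be reproduced with a small standalone check. This is a minimal sketch, assuming the XML 1.0 Char production and treating the input as UTF-16 code units; `isXml10Char`, `isValidXml10Utf16` and `runExamples` are hypothetical names for illustration and are not part of Xerces-C++.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// XML 1.0 Char production, applied to a decoded code point.
bool isXml10Char(char32_t cp) {
    return cp == 0x9 || cp == 0xA || cp == 0xD ||
           (cp >= 0x20 && cp <= 0xD7FF) ||
           (cp >= 0xE000 && cp <= 0xFFFD) ||
           (cp >= 0x10000 && cp <= 0x10FFFF);
}

// Hypothetical helper: walks UTF-16 code units and accepts a high
// surrogate only when it is directly followed by a low surrogate.
bool isValidXml10Utf16(const std::u16string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t u = s[i];
        if (u >= 0xD800 && u <= 0xDBFF) {
            // High surrogate: the next unit must be a low surrogate.
            if (i + 1 >= s.size()) return false;
            const char16_t lo = s[i + 1];
            if (lo < 0xDC00 || lo > 0xDFFF) return false;
            ++i;  // consume the low surrogate; the pair is in #x10000-#x10FFFF
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            return false;  // low surrogate without a preceding high surrogate
        } else if (!isXml10Char(u)) {
            return false;  // control characters, #xFFFE, #xFFFF
        }
    }
    return true;
}

// A few entries from the example data, checked against the sketch.
void runExamples() {
    const std::u16string pair{char16_t(0xD800), char16_t(0xDC00)};
    const std::u16string loneHigh{char16_t(0xD800)};
    const std::u16string doubledHigh{char16_t(0xD800), char16_t(0xD800), char16_t(0xDC00)};
    assert(isValidXml10Utf16(pair));
    assert(!isValidXml10Utf16(loneHigh));
    assert(!isValidXml10Utf16(doubledHigh));
    assert(isValidXml10Utf16(std::u16string{}));
}
```

The sketch rejects unpaired or misordered surrogates while accepting every well-formed pair, which matches the Invalid and Valid groups listed above.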
> UTF16 Surrogate values 0xD800-0xDFFF can no longer be written with xerces
> 3.2.0
> -------------------------------------------------------------------------------
>
> Key: XERCESC-2130
> URL: https://issues.apache.org/jira/browse/XERCESC-2130
> Project: Xerces-C++
> Issue Type: Bug
> Components: DOM
> Affects Versions: 3.2.0
> Reporter: Andreas Krantz
> Priority: Critical
> Attachments: patch.cpp, reproduce.cpp
>
>
> The fix for XERCESC-1854 introduced the method
> {{DOMLSSerializerImpl::ensureValidString}},
> whose validation is incorrect.
> The method validates XMLCh values, which represent UTF-16 code units.
> [Valid Characters|https://www.w3.org/TR/REC-xml/#NT-Char] #x9 | #xA | #xD |
> [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
> are the valid UTF32 characters.
> The UTF16 surrogate range xD800 - xDFFF is used to represent
> [#x10000-#x10FFFF] and should not be treated as invalid.
> *The reader treats this correctly and does not complain, which leads to
> asymmetric behavior:*
> Reading DOM => OK
> Save back DOM => Exception
> I tried to attach an example to show the behavior.
> The methods used, such as
> {{bool XMLChar1_1::isXMLChar(const XMLCh toCheck, const XMLCh toCheck2)}},
> already have a second optional parameter to check surrogate values.
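The relationship between the surrogate range and [#x10000-#x10FFFF] described in the report can be checked with the standard UTF-16 decoding arithmetic. The following is a minimal sketch; `isHighSurrogate`, `isLowSurrogate` and `combineSurrogates` are hypothetical helper names, not Xerces-C++ functions.

```cpp
// High surrogates occupy xD800-xDBFF, low surrogates xDC00-xDFFF.
constexpr bool isHighSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
constexpr bool isLowSurrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Combines a well-formed surrogate pair into a single code point.
// Callers are assumed to have checked both halves beforehand.
constexpr char32_t combineSurrogates(char16_t hi, char16_t lo) {
    return 0x10000u
         + ((static_cast<char32_t>(hi) - 0xD800u) << 10)
         + (static_cast<char32_t>(lo) - 0xDC00u);
}

// The smallest and largest pairs map to the ends of the supplementary range.
static_assert(combineSurrogates(0xD800, 0xDC00) == 0x10000, "lowest pair");
static_assert(combineSurrogates(0xDBFF, 0xDFFF) == 0x10FFFF, "highest pair");
```

Because every well-formed pair lands inside [#x10000-#x10FFFF], a validator only needs to confirm that a high surrogate is immediately followed by a low surrogate; the decoded value is then always a valid character.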