Andreas Krantz created XERCESC-2130:
---------------------------------------

             Summary: UTF16 surrogate values 0xD800-0xDFFF can no longer be written with Xerces 3.2.0
                 Key: XERCESC-2130
                 URL: https://issues.apache.org/jira/browse/XERCESC-2130
             Project: Xerces-C++
          Issue Type: Bug
          Components: DOM
    Affects Versions: 3.2.0
            Reporter: Andreas Krantz
            Priority: Critical
         Attachments: reproduce.cpp

The fix for XERCESC-1854 introduced the method {{DOMLSSerializerImpl::ensureValidString}}, which contains a validation error. The method validates XMLCh values, which represent UTF-16 code units.

[Valid Characters|https://www.w3.org/TR/REC-xml/#NT-Char]
#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
are the valid UTF-32 characters. The UTF-16 surrogate range 0xD800-0xDFFF is used to represent [#x10000-#x10FFFF] and should not be treated as invalid.

*The reader treats this correctly and does not complain, which leads to asymmetric behavior:*
Reading DOM => OK
Save back DOM => Exception

I tried to attach an example to show the behavior.

The methods it uses, such as {{bool XMLChar1_1::isXMLChar(const XMLCh toCheck, const XMLCh toCheck2)}}, already have a second, optional parameter for checking surrogate values.
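
To make the expected behavior concrete, here is a minimal, standalone sketch of surrogate-aware validation over UTF-16 code units. It does not use Xerces-C++ and is not the code of {{ensureValidString}}; the helper names (isXmlCodePoint, firstInvalidXmlUnit) and the use of char16_t in place of XMLCh are assumptions made for illustration. The idea is that a high surrogate followed by a low surrogate is accepted as one supplementary character, while an unpaired surrogate is still rejected.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration only; not part of Xerces-C++.

// True if cp matches the XML 1.0 Char production:
// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
inline bool isXmlCodePoint(char32_t cp) {
    return cp == 0x9 || cp == 0xA || cp == 0xD ||
           (cp >= 0x20 && cp <= 0xD7FF) ||
           (cp >= 0xE000 && cp <= 0xFFFD) ||
           (cp >= 0x10000 && cp <= 0x10FFFF);
}

inline bool isHighSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool isLowSurrogate(char16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

// Returns the index of the first invalid UTF-16 code unit,
// or std::u16string::npos if the whole string is valid XML character data.
inline std::size_t firstInvalidXmlUnit(const std::u16string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t u = s[i];
        if (isHighSurrogate(u)) {
            // A high surrogate is only valid when a low surrogate follows it.
            if (i + 1 >= s.size() || !isLowSurrogate(s[i + 1])) {
                return i;
            }
            // Combine the pair into one code point in [#x10000-#x10FFFF].
            const char32_t cp = 0x10000
                + ((static_cast<char32_t>(u) - 0xD800) << 10)
                + (static_cast<char32_t>(s[i + 1]) - 0xDC00);
            if (!isXmlCodePoint(cp)) {
                return i;
            }
            ++i;  // Skip the low surrogate that was just consumed.
        } else if (isLowSurrogate(u) || !isXmlCodePoint(u)) {
            // Unpaired low surrogate, or a unit outside the Char production.
            return i;
        }
    }
    return std::u16string::npos;
}
```

With a check of this shape, a string containing the pair 0xD83D 0xDE00 passes, whereas a lone 0xD83D or a control character such as 0x1 is reported at its index. A serializer that checks each code unit on its own against the Char production would reject the valid pair as well, which matches the exception described above.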