[jira] [Commented] (XERCESC-2239) When XMLUni::fgDOMWRTSplitCdataSections is true (the default), invalid XML characters are allowed by DOMWriter
[ https://issues.apache.org/jira/browse/XERCESC-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17613505#comment-17613505 ]

Scott Cantor commented on XERCESC-2239:
---

I suspect there's an intended distinction between "illegal" characters and "unrepresentable" ones. The feature apparently controls how unrepresentable characters are handled, and explicitly changes the behavior such that they're output numerically and don't cause an error.

I don't know the specs well enough to even consider making a change to this code, or even if a change is in fact the right thing to do. I'm pretty sure the current behavior is intentional.

> When XMLUni::fgDOMWRTSplitCdataSections is true (the default), invalid XML characters are allowed by DOMWriter
> --
>
> Key: XERCESC-2239
> URL: https://issues.apache.org/jira/browse/XERCESC-2239
> Project: Xerces-C++
> Issue Type: Bug
> Components: DOM
> Affects Versions: 3.2.0
> Environment: Operating System: All
> Platform: All
> Reporter: David Leffingwell
> Priority: Major
> Fix For: 3.2.4
>
>
> // Create a Document with a CDATA section that contains an invalid XML character (e.g. 0x1b).
> // This should fail when serializing the Document, but it does not when XMLUni::fgDOMWRTSplitCdataSections is true.
> struct XercesDeleter
> {
>     template <typename T>
>     void operator()(T* data) const
>     {
>         if (data) { data->release(); };
>     }
> };
> typedef std::unique_ptr<DOMLSSerializer, XercesDeleter> DOMWriterPtr;
> typedef std::unique_ptr<DOMDocument, XercesDeleter> DOMDocumentPtr;
> XMLPlatformUtils::Initialize();
> DOMImplementation* impl = DOMImplementationRegistry::getDOMImplementation(XMLString::transcode("LS"));
> // Create DOM with a CDATA section
> DOMDocumentPtr document(impl->createDocument());
> DOMElement* element = document->createElementNS(XMLString::transcode("http://schemas.openxmlformats.org/wordprocessingml/2006/main"), XMLString::transcode("w:t"));
> document->appendChild(element);
> DOMCDATASection* codesection = document->createCDATASection(XercesString("c = '\x1b';")); // 0x1B is not a valid XML 1.0 character
> element->appendChild(codesection);
> DOMWriterPtr writer(impl->createLSSerializer());
> writer->writeToString(document.get());

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---
To unsubscribe, e-mail: c-dev-unsubscr...@xerces.apache.org
For additional commands, e-mail: c-dev-h...@xerces.apache.org
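The distinction Scott suspects between "illegal" and "unrepresentable" characters can be sketched roughly as follows. This is an illustration only, not code from Xerces-C++: `CharClass`, `isXml10Char`, `classifyForAscii`, and `checkExamples` are hypothetical names, and a US-ASCII output encoding is assumed to keep the example small. The ranges are those of the XML 1.0 `Char` production.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification for illustration; not part of Xerces-C++.
enum class CharClass { Representable, Unrepresentable, Illegal };

// XML 1.0 Char production:
//   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
constexpr bool isXml10Char(std::uint32_t cp) {
    return cp == 0x9 || cp == 0xA || cp == 0xD ||
           (cp >= 0x20 && cp <= 0xD7FF) ||
           (cp >= 0xE000 && cp <= 0xFFFD) ||
           (cp >= 0x10000 && cp <= 0x10FFFF);
}

// Assumption: the output encoding is US-ASCII, so any legal character
// above 0x7F cannot be written directly in that encoding.
constexpr CharClass classifyForAscii(std::uint32_t cp) {
    return !isXml10Char(cp) ? CharClass::Illegal
         : (cp <= 0x7F ? CharClass::Representable
                       : CharClass::Unrepresentable);
}

// Small check of the two cases discussed in the thread.
inline void checkExamples() {
    // U+00E9 is a legal XML character that ASCII cannot encode directly.
    assert(classifyForAscii(0xE9) == CharClass::Unrepresentable);
    // U+001B, the character from the issue, is outside the Char production.
    assert(classifyForAscii(0x1B) == CharClass::Illegal);
}
```

Under this reading, U+00E9 could still be written as a numeric character reference, whereas U+001B has no legal spelling in XML 1.0, because character references must also match `Char`.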
[jira] [Commented] (XERCESC-2239) When XMLUni::fgDOMWRTSplitCdataSections is true (the default), invalid XML characters are allowed by DOMWriter
[ https://issues.apache.org/jira/browse/XERCESC-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17613048#comment-17613048 ]

Scott Cantor commented on XERCESC-2239:
---

I'll at least take a look but I doubt I would have the confidence to make a fix to this.
[jira] [Commented] (XERCESC-2239) When XMLUni::fgDOMWRTSplitCdataSections is true (the default) invalid XML characters are allowed by DOMWriter
[ https://issues.apache.org/jira/browse/XERCESC-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601489#comment-17601489 ]

David Leffingwell commented on XERCESC-2239:
---

It looks like ensureValidString() (or something equivalent) is not being done for DOMNode::CDATA_SECTION_NODE.

https://github.com/apache/xerces-c/blob/fc1f7d3a41328e978d7f517193367af8966a40f8/src/xercesc/dom/impl/DOMLSSerializerImpl.cpp
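As a rough sketch of the kind of check David describes as missing for CDATA sections, the function below scans UTF-16 text for the first character that is not a valid XML 1.0 character. It is not the Xerces-C++ implementation of ensureValidString(), whose exact behavior is not shown in this thread; `isXml10CodePoint`, `findFirstInvalidXml10Char`, and `checkIssueExample` are hypothetical names, and `std::u16string` stands in for an `XMLCh` buffer on the assumption that `XMLCh` is a 16-bit UTF-16 code unit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// XML 1.0 Char production, applied to a decoded code point.
inline bool isXml10CodePoint(std::uint32_t cp) {
    return cp == 0x9 || cp == 0xA || cp == 0xD ||
           (cp >= 0x20 && cp <= 0xD7FF) ||
           (cp >= 0xE000 && cp <= 0xFFFD) ||
           (cp >= 0x10000 && cp <= 0x10FFFF);
}

// Returns the index of the first UTF-16 code unit that begins an invalid
// character, or std::u16string::npos if the whole string is valid.
inline std::size_t findFirstInvalidXml10Char(const std::u16string& text) {
    for (std::size_t i = 0; i < text.size(); ++i) {
        const std::size_t start = i;
        std::uint32_t cp = text[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // A high surrogate must be followed by a low surrogate.
            if (i + 1 >= text.size()) return start;
            const std::uint32_t lo = text[i + 1];
            if (lo < 0xDC00 || lo > 0xDFFF) return start;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            ++i; // consume the low surrogate as well
        }
        // A lone low surrogate falls through here and is rejected.
        if (!isXml10CodePoint(cp)) return start;
    }
    return std::u16string::npos;
}

// Mirrors the CDATA content from the issue: c = '<0x1B>';
inline void checkIssueExample() {
    std::u16string cdata = u"c = '";
    cdata.push_back(0x1B);
    cdata += u"';";
    assert(findFirstInvalidXml10Char(cdata) == 5);
}
```

A serializer that ran a check like this on CDATA content could report an error at the returned index instead of writing the character out, which is the behavior the reporter expected.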