[jira] [Commented] (XERCESC-2239) When XMLUni::fgDOMWRTSplitCdataSections is true (the default), invalid XML characters are allowed by DOMWriter

2022-10-06 Thread Scott Cantor (Jira)


[ 
https://issues.apache.org/jira/browse/XERCESC-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17613505#comment-17613505
 ] 

Scott Cantor commented on XERCESC-2239:
---

I suspect there's an intended distinction between "illegal" characters and 
"unrepresentable" ones. The feature apparently controls how unrepresentable 
characters are handled, and explicitly changes the behavior so that they're 
output as numeric character references and don't cause an error.

I don't know the specs well enough to even consider making a change to this 
code, or even if a change is in fact the right thing to do. I'm pretty sure the 
current behavior is intentional.
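
To make the distinction above concrete, here is a minimal C++ sketch. It is not Xerces-C++ code. isXml10Char follows the Char production of XML 1.0; fitsUsAscii, CharClass, classify and checkExamples are hypothetical names, and US-ASCII is assumed as the output encoding purely for illustration. An unrepresentable character can still be written as a numeric character reference in element content, whereas in XML 1.0 an illegal one such as 0x1B is not well-formed even as a reference.

```cpp
#include <cassert>

// XML 1.0 "Char" production:
//   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
inline bool isXml10Char(char32_t c) {
    return c == 0x9 || c == 0xA || c == 0xD ||
           (c >= 0x20 && c <= 0xD7FF) ||
           (c >= 0xE000 && c <= 0xFFFD) ||
           (c >= 0x10000 && c <= 0x10FFFF);
}

// Hypothetical stand-in for "can the output encoding encode this character?".
// US-ASCII is an assumption made only for this illustration.
inline bool fitsUsAscii(char32_t c) {
    return c <= 0x7F;
}

enum class CharClass { Ok, Unrepresentable, Illegal };

// Illegal:         not an XML 1.0 Char at all (for example 0x1B).
// Unrepresentable: a legal Char that the assumed output encoding cannot encode.
// Ok:              legal and directly encodable.
inline CharClass classify(char32_t c) {
    if (!isXml10Char(c)) return CharClass::Illegal;
    if (!fitsUsAscii(c)) return CharClass::Unrepresentable;
    return CharClass::Ok;
}

// Spot checks covering one character from each class.
inline void checkExamples() {
    assert(classify(U'c') == CharClass::Ok);
    assert(classify(0xE9) == CharClass::Unrepresentable);  // U+00E9, legal but not ASCII
    assert(classify(0x1B) == CharClass::Illegal);          // ESC, the character from this report
}
```

Under these assumptions, classify(0x1B) yields CharClass::Illegal while classify(0xE9) yields CharClass::Unrepresentable, which is the split the comment above is guessing at.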

> When XMLUni::fgDOMWRTSplitCdataSections is true (the default), invalid XML 
> characters are allowed by DOMWriter
> --
>
> Key: XERCESC-2239
> URL: https://issues.apache.org/jira/browse/XERCESC-2239
> Project: Xerces-C++
> Issue Type: Bug
> Components: DOM
> Affects Versions: 3.2.0
> Environment: Operating System: All
>              Platform: All
> Reporter: David Leffingwell
> Priority: Major
> Fix For: 3.2.4
>
> // Create a Document with a CDATA section that contains an invalid XML
> // character (e.g. 0x1b).
> // This should fail when serializing the Document, but it does not when
> // XMLUni::fgDOMWRTSplitCdataSections is true.
> struct XercesDeleter
> {
>     template <typename T>
>     void operator()(T* data) const
>     {
>         if (data) { data->release(); }
>     }
> };
> typedef std::unique_ptr<DOMLSSerializer, XercesDeleter> DOMWriterPtr;
> typedef std::unique_ptr<DOMDocument, XercesDeleter> DOMDocumentPtr;
> XMLPlatformUtils::Initialize();
> DOMImplementation* impl =
>     DOMImplementationRegistry::getDOMImplementation(XMLString::transcode("LS"));
> // Create DOM with a CDATA section
> DOMDocumentPtr document(impl->createDocument());
> DOMElement* element = document->createElementNS(
>     XMLString::transcode("http://schemas.openxmlformats.org/wordprocessingml/2006/main"),
>     XMLString::transcode("w:t"));
> document->appendChild(element);
> // 0x1B is not a valid XML 1.0 character
> DOMCDATASection* codesection = document->createCDATASection(XercesString("c = '\x1b';"));
> element->appendChild(codesection);
> DOMWriterPtr writer(impl->createLSSerializer());
> writer->writeToString(document.get());



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: c-dev-unsubscr...@xerces.apache.org
For additional commands, e-mail: c-dev-h...@xerces.apache.org



[jira] [Commented] (XERCESC-2239) When XMLUni::fgDOMWRTSplitCdataSections is true (the default), invalid XML characters are allowed by DOMWriter

2022-10-05 Thread Scott Cantor (Jira)


[ 
https://issues.apache.org/jira/browse/XERCESC-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17613048#comment-17613048
 ] 

Scott Cantor commented on XERCESC-2239:
---

I'll at least take a look but I doubt I would have the confidence to make a fix 
to this.




[jira] [Commented] (XERCESC-2239) When XMLUni::fgDOMWRTSplitCdataSections is true (the default) invalid XML characters are allowed by DOMWriter

2022-09-07 Thread David Leffingwell (Jira)


[ 
https://issues.apache.org/jira/browse/XERCESC-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601489#comment-17601489
 ] 

David Leffingwell commented on XERCESC-2239:


It looks like ensureValidString() (or something equivalent) is not being 
called for DOMNode::CDATA_SECTION_NODE.

https://github.com/apache/xerces-c/blob/fc1f7d3a41328e978d7f517193367af8966a40f8/src/xercesc/dom/impl/DOMLSSerializerImpl.cpp
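
As a rough illustration of the kind of check being asked for on CDATA content, the sketch below scans a UTF-16 string for the first code unit that cannot be part of a legal XML 1.0 character. It is not the Xerces-C++ implementation of ensureValidString(): firstInvalidXml10 and checkCdataExample are hypothetical names, and std::u16string stands in for an XMLCh buffer on the assumption that XMLCh holds UTF-16 code units.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the index of the first UTF-16 code unit that does not belong to a
// legal XML 1.0 character, or std::u16string::npos if the whole string is legal.
inline std::size_t firstInvalidXml10(const std::u16string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t u = s[i];
        if (u >= 0xD800 && u <= 0xDBFF) {
            // A high surrogate is only legal when a low surrogate follows;
            // the pair then encodes a character in [#x10000-#x10FFFF].
            if (i + 1 < s.size() && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
                ++i;  // skip the low half of the pair
                continue;
            }
            return i;
        }
        if (u >= 0xDC00 && u <= 0xDFFF) {
            return i;  // low surrogate without a preceding high surrogate
        }
        // Remaining BMP ranges of the Char production.
        const bool legal = u == 0x9 || u == 0xA || u == 0xD ||
                           (u >= 0x20 && u <= 0xD7FF) ||
                           (u >= 0xE000 && u <= 0xFFFD);
        if (!legal) {
            return i;
        }
    }
    return std::u16string::npos;
}

// Spot check using the CDATA text from the report (0x1B between the quotes).
inline void checkCdataExample() {
    assert(firstInvalidXml10(u"c = '\x1b';") == 5);
    assert(firstInvalidXml10(u"c = 'x';") == std::u16string::npos);
}
```

A serializer applying a check like this to CDATA sections could report an error at the returned index instead of writing the character out; whether that is the right behavior when the split feature is enabled is the open question in this issue.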
