[
https://issues.apache.org/jira/browse/DAFFODIL-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Lawrence updated DAFFODIL-3074:
-------------------------------------
Labels: beginner (was: )
> stringAsXML always creates empty elements
> -----------------------------------------
>
> Key: DAFFODIL-3074
> URL: https://issues.apache.org/jira/browse/DAFFODIL-3074
> Project: Daffodil
> Issue Type: Bug
> Components: Back End
> Reporter: Steve Lawrence
> Priority: Major
> Labels: beginner
>
> When using the stringAsXml feature, if the input XML contains something like
> <foo></foo>, this will be written to the infoset as <foo />, i.e. a
> self-closing empty element. Although stringAsXml does not guarantee that it
> will not change the XML in some way, where possible we try to keep the XML
> exactly the same. It would be nice if we could maintain empty vs non-empty
> elements as well.
> The issue seems to be that when the XMLStreamReader sees START_ELEMENT and
> END_ELEMENT events, we call writeStartElement() and writeEndElement()
> functions respectively:
> https://github.com/apache/daffodil/blob/main/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/infoset/XMLTextInfosetInputter.scala#L145-L159
> And it seems Woodstox automatically collapses elements with no content into
> empty elements (e.g. <foo />).
> The normal XMLStreamReader and XMLStreamWriter APIs do not seem to have a
> way to really control this. The XMLStreamReader sees the same START/END
> events regardless of whether the element is empty or not, and the
> XMLStreamWriter does not have a way to specify whether an element should be
> written as empty or not.
> But I think the Woodstox XMLStreamReader2 and XMLStreamWriter2 APIs do
> provide enough information. Based on the API and skimming code, I think these
> are the changes that need to be made, though they haven't been tested:
> 1. In the XMLTextInfosetInputter and XMLTextInfosetOutputter, when we call
> createXMLStreamReader or createXMLStreamWriter, we cast the result to the
> Woodstox XMLStreamReader2 and XMLStreamWriter2 interfaces. This gives us
> access to the additional API functions we need.
> 2. Modify the writeXMLStreamEvent in XMLTextInfosetInputter so that the
> START_ELEMENT logic is something like this:
> {code:scala}
> if (xsr.isEmptyElement()) {
> xsw.writeEmptyElement(...)
> } else {
> xsw.writeStartElement(...)
> }
> ... // existing namespace/attribute code
> if (xsr.isEmptyElement()) {
> // skip the END_ELEMENT event since writeEmptyElement ends the element
> xsr.next()
> }
> {code}
> So we call writeEmptyElement or writeStartElement depending on whether the
> element is empty or not. And if the element is empty, then we also call
> xsr.next() to skip the END_ELEMENT event to avoid calling writeEndElement for
> it.
> 3. Modify the END_ELEMENT logic so it calls xsw.writeFullEndElement()
> instead of xsw.writeEndElement(). This forces the writer to emit a separate
> closing tag instead of collapsing the element into an empty one.
> So now END_ELEMENT ensures we always write an opening and closing tag, and
> START_ELEMENT handles the case where the element is empty and ensures
> END_ELEMENT is skipped.
>
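The three steps above can be sketched as a small standalone copy loop. This is only an illustration under assumptions, not the Daffodil change itself: the object name EmptyElementSketch and the copy helper are hypothetical, Woodstox is assumed to be on the classpath, and namespaces and prefixes are ignored for brevity (the real writeXMLStreamEvent also handles those).

```scala
import java.io.{ StringReader, StringWriter }
import javax.xml.stream.XMLStreamConstants.{ CHARACTERS, END_ELEMENT, START_ELEMENT }

import com.ctc.wstx.stax.{ WstxInputFactory, WstxOutputFactory }
import org.codehaus.stax2.{ XMLStreamReader2, XMLStreamWriter2 }

// Hypothetical helper for illustration only; not part of Daffodil.
object EmptyElementSketch {

  def copy(xml: String): String = {
    // Step 1: cast the Woodstox reader/writer to the Stax2 interfaces.
    val xsr = new WstxInputFactory()
      .createXMLStreamReader(new StringReader(xml))
      .asInstanceOf[XMLStreamReader2]
    val sw = new StringWriter()
    val xsw = new WstxOutputFactory()
      .createXMLStreamWriter(sw)
      .asInstanceOf[XMLStreamWriter2]

    while (xsr.hasNext) {
      xsr.next() match {
        case START_ELEMENT =>
          // Step 2: pick the write call based on how the element was written
          // in the input. Must be checked while still on START_ELEMENT.
          val empty = xsr.isEmptyElement()
          if (empty) xsw.writeEmptyElement(xsr.getLocalName)
          else xsw.writeStartElement(xsr.getLocalName)
          // Simplified attribute copy (assumes no namespaces or prefixes).
          for (i <- 0 until xsr.getAttributeCount)
            xsw.writeAttribute(xsr.getAttributeLocalName(i), xsr.getAttributeValue(i))
          // writeEmptyElement already ends the element, so skip END_ELEMENT.
          if (empty) xsr.next()
        case END_ELEMENT =>
          // Step 3: always write an explicit closing tag.
          xsw.writeFullEndElement()
        case CHARACTERS =>
          xsw.writeCharacters(xsr.getText)
        case _ =>
          // Comments, processing instructions, etc. are ignored in this sketch.
      }
    }
    xsw.close()
    sw.toString
  }
}
```

With this loop, an input such as <a><b></b></a> should keep the explicit </b> closing tag, while <a><b/></a> should stay a self-closing element. Exact whitespace inside the self-closing tag depends on the Woodstox output settings.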
--
This message was sent by Atlassian Jira
(v8.20.10#820010)