By "empty XML nodes," do you mean whitespace-only string nodes? Those are
included because all in-element whitespace is assumed significant by the spec:
https://www.w3.org/TR/xml/#sec-white-space
The exception is if the element is declared in the DTD as only having element
children ("element content"): https://www.w3.org/TR/xml/#dt-elemcontent
For example, if you declare an element like this:
    <!ELEMENT one (two,three*,four?)>
then any whitespace around a "two," "three," or "four" element child of a "one"
element is insignificant and ignored (unless #preservesIgnorableWhitespace: is
true). Other parsers, like LibXML2 and Xerces, behave the same way.
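
To make that concrete, here is a rough sketch you could try in a playground. It assumes the XML-Parser package is loaded and that XMLDOMParser class>>parse:, #root, and #nodes are available under those names (they are my assumptions, not something stated above); the EMPTY declarations for "two," "three," and "four" are only there to make the sample document valid.

```smalltalk
"Sketch only: parse:, root and nodes are assumed selectors from the
 Pharo XML-Parser package; check them in your image."
| xml document |
xml := '<!DOCTYPE one [
  <!ELEMENT one (two,three*,four?)>
  <!ELEMENT two EMPTY>
  <!ELEMENT three EMPTY>
  <!ELEMENT four EMPTY>
]>
<one>
  <two/>
  <three/>
</one>'.
document := XMLDOMParser parse: xml.
"Look at the children of <one>. Because <one> is declared with element
 content, the whitespace between its children should not show up as
 string nodes."
document root nodes inspect.
```

If you delete the DOCTYPE from the string and parse again, the whitespace-only string nodes between the elements should reappear, since nothing then tells the parser that the whitespace is ignorable.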
I'll see if I can come up with some easier way to deal with this, like an
optional parser setting, new enumeration methods, or maybe a tree
transformation.
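
In the meantime, a tree transformation can be done by hand after parsing. This is only a sketch of the idea, not the feature mentioned above: it reuses the #isStringNode, #isEmpty, and #isWhitespace tests from your snippet, and assumes #allNodesDo:, #parent, and #removeNode: exist with these names in the XML-Parser package.

```smalltalk
"Sketch only: allNodesDo:, parent and removeNode: are assumed selectors;
 the three tests come from the snippet quoted below."
| document blanks |
document := XMLDOMParser parse: '<one> <two/> <three/> </one>'.

"Collect first, remove afterwards, so the tree is not modified
 while it is being enumerated."
blanks := OrderedCollection new.
document allNodesDo: [:each |
	(each isStringNode
		and: [each isEmpty or: [each isWhitespace]])
			ifTrue: [blanks add: each]].
blanks do: [:each | each parent removeNode: each].
```

Keep in mind that this also drops whitespace that may matter in mixed content, for example a space between two inline elements, so it is best applied only to documents where you know the whitespace is just indentation.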
> Sent: Tuesday, December 05, 2017 at 8:29 AM
> From: "Stephane Ducasse" <[email protected]>
> To: "Pharo Development List" <[email protected]>
> Subject: [Pharo-dev] How to get rid of empty XML nodes?
>
> Hi
>
> we are manipulating an XML document and I would like to get rid of the
> spurious empty string.
> We saw that the gt panes are doing it.
>
> (aNodeWithElements isStringNode
> and: [aNodeWithElements isEmpty
> or: [aNodeWithElements isWhitespace]])
>
> Is there a way not to produce empty nodes?
> Is there a simple way not to have to handle them?
>
> Now each time we are dealing with a node we have to check.
>
> Stef
>
>