Re: [Pharo-dev] How to get rid of empty XML nodes?

Stephane Ducasse Fri, 08 Dec 2017 06:23:30 -0800

Hi monty

On Fri, Dec 8, 2017 at 9:03 AM, monty <[email protected]> wrote:
> By "empty XML nodes," do you mean whitespace-only string nodes?

Yes

> Those are included because all in-element whitespace is assumed significant 
> by the spec: https://www.w3.org/TR/xml/#sec-white-space

I know. There was a discussion a while ago. I just lost a couple of
hours understanding that :(

But this is a super super super annoying practice.
We had to test each node to see whether it is an empty node, so it makes
everything a lot more complex without real justification,
besides the fact that these standardizers probably never implemented
any real cases.
This standard is really out of touch with reality from that perspective.

> The exception is if the element is declared in the DTD as only having element 
> children ("element content"): https://www.w3.org/TR/xml/#dt-elemcontent

Well, the XML files that I had (I did not choose XML; I would
have preferred JSON :) ) had no DTD :(

So at the end of the day, this wonderful standard puts all the stress
and burden on people.

>
> For example, if you declare an element like this:
>
> <!ELEMENT one (two,three*,four?)>
>
> Any whitespace around a "two," "three," or "four" element child of a "one" 
> element is insignificant and ignored (unless #preservesIgnorableWhitespace: 
> is true). Other parsers, like LibXML2 and Xerces, behave the same way.
>
> I'll see if I can come up with some easier way to deal with this, like an 
> optional parser setting, new enumeration methods, or maybe a tree 
> transformation.

It would be A HUGE PLUS!!!!!!!!!!!!!!!!!!

Because the reality is that people have XML files with just nodes and no
empty nodes, and they are still forced to test for them.
Let me know, because I could try.
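
For what it is worth, the tree transformation idea could be sketched roughly as follows. It reuses the isStringNode / isEmpty / isWhitespace test quoted at the bottom of this message. The class-side parse: and the allElementsDo:, nodes and removeNode: selectors are assumptions about the XMLParser API and should be checked in the image; the sample XML is made up for illustration.

```smalltalk
"Sketch only: parse:, allElementsDo:, nodes and removeNode: are assumed
 to exist in the XMLParser package loaded in the image."
| doc |
doc := XMLDOMParser parse: '<one>
  <two>text</two>
  <three/>
</one>'.

"Visit every element and drop its empty or whitespace-only string children.
 asArray takes a snapshot of the children, so removing nodes inside the
 loop does not disturb the iteration."
doc allElementsDo: [:element |
    element nodes asArray do: [:each |
        (each isStringNode
            and: [each isEmpty or: [each isWhitespace]])
                ifTrue: [element removeNode: each]]].
```

Run once right after parsing, something like this would let the rest of the code walk the tree without testing each node. It should not be applied to documents with mixed content, where whitespace-only text can be meaningful.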

I was showing Pharo learners how to use Pharo to import code, and
this was a big drag.

Stef

I tried to set some values in the parser, but it did not work.
BTW, I saw that the configuration logic forces one to write the following

| parser doc visitor |
parser := XMLDOMParser new
   on: self xmlContents;
   preservesIgnorableWhitespace: true.

and not

| parser doc visitor |
parser := XMLDOMParser new
    preservesIgnorableWhitespace: true;
    on: self xmlContents.

>
>> Sent: Tuesday, December 05, 2017 at 8:29 AM
>> From: "Stephane Ducasse" <[email protected]>
>> To: "Pharo Development List" <[email protected]>
>> Subject: [Pharo-dev] How to get rid of empty XML nodes?
>>
>> Hi
>>
>> We are manipulating an XML document and I would like to get rid of the
>> spurious empty strings.
>> We saw that the gt panes are doing it.
>>
>> (aNodeWithElements isStringNode
>> and: [aNodeWithElements isEmpty
>> or: [aNodeWithElements isWhitespace]])
>>
>> Is there a way not to produce empty nodes?
>> Is there a simple way not to have to handle them?
>>
>> Now each time we are dealing with a node we have to check.
>>
>> Stef
>>
>>
>
