On Mon, 2019-12-09 at 20:27 +0100, Arjan Loeffen wrote:
>
> In general: when the wiki states here: "Many XML documents include
> whitespaces that have been added to improve readability.", this
> should not apply to mixed content fragments as described. Only to
> start and end of "text content of elements", not on text nodes.
> I therefore also think that the second approach is not exactly in
> line with the *intention* of the XML standard.

It isn't, but some of the earliest XML parsers had the option to drop
white-space-only text nodes (e.g. MSXML, I think) because XML was being
used in data contexts.

The intent was that a DTD could be used to determine which spaces to
ignore, but then DTDs became optional. A parser without a DTD does not
know which elements _could_ contain text, and hence doesn't know what
to drop.

In addition, markup like

  <person>
    <name> Nigel </name>
    <obedience> 0.4 </obedience>
  </person>

is common, unfortunately.

In SGML this worked, but the whitespace rules were complex enough that
they were a constant source of trouble.

Liam

-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations: http://www.fromoldbooks.org
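
A minimal Python sketch of the point about parsers that have no DTD, using the standard library's xml.dom.minidom (a non-validating DOM, chosen here only for illustration). The helper names drop_whitespace_only_text and text_of, and the mixed-content sample, are assumptions made up for this example and do not come from the thread. It shows that white-space-only text nodes are kept by default, and that dropping them blindly suits data-style markup but damages mixed content.

```python
from xml.dom import minidom

# Data-style markup from the message, and a hypothetical mixed-content sample.
DATA_XML = "<person> <name> Nigel </name> <obedience> 0.4 </obedience> </person>"
MIXED_XML = "<p><b>Hello</b> <i>world</i></p>"

XML_WHITESPACE = " \t\r\n"  # the four characters XML 1.0 treats as white space


def drop_whitespace_only_text(node):
    """Recursively remove text nodes that consist only of XML white space."""
    for child in list(node.childNodes):  # copy: we mutate while iterating
        if child.nodeType == child.TEXT_NODE:
            if child.data.strip(XML_WHITESPACE) == "":
                node.removeChild(child)
        else:
            drop_whitespace_only_text(child)


def text_of(node):
    """Concatenate all descendant text in document order."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(text_of(c) for c in node.childNodes)


# Without a DTD the parser cannot tell that <person> holds only elements,
# so the spaces between the child elements are reported as text nodes.
person = minidom.parseString(DATA_XML).documentElement
nodes_before = len(person.childNodes)
drop_whitespace_only_text(person)
nodes_after = len(person.childNodes)

# The same blanket rule removes a significant space in mixed content.
para = minidom.parseString(MIXED_XML).documentElement
mixed_before = text_of(para)
drop_whitespace_only_text(para)
mixed_after = text_of(para)

print(nodes_before, nodes_after)
print(repr(mixed_before), repr(mixed_after))
```

Note that the leading and trailing spaces inside <name> and <obedience> survive in this sketch, since those text nodes are not white-space-only; trimming them would be a separate, application-level decision.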