On 2013-04-05, Michael Seiferle <[email protected]> wrote:

> As chopping does not change any semantics (at least with regards to
> what XML thinks of semantically important) but only aesthetics this is
> enabled by default.

I'm sorry to disagree, but chopping certainly *does* change the semantics--that's precisely why I've argued before that it shouldn't be on by default. The problem becomes obvious with mixed content, e.g., with chopping enabled

  <doc>
    <p>Lorem ipsum <em>dolor</em> <x>sit</x> amet ...</p>
  </doc>

becomes

  <doc>
    <p>Lorem ipsum<em>dolor</em><x>sit</x>amet ...</p>
  </doc>

which is *not* the same, and AFAICT this is not conforming behavior (and BaseX doesn't honor xml:space either).

I do understand that whitespace chopping as currently implemented is useful for some data-oriented applications, even if it is not conforming, but by default, the behavior should conform to the XML standard.

Best regards

-- 
Dr.-Ing. Michael Piotrowski, M.A. <[email protected]>
Institute of Computational Linguistics, University of Zurich
Phone +41 44 63-54313 | OpenPGP public key ID 0x1614A044

* OUT NOW: Natural Language Processing for Historical Texts *
<http://morganclaypool.com/doi/abs/10.2200/S00436ED1V01Y201207HLT017>

_______________________________________________
BaseX-Talk mailing list
[email protected]
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
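
For anyone who wants to check the mixed-content example without BaseX, here is a minimal Python sketch. It does not run BaseX or its chopping code: the "chopped" string is simply the output quoted in the example, and the helper name `string_value` is made up for illustration. It compares the text content of the `<p>` element before and after chopping.

```python
import xml.etree.ElementTree as ET

# Input document from the example (whitespace between inline elements intact).
ORIGINAL = "<doc> <p>Lorem ipsum <em>dolor</em> <x>sit</x> amet ...</p> </doc>"

# Result quoted in the example for chopping enabled. Assumption: copied by
# hand from the message, not produced by BaseX here.
CHOPPED = "<doc><p>Lorem ipsum<em>dolor</em><x>sit</x>amet ...</p></doc>"


def string_value(xml_text: str, tag: str) -> str:
    """Concatenate all text below the first child element named `tag`."""
    root = ET.fromstring(xml_text)
    element = root.find(tag)
    return "".join(element.itertext())


original_text = string_value(ORIGINAL, "p")
chopped_text = string_value(CHOPPED, "p")

print(repr(original_text))  # 'Lorem ipsum dolor sit amet ...'
print(repr(chopped_text))   # 'Lorem ipsumdolorsitamet ...'
```

The two strings differ, so any query that looks at the text of `<p>` sees different data depending on whether chopping was on at import time.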

