Whitespace in certain places isn't reported by the XML parser to the XQuery 
processor, so there is no way the XQuery processor can preserve it. Examples 
are whitespace between the XML declaration and the first element node, and 
whitespace within a start or end tag.
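
As a rough illustration (using Python's standard xml.etree.ElementTree in 
place of an XQuery processor, which is my assumption rather than anything 
specific to XQuery), the sketch below parses two documents that differ only in 
whitespace after the XML declaration and inside tags. The helper name 
snapshot is hypothetical.

```python
import xml.etree.ElementTree as ET

def snapshot(elem):
    """Hypothetical helper: everything ElementTree retained for this element."""
    return (elem.tag, dict(elem.attrib), elem.text, elem.tail,
            [snapshot(child) for child in elem])

# Extra whitespace after the XML declaration and inside start/end tags.
spaced = '<?xml version="1.0"?>\n\n\n<doc   id="1"  ><item/></doc  >'
compact = '<?xml version="1.0"?><doc id="1"><item/></doc>'
# Whitespace in element content, by contrast, is reported as text.
padded = '<doc id="1"> <item/></doc>'

same_tree = snapshot(ET.fromstring(spaced)) == snapshot(ET.fromstring(compact))
content_differs = snapshot(ET.fromstring(padded)) != snapshot(ET.fromstring(compact))
print(same_tree, content_differs)
```

The first comparison should come out equal because the application never sees 
that whitespace; the second differs because whitespace between tags is 
ordinary character data.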

Other things that aren't reported by the parser (and therefore can't be 
retained) include the choice of single-vs-double quotes around attribute 
values, entity references, CDATA section boundaries, redundant namespace 
declarations, and the order of attributes within a start tag.
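
The same kind of sketch (again with Python's xml.etree.ElementTree standing 
in for the XQuery processor, and a hypothetical snapshot helper) shows several 
of these lexical choices disappearing after parsing: quote style, attribute 
order, a CDATA section versus an escaped character, and a redundant namespace 
declaration.

```python
import xml.etree.ElementTree as ET

def snapshot(elem):
    """Hypothetical helper: everything ElementTree retained for this element."""
    return (elem.tag, dict(elem.attrib), elem.text, elem.tail,
            [snapshot(child) for child in elem])

# Single quotes, attributes in b/a order, CDATA, and a repeated xmlns:p.
first = ("<doc xmlns:p=\"urn:example\" b='2' a=\"1\">"
         "<p:x xmlns:p=\"urn:example\"><![CDATA[1 < 2]]></p:x></doc>")
# Double quotes, attributes in a/b order, &lt; instead of CDATA, one xmlns:p.
second = ('<doc a="1" b="2" xmlns:p="urn:example">'
          '<p:x>1 &lt; 2</p:x></doc>')

tree_first = snapshot(ET.fromstring(first))
tree_second = snapshot(ET.fromstring(second))
lexical_choices_lost = tree_first == tree_second
print(lexical_choices_lost)
```

Attribute dictionaries compare equal regardless of order, and the CDATA 
section and the entity reference both arrive as the same three characters of 
text, so nothing downstream can tell the two inputs apart.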

Using textual diff tools on XML documents isn't really a viable strategy: you 
need to do the diff in a way that is XML-aware. One way is to canonicalize the 
two documents and compare their canonical forms. Canonical XML takes a very 
similar view to XDM, though not 100% identical, as to what's significant in 
an XML document and what isn't.
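
A minimal sketch of the canonicalize-and-compare approach, assuming Python 3.8 
or later, where xml.etree.ElementTree.canonicalize produces C14N 2.0 output. 
The sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Lexically different serializations of what should be the same document.
doc_a = """<?xml version="1.0"?>

<doc b='2' a="1"  ><x><![CDATA[1 < 2]]></x><y></y></doc>"""
doc_b = '<doc a="1" b="2"><x>1 &lt; 2</x><y/></doc>'
# A real difference: an attribute value has changed.
doc_c = '<doc a="1" b="3"><x>1 &lt; 2</x><y/></doc>'

canon_a = ET.canonicalize(doc_a)
canon_b = ET.canonicalize(doc_b)
canon_c = ET.canonicalize(doc_c)

equivalent = canon_a == canon_b   # only lexical differences
changed = canon_b != canon_c      # a significant difference survives
print(equivalent, changed)
```

Once both documents are in canonical form, an ordinary textual diff of the 
canonical strings only shows differences that canonicalization treats as 
significant.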

Michael Kay
Saxonica