Stefan Behnel <stefan...@behnel.de> added the comment:

Turns out, it was not that easy. :-/

ElementTree lacks prefixes in its tree model, so they would have to be either 
registered globally (via register_namespace()) or come from the parser. I tried 
the latter since that is the most generic way when the input is serialised 
already. See issue 36673 and issue 36676 for extensions to the parser target 
interface that this implementation relies on. Note that this is a new 
implementation, only loosely based on the original ElementC14N 
implementation.
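
For illustration, the two options could look roughly like this. It is a minimal sketch, assuming Python 3.8+ where the parser target interface accepts a start_ns() callback (the extension from issue 36676). The PrefixCollector class, the "ex" prefix and the example URIs are made up for the example and are not part of the patch.

```python
import xml.etree.ElementTree as ET

# Option 1: register a prefix globally. The serialiser then uses it for
# every element in that namespace, whatever the input document used.
ET.register_namespace("ex", "http://example.com/ns")
root = ET.Element("{http://example.com/ns}root")
serialised = ET.tostring(root, encoding="unicode")


# Option 2: let the parser report the prefixes it sees.
# PrefixCollector is a hypothetical parser target for this sketch.
class PrefixCollector:
    def __init__(self):
        self.declared = []

    def start_ns(self, prefix, uri):
        # Called for each namespace declaration, before the start()
        # event of the element that carries it.
        self.declared.append((prefix, uri))

    def start(self, tag, attrib):
        pass

    def end(self, tag):
        pass

    def data(self, text):
        pass

    def close(self):
        return self.declared


parser = ET.XMLParser(target=PrefixCollector())
parser.feed('<a:doc xmlns:a="urn:one" xmlns:b="urn:two"><b:item/></a:doc>')
declared = parser.close()  # the return value of the target's close()
```

The second option needs no global state, which is why it fits input that is already serialised.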

I only implemented C14N 2.0 (which lxml also does not have, but I'll add it 
there). I got most of the official test cases working, including prefix 
rewriting and prefix resolution in tag and attribute content.

https://www.w3.org/TR/xml-c14n2-testcases/
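
As a rough usage sketch, this assumes the function is exposed as xml.etree.ElementTree.canonicalize() with a rewrite_prefixes option, which is how it appears in Python 3.8 and later; the patch under review may differ in detail. The input documents are made up for the example.

```python
import xml.etree.ElementTree as ET

# Two serialisations of the same document that differ in attribute
# order, quoting, whitespace inside the tag and empty-element syntax.
doc_a = '<doc b="2" a="1"/>'
doc_b = "<doc a='1'   b='2' ></doc>"

canon_a = ET.canonicalize(doc_a)
canon_b = ET.canonicalize(doc_b)
# Both come out as: <doc a="1" b="2"></doc>

# Prefix rewriting replaces the original prefixes with generated ones.
prefixed = ET.canonicalize(
    '<x:doc xmlns:x="urn:example"/>', rewrite_prefixes=True
)
```

Comparing canon_a and canon_b is the typical use: documents that are equivalent at the XML level end up as identical text.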

What's not supported?

The original namespace prefixes may not be preserved when a namespace is 
declared with multiple prefixes. In that case, one of them is picked. Preserving 
all of them is difficult to implement in ET because the parser resolves and 
discards prefixes. I think that's acceptable, as long as the prefix selection 
is deterministic.
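
To show why the prefixes are lost, here is a small sketch using the plain ElementTree parser. The document and the urn:x namespace are made up for the example.

```python
import xml.etree.ElementTree as ET

# The same namespace is bound to two prefixes, "a" and "b".
source = '<r xmlns:a="urn:x" xmlns:b="urn:x"><a:c/><b:c/></r>'
root = ET.fromstring(source)

# The tree only stores "{uri}localname", so both children end up with
# the same tag and nothing records which prefix each one used.
tags = [child.tag for child in root]
```

After parsing, both entries in tags are '{urn:x}c', so a serialiser working from the tree has to choose one prefix for both.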

Also, qname rewriting in XPath expressions that appear in XML text is not 
currently supported. I guess that's a bit of an esoteric feature which can 
still be added later if it's needed.

While testing, I noticed that ET and cET behave differently when it comes to 
resolving default attributes from an internal DTD subset. The parser in cET 
does it, the one in ET does not. That should probably get aligned. For now, 
the tests work around that difference.

Comments and reviews welcome.

----------
assignee: serhiy.storchaka -> scoder

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue13611>
_______________________________________