Much of the C14N spec can be implemented at the SAX level while the document
is being parsed, and this will probably be the most performant way of reading
a document into dom4j in C14N form.

e.g. all of the following could be done as an XMLFilter (I think), since they
are mostly concerned with text encoding (merging adjacent text, entity
references and CDATA sections) and whitespace handling:-

> - The document is encoded in UTF-8 (I think that is the dom4j standard)
> - Line breaks are normalized to #xA on input, before parsing
> - Character and parsed entity references are replaced
> - CDATA sections are replaced with their character content
> - The XML declaration and document type declaration (DTD) are removed
> - Empty elements are converted to start-end tag pairs
> - Whitespace outside of the document element and within start and end tags
>     is normalized
> - All whitespace in character content is retained (excluding characters
>     removed during line feed normalization)
> - Attribute value delimiters are set to quotation marks (double quotes)
> - Special characters in attribute values and character content are
>     replaced by character references
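To make the text-merging part of the list above concrete, here is a minimal
sketch of such a filter using only the JDK SAX classes. The class name
TextMergingFilter and the textChunks helper are my own (hypothetical), not
part of dom4j. It relies on two things: the parser already expands character
and internal entity references into characters() events, and XMLFilterImpl
does not forward LexicalHandler events, so CDATA boundaries never reach the
downstream handler. It only covers merging of adjacent text, not whitespace
normalization or escaping.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical filter: buffers text and emits it as one characters() event. */
class TextMergingFilter extends XMLFilterImpl {
    private final StringBuilder pending = new StringBuilder();

    TextMergingFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Hold text back; entity and CDATA content arrives here as plain text.
        pending.append(ch, start, length);
    }

    /** Sends any buffered text downstream as a single event. */
    private void flush() throws SAXException {
        if (pending.length() > 0) {
            char[] text = pending.toString().toCharArray();
            pending.setLength(0);
            super.characters(text, 0, text.length);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        flush();
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        flush();
        super.endElement(uri, localName, qName);
    }

    @Override
    public void processingInstruction(String target, String data)
            throws SAXException {
        flush();
        super.processingInstruction(target, data);
    }

    /** Demo helper: returns the text events seen downstream of the filter. */
    static List<String> textChunks(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader parser = factory.newSAXParser().getXMLReader();

        final List<String> chunks = new ArrayList<>();
        TextMergingFilter filter = new TextMergingFilter(parser);
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                chunks.add(new String(ch, start, length));
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return chunks;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(textChunks("<a>x &amp; <![CDATA[y < z]]> w</a>"));
    }
}
```

In principle a filter like this could be placed in front of whatever
ContentHandler builds the dom4j tree, so the tree never sees split text
nodes or CDATA sections in the first place.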

The following things are much harder and would probably need to be done
directly on a dom4j document:-

> - Superfluous namespace declarations are removed from each element
> - Default attributes are added to each element
> - Lexicographic order is imposed on the namespace declarations and
>     attributes of each element
> - Attribute values are normalized, as if by a validating processor

Some or all of the above could be added to a subclass of SAXReader (or
SAXContentHandler). For example, I think duplicate namespace declarations
are already removed by SAXReader (though this should be tested). I'm not
sure what is involved in the last 2, though; do they require DTD knowledge?
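As far as the ordering step goes, it needs no DTD knowledge by itself: the
C14N spec sorts attributes by namespace URI first and local name second,
with attributes in no namespace coming first. A rough sketch follows. The
Attr class is a hypothetical stand-in, not a dom4j type, and
String.compareTo is a simplification, since the spec compares by Unicode
code point and the two can differ for characters outside the BMP.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class C14nAttributeOrder {

    /** Hypothetical stand-in for an attribute; not a dom4j class. */
    static final class Attr {
        final String namespaceUri; // "" when the attribute has no namespace
        final String localName;
        final String value;

        Attr(String namespaceUri, String localName, String value) {
            this.namespaceUri = namespaceUri;
            this.localName = localName;
            this.value = value;
        }
    }

    /**
     * Namespace URI is the primary key, local name the secondary key.
     * The empty string sorts before any non-empty URI, so attributes
     * without a namespace come first.
     */
    static final Comparator<Attr> ORDER =
            Comparator.comparing((Attr a) -> a.namespaceUri)
                      .thenComparing((Attr a) -> a.localName);

    /** Returns a sorted copy and leaves the input list untouched. */
    static List<Attr> sorted(List<Attr> attrs) {
        List<Attr> copy = new ArrayList<>(attrs);
        copy.sort(ORDER);
        return copy;
    }
}
```

Note that the prefix plays no part in the comparison, so b:attr can sort
before a:attr if the URI bound to b is lexicographically smaller. Namespace
declarations would be sorted separately (by prefix, default namespace first)
and written before the attributes.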

From a usability perspective, it might make sense to put all this
functionality into a single 'Canonicaliser' that is capable of performing a
C14N of any dom4j Document, however it was created (via SAX, via DOM or
programmatically).

James

