Re: atom:type, xsl:output

Henri Sivonen Thu, 26 May 2005 01:18:35 -0700


On May 26, 2005, at 06:23, James Cerra wrote:

Henri Sivonen,

If the intended Atom content contains essential comments,


There should be no such thing as essential comments. The XML spec does
not require XML processors to report comments to the app. Hence,
comments are inappropriate for transferring essential data.
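
A minimal sketch of the point above, assuming Python's standard xml.sax module as the XML processor (the class name Recorder and the sample document are made up for illustration). A plain ContentHandler has no callback for comments, so the application never sees them:

```python
import xml.sax

class Recorder(xml.sax.ContentHandler):
    """Hypothetical handler that records the events a ContentHandler receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        # Skip whitespace-only text so the event list stays readable.
        if content.strip():
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        self.events.append(("end", name))

# Sample document (an assumption) carrying a comment next to real content.
doc = b"<entry><!-- essential? --><title>Hello</title></entry>"
recorder = Recorder()
xml.sax.parseString(doc, recorder)
events = recorder.events  # only element and text events; the comment is absent
```

Seeing comments at all requires opting in to a separate lexical interface, which is why data carried only in a comment is easy to lose in transit.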

Yes, but MSIE^H^H^H^Hsome XML processors (cough cough) still inappropriately use comments for that purpose.

I am not familiar with that. What purpose exactly? Why should Atom support it?

Then there are example XML documents (e.g. for tutorials) that sometimes require comments be preserved.

If you want people to read the source as is, it would make sense to make it text/plain or put the source snippets in <code> or <pre> in an (X)HTML tutorial.

Finally, if you are posting content associated with an entry with the Atom API, then it is important that any documents uploaded are not modified.

I think Atom processors should not be required to preserve serialization artifacts. That is, IMO, they should not be required to care about anything that is not exposed via SAX2 ContentHandler with qNames ignored.
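
To illustrate what "exposed via ContentHandler with qNames ignored" leaves out, here is a rough sketch using Python's xml.sax in namespace mode as a stand-in for SAX2 (the helper events_of, the class Events and both sample documents are assumptions for illustration). Two serializations that differ in prefix, attribute order, quoting and inter-attribute white space produce the same event stream:

```python
import io
import xml.sax
from xml.sax.handler import feature_namespaces

class Events(xml.sax.ContentHandler):
    """Hypothetical handler recording namespace-aware events, ignoring qNames."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElementNS(self, name, qname, attrs):
        # name is (namespace URI, local name); qname (the prefixed form) is dropped.
        # A dict makes the comparison independent of attribute order.
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        self.events.append(("text", content))

    def endElementNS(self, name, qname):
        self.events.append(("end", name))

def events_of(document):
    """Parse bytes with namespace processing on and return the recorded events."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = Events()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(document))
    return handler.events

doc_a = b'<x:e xmlns:x="urn:example" a="1" b="2">t</x:e>'
doc_b = b"<y:e   b='2'    a='1' xmlns:y='urn:example'>t</y:e>"
same = events_of(doc_a) == events_of(doc_b)  # the serialization artifacts vanish
```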

processing instructions,
These could be supported in embedded content if the Atom spec said PIs
in atom:content belong in content and should not be acted upon by the
Atom processor.
I don't like this condition. Any content will be read by XML processors before being handed to an Atom processor. So the generic XML processor will not be able to differentiate between PIs that are significant - and thus should be processed by their processors - from those which should be passed through.

Vanilla XML processors don't act on PIs. They report them to the application--in this case the Atom processor.
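
As a small sketch of that behavior, assuming Python's xml.sax as the vanilla processor (the PI target "render", the class PICollector and the sample markup are invented for illustration): the parser does nothing with the PI itself and simply hands the target and data to the application's handler.

```python
import xml.sax

class PICollector(xml.sax.ContentHandler):
    """Hypothetical application-side handler that just collects reported PIs."""

    def __init__(self):
        super().__init__()
        self.pis = []

    def processingInstruction(self, target, data):
        # The parser reports the PI; deciding what to do with it is up to the app.
        self.pis.append((target, data))

doc = b'<content><?render mode="compact"?><p>text</p></content>'
collector = PICollector()
xml.sax.parseString(doc, collector)
pis = collector.pis
```

Here the application, standing in for an Atom processor, is free to pass the collected PIs through untouched.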

entities
Entities can be flattened.
Again, as with comments, I agree in principle, but in practice some processors depend on them.

Depending on an entity reference and not being able to accept the straight replacement text is just wrong.

DOCTYPE declaration(s)
If DOCTYPE is essential for the receiving app, you've got bigger
problems than Atom. Hardwiring IDness based on namespaces is more
practical than relying on DTD-based data typing.
I'm not necessarily talking about external document type definitions.
Preventing DOCTYPEs would cause incompatibilities with XML well-formedness processing.  From the spec:
"[Definition: While they are not required to check the document for validity, they are REQUIRED to process all the declarations they read in the internal DTD subset and in any parameter entity that they read, up to the first reference to a parameter entity that they do not read; that is to say, they MUST use the information in those declarations to normalize attribute values, include the replacement text of internal entities, and supply default attribute values.]"

That's a non-issue. You don't just throw away the DOCTYPE but parse the original document with it and reserialize as a DOCTYPEless fragment. You don't lose well-formedness or content. You only lose the shallow attribute data typing provided by the DTD.
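
A minimal sketch of that parse-then-reserialize step, assuming Python's xml.etree.ElementTree as the tool (the entity name "who" and the sample document are made up). The internal subset is used while parsing, so the entity reference is replaced by its text, and the reserialized fragment carries no DOCTYPE:

```python
import xml.etree.ElementTree as ET

# Sample source (an assumption) with an internal DTD subset declaring one entity.
source = '<!DOCTYPE note [<!ENTITY who "Atom">]><note>Hello &who;</note>'

# Parsing applies the internal subset: &who; is expanded to its replacement text.
root = ET.fromstring(source)

# Reserializing the element yields a DOCTYPE-less fragment with the same content.
flattened = ET.tostring(root, encoding="unicode")
```

This sketch only covers entity flattening; it does not try to demonstrate attribute defaulting or ID typing from the DTD.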

Some RDF/XML processors already put internal DOCTYPE declarations to make the content more readable.

Atom's main purpose is to facilitate software-to-software communication. When interop or implementation ease and readability conflict, readability should yield.

There is nothing intrinsically bad about DOCTYPE sections

Yes, there is. On the Web you cannot trust that the recipient is using an XML processor that processes the DOCTYPE beyond checking the internal subset for well-formedness. An optional feature that does not degrade gracefully when not supported is bad. DOCTYPE is such a feature. In the cases where the DOCTYPE can be gracefully ignored, the DOCTYPE is pointless.

Remember that anything compatible with XML MUST allow its optional features to be in the markup (even if they are just ignored). So if you disallow DOCTYPE sections, then you can't claim to support XML.

Atom is not claiming that you could embed the literal source of any XML doc. You can embed the stuff that canonicalization would preserve.

Anything that could have any chance of data loss - in this case, the loss of XML comments, PIs, and XML and DOCTYPE declarations - is bad.

Would you consider changes in the order of attributes and the whitespace between attributes data loss as well?


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/
