On May 26, 2005, at 06:23, James Cerra wrote:

Henri Sivonen,

If the intended Atom content contains essential comments,

There should be no such thing as essential comments. The XML spec does
not require XML processors to report comments to the app. Hence,
comments are inappropriate for transferring essential data.
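
A minimal sketch of the point above, assuming Python's standard-library xml.etree.ElementTree as the XML processor (the sample markup is made up for illustration). Its default parser never hands comments to the application, so a round trip silently drops them:

```python
import xml.etree.ElementTree as ET

# Hypothetical entry content carrying a comment someone considers "essential".
source = '<content><!-- essential: do not drop --><p>Hello</p></content>'

# The default parser does not add comments to the tree it builds.
root = ET.fromstring(source)
round_tripped = ET.tostring(root, encoding="unicode")

print(round_tripped)  # the comment is gone; elements and text survive
```

Any data that must survive such a round trip would have to live in elements or attributes instead.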

Yes, but MSIE^H^H^H^Hsome XML processors (cough cough) still inappropriately
use comments for that purpose.

I am not familiar with that. What purpose exactly? Why should Atom support it?

Then there are example XML documents (e.g., for tutorials) that sometimes require that comments be preserved.

If you want people to read the source as is, it would make sense to make it text/plain or put the source snippets in <code> or <pre> in an (X)HTML tutorial.

Finally, if you are posting content associated with an entry with the Atom API, then it is important that any documents uploaded are not modified.

I think Atom processors should not be required to preserve serialization artifacts. That is, IMO, they should not be required to care about anything that is not exposed via SAX2 ContentHandler with qNames ignored.

processing instructions,

These could be supported in embedded content if the Atom spec said PIs
in atom:content belong in content and should not be acted upon by the
Atom processor.

I don't like this condition. Any content will be read by XML processors before being handed to an Atom processor. So the generic XML processor will not be able to distinguish PIs that are significant, and thus should be processed by their processors, from those that should be passed through.

Vanilla XML processors don't act on PIs. They report them to the application--in this case the Atom processor.
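
As a rough illustration, assuming Python's xml.sax module stands in for a vanilla XML processor (the PICollector class and the "render" PI are hypothetical), the parser only reports each PI through a callback. What happens next is the application's decision:

```python
import xml.sax


class PICollector(xml.sax.ContentHandler):
    """Hypothetical handler: records PIs instead of acting on them."""

    def __init__(self):
        super().__init__()
        self.pis = []

    def processingInstruction(self, target, data):
        # The parser just reports the PI; it does not interpret it.
        self.pis.append((target, data))


source = b'<content><?render mode="fast"?><p>Hello</p></content>'
handler = PICollector()
xml.sax.parseString(source, handler)

print(handler.pis)
```

An Atom processor built this way could pass the recorded PIs through with the content rather than executing them.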

entities

Entities can be flattened.

Again, as with comments, I agree in principle, but in practice some processors
depend on them.

Depending on an entity reference and not being able to accept the straight replacement text is just wrong.

DOCTYPE declaration(s)

If DOCTYPE is essential for the receiving app, you've got bigger
problems than Atom. Hardwiring IDness based on namespaces is more
practical than relying on DTD-based data typing.

I'm not necessarily talking about external document type definitions.
Preventing DOCTYPEs would cause incompatibilities with XML well-formedness
processing.  From the spec:

"[Definition: While they are not required to check the document for validity, they are REQUIRED to process all the declarations they read in the internal DTD subset and in any parameter entity that they read, up to the first reference to a parameter entity that they do not read; that is to say, they MUST use the information in those declarations to normalize attribute values, include the replacement text of internal entities, and supply default attribute values.]"

That's a non-issue. You don't just throw away the DOCTYPE but parse the original document with it and reserialize as a DOCTYPEless fragment. You don't lose well-formedness or content. You only lose the shallow attribute data typing provided by the DTD.
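
A sketch of that parse-then-reserialize step, assuming Python's xml.etree.ElementTree (the document and its declarations are invented for the example). The internal subset is processed during parsing, so the entity is already flattened and the default attribute already supplied by the time the fragment is written out without a DOCTYPE:

```python
import xml.etree.ElementTree as ET

# Hypothetical document with an internal DTD subset.
source = '''<!DOCTYPE note [
  <!ENTITY co "Example Corp">
  <!ATTLIST note kind CDATA "memo">
]>
<note>From &co;</note>'''

# Parsing applies the declarations: the entity reference is replaced
# and the defaulted attribute shows up as an ordinary attribute.
root = ET.fromstring(source)

# Serializing the element yields a fragment with no DOCTYPE.
fragment = ET.tostring(root, encoding="unicode")
print(fragment)
```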

Some RDF/XML processors already use internal DOCTYPE declarations to make the
content more readable.

Atom's main purpose is to facilitate software-to-software communication. When interop or implementation ease and readability conflict, readability should yield.

There is nothing intrinsically bad about DOCTYPE sections

Yes, there is. On the Web you cannot trust that the recipient is using an XML processor that processes the DOCTYPE beyond checking the internal subset for well-formedness. An optional feature that does not degrade gracefully when not supported is bad. DOCTYPE is such a feature. In the cases where the DOCTYPE can be gracefully ignored, the DOCTYPE is pointless.

Remember that anything compatible with XML MUST allow its optional features to be in the markup (even if they are just ignored). So if you disallow DOCTYPE
sections, then you can't claim to support XML.

Atom is not claiming that you could embed the literal source of any XML doc. You can embed the stuff that canonicalization would preserve.
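
To make "what canonicalization would preserve" concrete, here is a small sketch assuming Python 3.8+ and its xml.etree.ElementTree.canonicalize function (the two sample strings are hypothetical). With default options it discards comments and normalizes attribute order, quoting, and the white space between attributes, so two different serializations of the same content compare equal:

```python
import xml.etree.ElementTree as ET  # canonicalize() needs Python 3.8+

# Two hypothetical serializations of the same content.
a = '<entry  b="1"   a="2"><!-- note --><title>Hi</title></entry>'
b = "<entry a='2' b='1'><title>Hi</title></entry>"

# Default options: comments are omitted, attributes are sorted.
canon_a = ET.canonicalize(a)
canon_b = ET.canonicalize(b)

print(canon_a)
print(canon_a == canon_b)
```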

Anything that could have any chance of data loss - in this case, the loss of XML
comments, PIs, and XML and DOCTYPE declarations - is bad.

Would you consider changes in the order of attributes and the white space between attributes data loss as well?

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/
