On May 26, 2005, at 06:23, James Cerra wrote:
Henri Sivonen,
If the intended Atom content contains essential comments,
There should be no such thing as essential comments. The XML spec does
not require XML processors to report comments to the app. Hence,
comments are inappropriate for transferring essential data.
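To illustrate the point about comments, here is a minimal sketch, assuming Python's standard xml.etree.ElementTree with its default parser settings. The payload and the "price" comment are hypothetical; the sketch only shows that a conforming processor can hand the application a tree with no trace of the comment.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: the comment carries data the sender treats as essential.
SOURCE = '<div xmlns="http://www.w3.org/1999/xhtml"><!-- price: 42 -->Widget</div>'

# The default parser does not put comments into the tree it builds.
root = ET.fromstring(SOURCE)
reserialized = ET.tostring(root, encoding="unicode")

print("".join(root.itertext()))   # the character data survives
print("price" in reserialized)    # the comment does not
```

A receiver built on this kind of parser never sees the comment, so data placed there is silently lost.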
Yes, but MSIE^H^H^H^Hsome xml processors (cough cough) still inappropriately
use comments for that purpose.
I am not familiar with that. What purpose exactly? Why should Atom
support it?
Then there are example XML documents (e.g. for tutorials) that
sometimes require comments be preserved.
If you want people to read the source as is, it would make sense to
make it text/plain or put the source snippets in <code> or <pre> in an
(X)HTML tutorial.
Finally if you are posting content associated with an entry with the
Atom API, then it is important that any documents uploaded are not
modified.
I think Atom processors should not be required to preserve
serialization artifacts. That is, IMO, they should not be required to
care about anything that is not exposed via SAX2 ContentHandler with
qNames ignored.
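As a rough sketch of what "exposed via SAX2 ContentHandler with qNames ignored" could mean in practice, the following uses Python's xml.sax as a stand-in for SAX2. The EventRecorder class, the content_events helper, and the two sample documents are hypothetical. The documents differ in prefix, attribute order, quoting, white space inside the tag, and a comment, yet they yield the same event stream.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class EventRecorder(ContentHandler):
    """Records namespace-aware content events and ignores qNames."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElementNS(self, name, qname, attrs):
        # name is (namespace URI, local name); the prefixed qname is dropped.
        attributes = sorted(
            attrs.items(), key=lambda item: (item[0][0] or "", item[0][1])
        )
        self.events.append(("start", name, attributes))

    def endElementNS(self, name, qname):
        self.events.append(("end", name))

    def characters(self, content):
        # Parsers may deliver text in chunks, so merge adjacent text events.
        if self.events and self.events[-1][0] == "text":
            self.events[-1] = ("text", self.events[-1][1] + content)
        else:
            self.events.append(("text", content))


def content_events(document: bytes):
    """Return the recorded content events for one document (hypothetical helper)."""
    handler = EventRecorder()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(document))
    return handler.events


DOC_A = b'<x:p xmlns:x="urn:example" b="2" a="1"><!-- note -->hi</x:p>'
DOC_B = b"<p xmlns='urn:example'   a='1'\n   b='2'>hi</p>"

print(content_events(DOC_A) == content_events(DOC_B))
```

Comments go to a separate lexical handler in SAX2, which is why they never show up in this event stream.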
processing instructions,
These could be supported in embedded content if the Atom spec said PIs
in atom:content belong in content and should not be acted upon by the
Atom processor.
I don't like this condition. Any content will be read by XML processors
before being handed to an Atom processor. So the generic XML processor will
not be able to distinguish PIs that are significant - and thus should be
processed by their processors - from those which should be passed through.
Vanilla XML processors don't act on PIs. They report them to the
application--in this case the Atom processor.
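A small sketch of that behaviour, assuming Python's xml.sax (backed by expat) as the vanilla processor. The PICollector class and the "render" PI are hypothetical. The parser does nothing with the PI except hand its target and data to the application.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PICollector(ContentHandler):
    """Application-side handler that just collects reported PIs."""

    def __init__(self):
        super().__init__()
        self.instructions = []

    def processingInstruction(self, target, data):
        # The parser does not interpret the PI; deciding what to do is our job.
        self.instructions.append((target, data))


# Hypothetical content with an embedded processing instruction.
ENTRY = b'<content><?render mode="compact"?><p>text</p></content>'

collector = PICollector()
xml.sax.parseString(ENTRY, collector)
print(collector.instructions)
```

Whether the collected PIs are acted upon or passed through as content is then entirely up to the code that receives them.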
entities
Entities can be flattened.
Again, as with comments, I agree in principle, but in practice some
processors depend on them.
Depending on an entity reference and not being able to accept the
straight replacement text is just wrong.
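A minimal sketch of flattening, assuming Python's xml.etree.ElementTree and a hypothetical internal entity named "corp". After parsing, the document with the entity reference and the document with the literal replacement text are indistinguishable to the application.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: one uses an internal entity, one has it flattened.
WITH_ENTITY = (
    '<!DOCTYPE note [<!ENTITY corp "Example Corp">]>'
    '<note>Sold by &corp;.</note>'
)
FLATTENED = '<note>Sold by Example Corp.</note>'

with_entity = ET.fromstring(WITH_ENTITY)
flattened = ET.fromstring(FLATTENED)

print(with_entity.text == flattened.text)
```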
DOCTYPE declaration(s)
If DOCTYPE is essential for the receiving app, you've got bigger
problems than Atom. Hardwiring IDness based on namespaces is more
practical than relying on DTD-based data typing.
I'm not necessarily talking about external document type definitions.
Preventing DOCTYPEs would cause incompatibilities with XML well-formedness
processing. From the spec:
"[Definition: While they are not required to check the document for
validity, they are REQUIRED to process all the declarations they read in the
internal DTD subset and in any parameter entity that they read, up to the
first reference to a parameter entity that they do not read; that is to say,
they MUST use the information in those declarations to normalize attribute
values, include the replacement text of internal entities, and supply
default attribute values.]"
That's a non-issue. You don't just throw away the DOCTYPE but parse the
original document with it and reserialize as a DOCTYPEless fragment.
You don't lose well-formedness or content. You only lose the shallow
attribute data typing provided by the DTD.
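The parse-then-reserialize step could look roughly like this, assuming Python's xml.etree.ElementTree on top of expat, which reads the internal subset. The "item" element and its "status" default are hypothetical. The default attribute declared in the internal subset ends up as an ordinary attribute in the DOCTYPEless output.

```python
import xml.etree.ElementTree as ET

# Hypothetical document whose internal subset supplies a default attribute.
ORIGINAL = (
    '<!DOCTYPE item [\n'
    '  <!ATTLIST item status CDATA "draft">\n'
    ']>\n'
    '<item>text</item>'
)

# Parse with the DOCTYPE in place, then serialize only the element tree.
root = ET.fromstring(ORIGINAL)
fragment = ET.tostring(root, encoding="unicode")

print(fragment)
print("<!DOCTYPE" in fragment)
```

The declared type of the attribute (CDATA, ID, and so on) is what gets lost; the value itself travels with the fragment.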
Some RDF/XML processors already emit internal DOCTYPE declarations to make
the content more readable.
Atom's main purpose is to facilitate software-to-software
communication. When interop or implementation ease and readability
conflict, readability should yield.
There is nothing intrinsically bad about DOCTYPE sections
Yes, there is. On the Web you cannot trust that the recipient is using
an XML processor that processes the DOCTYPE beyond checking the
internal subset for well-formedness. An optional feature that does not
degrade gracefully when not supported is bad. DOCTYPE is such a
feature. In the cases where the DOCTYPE can be gracefully ignored, the
DOCTYPE is pointless.
Remember that anything compatible with XML MUST allow its optional features
to be in the markup (even if they are just ignored). So if you disallow
DOCTYPE sections, then you can't claim to support XML.
Atom is not claiming that you could embed the literal source of any XML
doc. You can embed the stuff that canonicalization would preserve.
Anything that could have any chance of dataloss - in this case, the loss of
XML comments, PIs, and XML and DOCTYPE declarations - is bad.
Would you consider changes in the order of attributes and the white
space between attributes data loss as well?
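For a concrete view of that question, here is a sketch assuming Python 3.8+ and its xml.etree.ElementTree.canonicalize function (C14N 2.0). The two sample documents are hypothetical and differ only in attribute order, quoting, and white space between attributes.

```python
from xml.etree.ElementTree import canonicalize  # available from Python 3.8

# Hypothetical documents: same content, different serialization artifacts.
DOC_A = '<entry b="2" a="1">text</entry>'
DOC_B = "<entry   a='1'\n       b='2'>text</entry>"

canonical_a = canonicalize(DOC_A)
canonical_b = canonicalize(DOC_B)

print(canonical_a == canonical_b)
```

Canonicalization treats these differences as noise, which is the sense in which they are not content.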
--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/