Yesterday, I wrote:
> However, Atom is not only a feed syntax, it's also a publishing/editing
> protocol. This means publishing software implementing the Atom protocol
> will have to deal with this escaped-HTML too.
> And even if most publishing software treats HTML "just" as some "opaque
> text" that's eaten and regurgitated "as is", some weblogs aim to produce
> XHTML (b2evolution or WordPress), thus they will need to deal with
> escaped-HTML and transform it to XHTML.

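To make the difference concrete, here is a minimal sketch in Python (standard library only) of what the receiving software has to do in each case. It assumes the Atom namespace and the lowercase type values ("xhtml", "html") as they ended up in RFC 4287, with XHTML content wrapped in a single div; the sample entries and the names TagCollector, xhtml_tags and escaped_html_tags are made up for illustration. Listing the tag names stands in for whatever real transformation or validation the software would perform.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Assumption: namespace and type values as in RFC 4287.
ATOM_NS = "http://www.w3.org/2005/Atom"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample entries carrying the same content in both forms.
XHTML_ENTRY = f"""<entry xmlns="{ATOM_NS}">
  <content type="xhtml"><div xmlns="{XHTML_NS}"><p>Hello <em>world</em></p></div></content>
</entry>"""

ESCAPED_ENTRY = f"""<entry xmlns="{ATOM_NS}">
  <content type="html">&lt;p&gt;Hello &lt;em&gt;world&lt;/em&gt;&lt;/p&gt;</content>
</entry>"""


class TagCollector(HTMLParser):
    """Collects start-tag names; a stand-in for a real HTML-to-XHTML step."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def xhtml_tags(entry_xml):
    # One XML parse is enough: the markup is already part of the tree.
    content = ET.fromstring(entry_xml).find(f"{{{ATOM_NS}}}content")
    div = content.find(f"{{{XHTML_NS}}}div")
    return [el.tag.split("}")[1] for el in div.iter() if el is not div]


def escaped_html_tags(entry_xml):
    # First parse: the XML parser only hands back an opaque string.
    content = ET.fromstring(entry_xml).find(f"{{{ATOM_NS}}}content")
    # Second parse: that string has to go through an HTML parser.
    collector = TagCollector()
    collector.feed(content.text)
    collector.close()
    return collector.tags
```

Both functions see the same p and em elements, but the escaped-HTML path needs a second, HTML-specific parser on top of the XML one, and with real-world tag soup that second step is where the trouble starts.
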
I forgot to talk about publishing software that aims to produce not only HTML/RSS/Atom but also a PDF or RTF version, or something else. Such software will also need to parse escaped-HTML, while it would have been so much easier to just use XHTML (as easy as applying an XSLT stylesheet and rendering through Apache FOP, for example).

There are many other cases where escaped-HTML will be hard to deal with, or at least harder than processing an XHTML document fragment, just because the receiving software doesn't treat the content of Text Constructs and atom:content as markup but only as "opaque text". And this is NOT interoperable!

One thing we could do is require Atom to contain plain text or XHTML only when using the Atom protocol, whatever the receiving software does with it: whether it just takes the "innerXml" (Microsoft's way) of the Text Constructs and atom:content elements and treats it as "opaque text", or parses the XHTML to transform or validate it in any way. Escaped-HTML would be preserved in the Atom syntax only for "backwards" compatibility with RSS, thus only when "publishing" a feed.

Or we could state that a user agent is not required to support type=HTML to be Atom-conformant. We would then need a way for user agents to tell "servers" (whether an Atom composing/publishing tool or a server publishing a feed) whether or not they support type=HTML, and that would add a bit of complexity.

Or we would have to state that whenever escaped-HTML is provided, an alternate XHTML or plain text version is required (except for atom:content, which can be replaced by the atom:link when rendering an entry).

Or we could just nuke type=HTML and do it the RSS 0.90 way: allow escaped-HTML in a type=TEXT, but without requiring user agents to support it (not even to strip the tags).

Or...

--
Thomas Broyer
