Friday, May 19, 2006, 1:40:43 AM, Lisa Dusseault wrote:
> I've been trying to understand if there's a technical problem with
> the draft's chosen placement of the attributes and the best case
> I've seen is that "that location is technically disallowed by
> RFC4287" , an assertion which is disputed (alas, natural language
> meanings are often disputed).

Hmm, that wasn't my argument either... I fully agree that the
attributes are technically allowed by both the text and RelaxNG of
RFC4287. I don't think anyone is disputing that.

Sorry for repeating myself again, but I'll recap my argument in full
because it seems like it hasn't been understood. A bit of background:


Modelling
=========

Software that deals with XML, such as an XHTML document, doesn't have
much choice but to model the document using generic XML concepts and
tools - Infosets, DOM, SAX, strings containing XML tags, etc.

For Atom though, it is useful to model feeds and entries in terms of
some other data model: OO, RDBMS, WebDAV (I've been doing it as RDF,
but that is a dirty word around these parts). Some of the reasons for
this are:

* Most Atom implementations are concerned with the combined state of
  a feed over time, not the state of an individual feed document, so
  pure XML access is inadequate.

* Implementations can be more robust and efficient, particularly in
  the case of an RDBMS.

* Implementations may be built upon existing systems, such as
  existing content management systems, where a mapping between Atom's
  XML syntax and the entities in the existing system would need to be
  established.

* Even the best XML APIs are horrible to use when compared to
  domain-specific APIs.


Extensions
==========

Atom standardised a minimal set of elements, with the expectation
that any other elements would be created as extensions. It is
therefore important that Atom infrastructure doesn't get in the way
of the deployment of extensions.

Atom uses mustIgnore semantics for extensions, which allows
implementations to copy through extensions even if they don't
understand their purpose.

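
As a rough illustration of the two points above (a domain-specific
model, plus copy-through of extensions that the software doesn't
understand), here is a minimal Python sketch. The Entry class, the
parse_entry and serialise_entry functions, and the
http://example.org/ext namespace are all hypothetical names made up
for this example; they don't come from RFC4287 or from any real
implementation.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

ATOM = "{http://www.w3.org/2005/Atom}"


@dataclass
class Entry:
    """Hypothetical domain model for an Atom entry (illustration only)."""
    id: str
    title: str
    updated: str
    # Child elements from other namespaces, kept verbatim for copy-through.
    extensions: list = field(default_factory=list)


def parse_entry(xml_text: str) -> Entry:
    """Map an atom:entry document onto the domain model."""
    root = ET.fromstring(xml_text)
    # mustIgnore: children we don't understand are not interpreted, only kept.
    extensions = [child for child in root if not child.tag.startswith(ATOM)]
    return Entry(
        id=root.findtext(ATOM + "id", default=""),
        title=root.findtext(ATOM + "title", default=""),
        updated=root.findtext(ATOM + "updated", default=""),
        extensions=extensions,
    )


def serialise_entry(entry: Entry) -> str:
    """Write the model back out as XML, copying the extensions through."""
    root = ET.Element(ATOM + "entry")
    for name in ("id", "title", "updated"):
        ET.SubElement(root, ATOM + name).text = getattr(entry, name)
    root.extend(entry.extensions)
    return ET.tostring(root, encoding="unicode")


SAMPLE = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:ex="http://example.org/ext">
  <id>urn:example:1</id>
  <title ex:flag="yes">Hello</title>
  <updated>2006-05-19T01:40:43Z</updated>
  <ex:rating>5</ex:rating>
</entry>"""

entry = parse_entry(SAMPLE)
round_tripped = ET.fromstring(serialise_entry(entry))
```

In this sketch the unknown ex:rating child element has an obvious
home in the model, so it survives the round trip. The ex:flag
attribute on atom:title does not, because the model stores the title
as a plain string and has nowhere to put it.
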
Section 6 of RFC4287 is flawed. It is an ugly mix of my (overly)
strict PaceExtensionConstruct proposal[1], and an (overly) liberal
philosophy that the existence of foreign markup anywhere won't break
implementations, so shouldn't be disallowed.

[1] http://www.intertwingly.net/wiki/pie/PaceExtensionConstruct?action=recall&date=1105566248

I complained about it here[2], admittedly a long time after IETF last
call, but as we were still making Atom 101 changes about then (e.g.
let's allow multiple authors), maybe that wasn't so unreasonable. It
doesn't make any sense to define two very precise classes of
extension element, and then say that any extra markup can go
anywhere, without giving any justification or explanation of why we
did this, or any guidelines on when extension authors should pick
each of these options.

[2] http://www.imc.org/atom-syntax/mail-archive/msg15915.html

Unfortunately Simple Extension elements are too unconstrained to
fulfil their original objective, so the difference between them and
Structured Extensions is academic. The intent was that Simple
Extensions would be a class of extension whose values were
context-free strings, which would be easy to implement and easy to
provide UIs for, and that extension proposers might be encouraged to
choose this class of extension, where appropriate, for improved
interoperability. I don't think that this worked out.

The difference between extension elements and foreign attributes is
significant, however. atompub's charter states:

> Atom consists of:
> * A conceptual model of a resource
> * A concrete syntax for this model

Extension elements are defined to have both a model and a syntax, but
Atom's allowance for foreign attributes to appear anywhere is a case
of syntax that has no corresponding model.

Atom doesn't really explain what foreign attributes are intended for.
It seems like they could be an extension point, but given that many
implementations will have an application model that isn't based on
the XML Infoset (as described above), it seems very unwise to create
an extension proposal which depends on the precise syntax of an
element being preserved.

The intent of Simple Extensions was to provide a class of extension
that was more interoperable; foreign attributes appear to provide a
class of extension (if that is what it is) that will be much less
interoperable.

Some guidance on how to design extensions is definitely missing from
the RFC; perhaps an Informational RFC explaining the issues would be
appropriate.


Relevance of intermediaries
===========================

Most of these issues only apply to Atom intermediaries: agents which
must accept Atom documents, and represent the documents internally
without losing information. It doesn't matter if a desktop aggregator
drops extensions if it has no UI for displaying them anyway.

But I think that the class of Atom intermediaries will become larger,
and more important:

* All implementations of Atom Publishing Protocol are Atom
  intermediaries.

* Value-added feed services, such as FeedBurner.

* Microsoft's Feed Platform.

* As soon as a desktop aggregator allows plugins for display and
  processing of extensions, it becomes an intermediary.


Behaviour of intermediaries
===========================

Atom makes no attempt to standardise the behaviour of intermediaries.
A feed store can strip extensions, contributors, or everything but
mandatory elements; it can even mutate these core elements - it is
just a "quality of implementation" issue. For blogging applications
this is acceptable; for web-service-like implementations, perhaps
some conformance levels for the preservation of markup might be
useful in future.
The lack of standardisation is not necessarily a bad thing:
implementations are free to implement what is appropriate to their
requirements - if implementations were required to preserve
everything perfectly it would massively raise the cost of integrating
Atom with existing systems.

Publishers (and especially the proposers of extensions) need to be
mindful of the varying support of implementations: as an extension
proposal makes greater requirements on software, the chances of
information loss and interop problems increase. This is especially
so in a scenario such as using an off-line blog editor, talking to an
Atom Protocol Server, providing a feed via FeedBurner, to an
application running on top of Microsoft's Feed Engine. With so many
unstandardised components in use, it becomes important to be
conservative.

Anyway, the conclusions I take from all of this are:

* Foreign attributes are bad, and are inherently less interoperable
  than Extension Elements.

* Interoperability should take priority over concerns that 'approach
  X looks better than Y', and other unjustifiable minor concerns.

* It is a bad precedent for the first IETF approved extension to rely
  on such a fragile part of RFC4287.

* Some sort of BCP for extension proposers would be useful to explain
  the issues.

* Perhaps foreign attributes could be clarified as being a 3rd class
  of extension and reincorporated into the Atom model, with the
  disclaimer that they are less interoperable than Simple &
  Structured Extensions?

--
Dave
