Re: Feed Thread in Last Call

David Powell Sat, 20 May 2006 09:04:28 -0700


Friday, May 19, 2006, 1:40:43 AM, Lisa Dusseault wrote:


> I've been trying to understand if there's a technical problem with
> the draft's chosen placement of the attributes and the best case
> I've seen is that "that location is technically disallowed by
> RFC4287" , an assertion which is disputed (alas, natural language
> meanings are often disputed).

Hmm, that wasn't my argument either... I fully agree that the
attributes are technically allowed by both the text and RelaxNG of
RFC4287. I don't think anyone is disputing that.

Sorry for repeating myself again, but I'll recap my argument in full
because it seems like it hasn't been understood.  A bit of background:


Modelling
=========

Software that deals with XML, such as an XHTML document, doesn't have
much choice but to model the document using generic XML concepts and
tools - Infosets, DOM, SAX, strings containing XML tags, etc.

For Atom though, it is useful to model feeds and entries in terms of
some other data model: OO, RDBMS, WebDAV (I've been doing it as RDF,
but that is a dirty word around these parts). Some of the reasons for
this are:

  Most Atom implementations are concerned with the combined state of a
  feed over time, not the state of an individual feed document, so
  pure XML access is inadequate.

  A dedicated data model allows more robust and efficient
  implementations, particularly in the case of RDBMS.

  Implementations may be built upon existing systems, such as existing
  content management systems, where a mapping between Atom the XML
  syntax, and the entities in an existing system would need to be
  established.

  Even the best XML APIs are horrible to use when compared to domain
  specific APIs.


Extensions
==========

Atom standardised a minimal set of elements, with the expectation that
any other elements would be created as extensions. It is therefore
important that Atom infrastructure shouldn't get in the way of the
deployment of extensions. Atom uses mustIgnore semantics for
extensions, which allows implementations to copy-through extensions
even if they don't understand their purpose.
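
As a rough illustration of that copy-through behaviour, here is a
minimal Python sketch. Everything in it is an assumption made for the
example: the split_entry helper, the KNOWN set, and the ex:rating
extension in the http://example.org/ext namespace are hypothetical,
and any child not in KNOWN is simply treated as an extension.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

# Core elements this hypothetical store understands and models as fields.
KNOWN = {f"{{{ATOM}}}id", f"{{{ATOM}}}title", f"{{{ATOM}}}updated"}


def split_entry(entry_xml):
    """Return (fields, extensions) for one atom:entry document."""
    entry = ET.fromstring(entry_xml)
    fields, extensions = {}, []
    for child in entry:
        if child.tag in KNOWN:
            # Strip the "{namespace}" prefix to get the local name.
            fields[child.tag.split("}")[1]] = child.text
        else:
            # mustIgnore: not understood, but kept whole for copy-through.
            extensions.append(ET.tostring(child, encoding="unicode"))
    return fields, extensions


# Hypothetical entry carrying one extension element (ex:rating).
SAMPLE = (
    '<entry xmlns="http://www.w3.org/2005/Atom" '
    'xmlns:ex="http://example.org/ext">'
    "<id>urn:example:1</id><title>Hello</title>"
    "<updated>2006-05-20T09:04:28Z</updated>"
    "<ex:rating>5</ex:rating>"
    "</entry>"
)

fields, extensions = split_entry(SAMPLE)
```

The store never interprets ex:rating; it only keeps the serialised
element so that it can be emitted again alongside the modelled fields.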

Section 6 of RFC4287 is flawed. It is an ugly mix of my (overly)
strict PaceExtensionConstruct proposal[1], and an (overly) liberal
philosophy that the existence of foreign markup anywhere won't break
implementations, so shouldn't be disallowed.

[1] http://www.intertwingly.net/wiki/pie/PaceExtensionConstruct?action=recall&date=1105566248

I complained about it here[2], admittedly a long time after IETF last
call, but as we were still making Atom 101 changes about then (eg:
let's allow multiple authors), maybe that wasn't so unreasonable. It
doesn't make any sense to define two very precise classes of extension
element, and then say that any extra markup can go anywhere, without
giving any justification or explanation of why we did this, or any
guidelines on why extension authors should pick each of these options.

[2] http://www.imc.org/atom-syntax/mail-archive/msg15915.html

Unfortunately Simple Extension elements are too unconstrained to fulfil
their original objective, so the difference between them and
Structured Extensions is academic. The intent was that Simple
Extensions were a class of extension whose values were context-free
strings that would be easy to implement, and easy to provide UIs for,
and extension proposers might be encouraged to choose this class of
extension where it was appropriate for improved interoperability. I
don't think that this worked out.

The difference between extension elements, and foreign attributes is
significant however.

atompub's charter states:

> Atom consists of:
>     * A conceptual model of a resource
>     * A concrete syntax for this model

Extension elements are defined to have both a model and a syntax, but
Atom's allowance for foreign attributes to appear anywhere is a case
of syntax that has no corresponding model. Atom doesn't really explain
what foreign attributes are intended for. It seems like they could be
an extension point, but given that many implementations will have an
application model that isn't based on the XML Infoset (as described
above), it seems very unwise to create an extension proposal which
depends on the precise syntax of an element being preserved.
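
To make the concern concrete, here is a small Python sketch of an
implementation whose model is not the Infoset. It is only an
illustration under stated assumptions: the Link class, load_link,
dump_link, and the ex:count attribute in the http://example.org/ext
namespace are all hypothetical and not taken from any draft.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

ATOM = "http://www.w3.org/2005/Atom"
EX = "http://example.org/ext"  # hypothetical extension namespace


@dataclass
class Link:
    # Domain model of atom:link holding only attributes the model knows.
    href: str
    rel: str = "alternate"


def load_link(xml_text):
    """Map an atom:link element onto the domain model."""
    el = ET.fromstring(xml_text)
    # Foreign attributes have no slot in the model, so they are not read.
    return Link(href=el.get("href"), rel=el.get("rel", "alternate"))


def dump_link(link):
    """Serialise the domain model back to an atom:link element."""
    el = ET.Element(f"{{{ATOM}}}link", {"href": link.href, "rel": link.rel})
    return ET.tostring(el, encoding="unicode")


# Hypothetical input: a link carrying a foreign attribute, ex:count.
SOURCE = (
    f'<link xmlns="{ATOM}" xmlns:ex="{EX}" '
    'rel="related" href="http://example.org/comments" ex:count="10"/>'
)

round_tripped = dump_link(load_link(SOURCE))
```

The href and rel values survive the round trip, but ex:count does not,
because the model has nowhere to put it. An extension element could
have been carried as an opaque child; an attribute on a core element
can only survive if the element's exact syntax is preserved.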

The intent of Simple Extensions was to provide a class of extension
that was more interoperable; foreign attributes appear to provide a
class of extension (if that is what it is) that will be much less
interoperable.

Some guidance in how to design extensions is definitely missing from
the RFC; perhaps an Informational RFC explaining the issues would be
appropriate.


Relevance of intermediaries
===========================

Most of these issues only apply to Atom intermediaries: agents which
must accept Atom documents, and represent the documents internally
without losing information.  It doesn't matter if a desktop aggregator
drops extensions if it has no UI for displaying them anyway.  But I
think that the class of Atom intermediaries will become larger, and
more important:

  All implementations of Atom Publishing Protocol are Atom
  intermediaries.

  Value-added feed services, such as FeedBurner.

  Microsoft's Feed Platform

  As soon as a desktop aggregator allows plugins for display and
  processing of extensions, it becomes an intermediary.

  
Behaviour of intermediaries
===========================

Atom makes no attempt to standardise the behaviour of intermediaries.
A feed store can strip extensions, contributors, or everything but
mandatory elements; it can even mutate these core elements - it is
just a "quality of implementation" issue. For blogging applications
this is acceptable; for web-service-like implementations, perhaps
some conformance levels for the preservation of markup might be useful
in future.

The lack of standardisation is not necessarily a bad thing:
implementations are free to implement what is appropriate to their
requirements - if implementations were required to preserve everything
perfectly it would massively raise the cost of integrating Atom with
existing systems.

Publishers (and especially the proposers of extensions), need to be
mindful of the varying support of implementations:

As an extension proposal makes greater demands on software, the
chances of information loss and interop problems increase. Consider
a scenario such as using an off-line blog editor, talking to an Atom
Protocol Server, providing a feed via FeedBurner, to an application
running on top of Microsoft's Feed Engine. With so many
unstandardised components in the chain, it becomes important to be
conservative.



Anyway the conclusions I take from all of this are:

Foreign attributes are bad, and are inherently less interoperable
than Extension Elements.

Interoperability should take priority over concerns that 'approach X
looks better than Y', and other unjustifiable minor concerns.

It is a bad precedent for the first IETF approved extension to rely on
such a fragile part of RFC4287.

Some sort of BCP for extension proposers would be useful to explain
the issues. Perhaps foreign attributes could be clarified as being a
3rd class of extension and reincorporated into the Atom model, with
the disclaimer that they are less interoperable than Simple &
Structured Extensions?

  
-- 
Dave
