So I got distracted for a bit. Here's a recap of what we were talking about.
### What is it?
Atom Export is a proposal for a standard/convention/something for the
export of content from a CMS (think blogging engine) in a format that
is similar to other Atom standards. The assumption here is that the
content being exported is a good fit for Atom. I believe this to be
the case (hence discussing it on this list) but it is not established
yet.
More rationale, and an initial attempt, here:
http://girtby.net/articles/2006/08/14/towards-a-common-blog-export-format
### Where did we get up to?
We have a list of requirements for the export format, specifically
the data from a typical blogging engine that should be included in
the export. We had also identified some preliminary issues with
representing the content in an Atom-based document format. Lastly, we
had clarified the scope, namely that the primary use case should be
for migrating from one content engine to another, and not necessarily
for backup purposes within a single application.
### What data is to be represented by this?
Here is a list of data to be included in the export. For convenience
a blog engine is assumed, although other compatible CMSs may
implement this as well:
1. Complete list of authors defined. For each author:
    1. Name
    2. URI
    3. Email
2. Complete list of categories defined. For each category:
    1. Name
    2. URI
3. All articles. For each article:
    1. Source text
    2. All the relevant metadata from the Atom spec, namely:
       author, ID, published, rights, title, updated, summary, categories
    3. Some other metadata: draft status, syntax of source
4. All comments and trackbacks. For each comment or trackback:
    1. Source text
    2. Atom spec metadata: author, ID, title, published,
       summary, avatar?
    3. Additional metadata: pointer to parent article or
       comment (i.e. "in-reply-to")
5. All "Owned" media. For each media object:
    1. URI
    2. MIME type
    3. Binary data
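
To make the list above a bit more concrete, here is a rough sketch of it as a Python data model. All class and field names are my own invention for illustration, not part of any proposal, and the `avatar_uri` field is kept optional because that item is still a question mark. The small `orphaned_comments` helper is likewise hypothetical; it just shows the kind of consistency check an importer might run on the "in-reply-to" pointers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Author:
    name: str
    uri: Optional[str] = None
    email: Optional[str] = None


@dataclass
class Category:
    name: str
    uri: Optional[str] = None


@dataclass
class Article:
    id: str
    title: str
    source_text: str
    source_syntax: str              # e.g. "markdown"; the value set is an assumption
    author: Author
    published: str                  # timestamps kept as strings, as in an Atom document
    updated: str
    draft: bool = False
    rights: Optional[str] = None
    summary: Optional[str] = None
    categories: List[Category] = field(default_factory=list)


@dataclass
class Comment:
    id: str
    source_text: str
    author: Author
    published: str
    in_reply_to: str                # ID of the parent article or comment
    title: Optional[str] = None
    summary: Optional[str] = None
    avatar_uri: Optional[str] = None  # listed as "avatar?" above, so optional here
    is_trackback: bool = False


@dataclass
class MediaObject:
    uri: str
    mime_type: str
    data: bytes


@dataclass
class BlogExport:
    authors: List[Author] = field(default_factory=list)
    categories: List[Category] = field(default_factory=list)
    articles: List[Article] = field(default_factory=list)
    comments: List[Comment] = field(default_factory=list)
    media: List[MediaObject] = field(default_factory=list)


def orphaned_comments(export: BlogExport) -> List[Comment]:
    """Return comments whose in_reply_to matches no article or comment ID."""
    known = {a.id for a in export.articles} | {c.id for c in export.comments}
    return [c for c in export.comments if c.in_reply_to not in known]
```

An importer could call `orphaned_comments(export)` before writing anything to the target engine and refuse, or warn, if the result is non-empty.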
### What are the issues identified so far?
Briefly:
* It was felt that the inclusion of binary media objects warrants
the use of a binary archive format. The issue of how to organise the
content within the archive file is unresolved.
* There was some discussion about the inclusion of generated text,
specifically HTML generated from the source text. It is redundant but
may be useful.
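
Purely as an illustration of the first issue, here is one possible archive layout, sketched in Python. Everything about it is an assumption on my part: the choice of ZIP as the archive format, the `export.atom` file name, and the `media/` directory are all hypothetical, and the organisation question remains open.

```python
import io
import zipfile
from typing import Dict, Tuple

FEED_NAME = "export.atom"   # hypothetical name for the Atom document
MEDIA_DIR = "media/"        # hypothetical directory for owned media objects


def write_export_archive(feed_xml: bytes, media: Dict[str, bytes]) -> bytes:
    """Pack the Atom document and the media objects into a single ZIP blob."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(FEED_NAME, feed_xml)
        for name, data in media.items():
            zf.writestr(MEDIA_DIR + name, data)
    return buf.getvalue()


def read_export_archive(blob: bytes) -> Tuple[bytes, Dict[str, bytes]]:
    """Unpack a blob produced by write_export_archive."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        feed_xml = zf.read(FEED_NAME)
        media = {
            name[len(MEDIA_DIR):]: zf.read(name)
            for name in zf.namelist()
            if name.startswith(MEDIA_DIR)
        }
    return feed_xml, media
```

With a layout like this, entries in the Atom document could refer to media by a relative path inside the archive, though how such references should be written is exactly the unresolved part.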
### What now?
I think the best way to proceed is to propose a fresh mapping of the
above data requirements to Atom-derived constructs. I already had a
go at this previously, but I think it might be better to start
afresh. This time I'd like to try an approach based around the
concepts introduced in the Atom Publishing Protocol (APP), particularly
that of collections, which look like a good fit for items 3, 4, and 5 above.
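
To give a feel for what a collection-based approach might look like, here is a minimal sketch that builds an APP-style service document with one collection each for articles, comments, and media. Treat it as a guess at the shape, not a proposal: the collection titles and `href` values are hypothetical, and the `app` namespace URI is the one from the protocol as eventually published in RFC 5023, which may differ from the drafts current at the time of writing.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple

APP_NS = "http://www.w3.org/2007/app"    # namespace from RFC 5023 (see note above)
ATOM_NS = "http://www.w3.org/2005/Atom"


def build_service_document(blog_title: str,
                           collections: Iterable[Tuple[str, str]]) -> str:
    """Return a service document with one workspace.

    `collections` is an iterable of (title, href) pairs; the hrefs used in
    the example below are made-up relative paths.
    """
    ET.register_namespace("app", APP_NS)
    ET.register_namespace("atom", ATOM_NS)

    service = ET.Element(f"{{{APP_NS}}}service")
    workspace = ET.SubElement(service, f"{{{APP_NS}}}workspace")
    ET.SubElement(workspace, f"{{{ATOM_NS}}}title").text = blog_title

    for title, href in collections:
        coll = ET.SubElement(workspace, f"{{{APP_NS}}}collection", {"href": href})
        ET.SubElement(coll, f"{{{ATOM_NS}}}title").text = title

    return ET.tostring(service, encoding="unicode")


if __name__ == "__main__":
    print(build_service_document(
        "Example Blog",
        [("Articles", "articles/"), ("Comments", "comments/"), ("Media", "media/")],
    ))
```

Each collection would then be an Atom feed in its own right, so items 3, 4, and 5 each get their own document, and the service document acts as the table of contents for the export.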