On 05/10/2006, at 1:08 AM, James M Snell wrote:

1. Complete list of authors and categories defined

With permissions, metadata, etc.

Metadata like URL and email, sure.

I'm not so sure about permissions. Do we do groups then? And do we try to represent permissions for actions that are not directly related to authoring content (e.g. changing the theme)? There is huge potential for scope creep here.

My intent with an Atom-based export file format is to facilitate migration from one blogging engine (or similar CMS) to another. But implementations will differ in the functions they offer and in the way they offer them, so I think it is unreasonable to expect that a single file format would allow seamless migration in all cases. Such a format would have to be a union of the data models of all current and future blogging engines, and hence would be doomed to unimplementable complexity.

So there is a compromise to be made here: on the one hand we want the exported file to contain as much important data as possible, but on the other hand the format needs to be simple enough to implement across multiple blogging engines.

I think it's a worthy goal to minimize, rather than eliminate, the amount of manual reworking required when migrating from one platform to another. In other words it is not a goal for this format to be suitable for backup/restore.

All this is a roundabout way of saying that each blogging engine is likely to have a unique permissions model, and that it is not a big win to attempt to represent this in the export file format.

I would add to this information about what plugins have been applied and
what templates have been used.  These, of course, are not going to be
portable to different blog environments but the information would be
necessary in order to faithfully recreate the entries later.

Agree, but there is a minefield of complexity here.

I currently use Typo as a blogging engine and it supports macros and filters in addition to a set of basic markup languages. Getting the right combination of macros and filters is important to producing correct HTML output. So it would seem pretty important to include this information in the export file.

But on the other hand, the macros and filters are unlikely to be supported on other platforms, so the safest course of action is probably to expand them when the content is exported. Some information is lost this way, and an export/import operation is no longer semantically neutral, but the alternative is potentially worse.
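
To make the "expand on export" idea concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the {{name}} macro syntax is made up (it is not Typo's real syntax), and expand_macros and export_content are hypothetical names, not part of any existing engine.

```python
import re
from typing import Callable, Dict

# Hypothetical macro syntax, e.g. "{{signature}}". Not any real engine's syntax.
MACRO = re.compile(r"\{\{(\w+)\}\}")


def expand_macros(source: str, macros: Dict[str, Callable[[], str]]) -> str:
    """Replace each known macro with its output; leave unknown ones as-is."""

    def replace(match: "re.Match[str]") -> str:
        handler = macros.get(match.group(1))
        # Unknown macros are kept verbatim so nothing is silently dropped.
        return handler() if handler else match.group(0)

    return MACRO.sub(replace, source)


def export_content(
    source: str,
    macros: Dict[str, Callable[[], str]],
    render: Callable[[str], str],
) -> str:
    """Expand macros first, then run the markup filter (e.g. Textile to HTML)."""
    return render(expand_macros(source, macros))
```

The importing engine then only ever sees plain markup or HTML. Leaving unknown macros untouched is a design choice; an exporter could just as well flag them for manual rework.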

The tricky bit is defining what is meant by "owned" media.

I would say that it's any media linked to by an entry located on the
same host as the entry, with the exporter given some discretion as to
what to include and what not to include.

Agree. In practice the decision is likely to be based on whether the exporter can find the linked media file on a local filesystem.

Each image included probably needs to be accompanied by the (relative?) URL at which it was originally published.
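
As a sketch of how an exporter might make that decision, assuming it knows the document root of the web server: the function below treats a link as "owned" only if it resolves to the same host as the entry and to an existing file under the document root. The name find_owned_media and the docroot parameter are hypothetical, and percent-encoded paths are not handled.

```python
from pathlib import Path
from typing import Optional, Tuple
from urllib.parse import urljoin, urlparse


def find_owned_media(
    entry_url: str, media_href: str, docroot: Path
) -> Optional[Tuple[str, Path]]:
    """Return (host-relative URL, local file) if the media is "owned", else None."""
    absolute = urljoin(entry_url, media_href)
    entry, media = urlparse(entry_url), urlparse(absolute)

    # Only media published on the same host as the entry can be owned.
    if media.netloc != entry.netloc:
        return None

    root = docroot.resolve()
    candidate = (root / media.path.lstrip("/")).resolve()

    # Refuse paths that escape the document root (e.g. via "..").
    if root not in candidate.parents:
        return None

    # In practice: owned means the exporter can find the file locally.
    if not candidate.is_file():
        return None

    return media.path, candidate
```

The returned URL is relative to the host, which is what the importer would need in order to republish the file at its original location.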

So the list is now:

1. Complete list of authors defined. For each author:
        a. Name
        b. URI
        c. email
2. Complete list of categories defined:
        a. Name
        b. URI
3. All articles. For each article:
        a. Source text
        b. All the relevant metadata from the Atom spec, namely:
                author, ID, published, rights, title, updated, summary, categories
        c. Some other metadata:
                draft status, syntax of source
4. All comments and trackbacks. For each comment or trackback:
        a. Source text
        b. Atom spec metadata:
                author, ID, title, published, summary, avatar?
        c. Additional metadata:
                pointer to parent article or comment (i.e. "in-reply-to")
5. All "Owned" media. For each media object:
        a. URI
        b. MIME type
        c. Binary data

Does this look about right? Obviously there would need to be a liberal sprinkling of extension points for proprietary information.
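
For discussion, the list above could be written down as a data model roughly like this. It is only a sketch in Python, not a proposed serialization: all class and field names are my own assumptions, the "extensions" dicts stand in for whatever extension mechanism we settle on, and the optional fields reflect the question marks in the list.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class Author:
    name: str
    uri: Optional[str] = None
    email: Optional[str] = None


@dataclass
class Category:
    name: str
    uri: Optional[str] = None


@dataclass
class Article:
    id: str
    title: str
    source: str            # source text as written, not rendered HTML
    source_syntax: str     # e.g. "textile", "markdown", "html"
    author: Author
    published: datetime
    updated: datetime
    draft: bool = False
    rights: Optional[str] = None
    summary: Optional[str] = None
    categories: List[Category] = field(default_factory=list)
    extensions: Dict[str, str] = field(default_factory=dict)


@dataclass
class Response:
    """A comment or trackback."""
    id: str
    source: str
    author: Author
    published: datetime
    in_reply_to: str       # ID of the parent article or comment
    title: Optional[str] = None
    summary: Optional[str] = None
    avatar: Optional[str] = None
    extensions: Dict[str, str] = field(default_factory=dict)


@dataclass
class Media:
    uri: str               # URL at which the object was originally published
    mime_type: str
    data: bytes


@dataclass
class Export:
    authors: List[Author] = field(default_factory=list)
    categories: List[Category] = field(default_factory=list)
    articles: List[Article] = field(default_factory=list)
    responses: List[Response] = field(default_factory=list)
    media: List[Media] = field(default_factory=list)


def replies_to(export: Export, parent_id: str) -> List[Response]:
    """Direct replies to an article or comment, found via in_reply_to."""
    return [r for r in export.responses if r.in_reply_to == parent_id]
```

Keeping comments in one flat list with an "in-reply-to" pointer, rather than nesting them under articles, lets threaded and unthreaded engines share the same structure.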
