Alastair Rankine wrote:
> [snip]
> Well here's my list:
> 
> 1. Complete list of authors and categories defined

With permissions, metadata, etc.

> 2. For each article:
>     a. Source text
>     b. All the relevant metadata from the Atom spec, namely:
>         author, ID, published, rights, title, updated, summary, categories
>     c. Some other metadata:
>         draft status, syntax of source
>     d. "Owned" media, whether linked to in the source text or enclosure
> 3. For each comment or trackback:
>     a. Source text
>     b. Atom spec metadata:
>         author, ID, title, published, summary, avatar?
>     c. Additional metadata:
>         pointer to parent article or comment (ie "in-reply-to")
> 

I would add to this list a record of which plugins have been applied and
which templates have been used.  These, of course, are not going to be
portable to different blog environments, but the information would be
necessary in order to faithfully recreate the entries later.
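
As a rough illustration, the list above plus the plugin and template
information could be modelled along these lines.  This is only a sketch:
the class and field names (Article, Comment, plugins_applied, template,
and so on) are made up for this example and do not come from the Atom
spec or from any existing export format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Article:
    # Atom-derived metadata (field names are illustrative assumptions)
    id: str
    title: str
    author: str
    published: str
    updated: str
    source_text: str
    summary: Optional[str] = None
    rights: Optional[str] = None
    categories: List[str] = field(default_factory=list)
    # Other metadata from the list: draft status and syntax of the source
    draft: bool = False
    source_syntax: str = "html"
    # "Owned" media, referenced from the source text or as an enclosure
    owned_media: List[str] = field(default_factory=list)
    # Not portable between blog environments, but recorded so the entry
    # can be faithfully recreated later
    plugins_applied: List[str] = field(default_factory=list)
    template: Optional[str] = None


@dataclass
class Comment:
    id: str
    author: str
    published: str
    source_text: str
    title: Optional[str] = None
    summary: Optional[str] = None
    # Pointer to the parent article or comment ("in-reply-to")
    in_reply_to: Optional[str] = None
```

An importer that does not understand a given plugin or template name
could simply ignore those fields; they are informational only.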

> The tricky bit is defining what is meant by "owned" media.
> 
> If we assume that an input to this process is a URL, I would say that
> "owned" media is any referenced media which resolves to the same host.
> This would preclude separately-hosted media (eg "images.example.com")
> but I don't see that this can be handled easily: how would an importer
> handle media destined for more than one host?
> 

I would say that it's any media that an entry links to and that is
located on the same host as the entry, with the exporter given some
discretion as to what to include and what not to include.
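
A minimal sketch of that rule in Python, assuming the exporter knows the
entry's URL and has already extracted the media references from the
source text.  The function name select_owned_media and the "include"
callback (standing in for the exporter's discretion) are hypothetical.

```python
from typing import Callable, Iterable, List
from urllib.parse import urljoin, urlparse


def select_owned_media(
    entry_url: str,
    media_refs: Iterable[str],
    include: Callable[[str], bool] = lambda url: True,
) -> List[str]:
    """Return absolute URLs of media on the same host as the entry.

    `include` is a hypothetical hook that lets the exporter skip items
    it does not want to package (too large, wrong type, and so on).
    """
    entry_host = urlparse(entry_url).hostname
    owned = []
    for ref in media_refs:
        # Resolve relative references against the entry's own URL
        absolute = urljoin(entry_url, ref)
        if urlparse(absolute).hostname == entry_host and include(absolute):
            owned.append(absolute)
    return owned
```

With this rule, a reference to "images.example.com" from an entry on
"example.com" is treated as not owned, which matches the same-host
definition discussed above.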

> I don't think it's worthwhile attempting to support arbitrary differences
> between paths of the exported data and its desired import location. For
> example, I wouldn't expect to be able to migrate from
> http://example.com/blog to http://example.org/my/cool/blog AND have all
> the relative and absolute links to media magically work.
> 

+1. It is the responsibility of the importer/exporter to make sure the
references work.  The export should result in an internally consistent
package that can be processed without having to understand the URL
structure of the original blog.
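
One way an exporter might produce such a package is to rewrite same-host
media references to package-relative paths and keep a manifest of where
each file came from.  The sketch below is only an assumption about how
that could look: the "media/" prefix, the function names, and the
simplification that references appear as double-quoted attribute values
are all invented for this example.

```python
import posixpath
from typing import Dict, Iterable, Tuple
from urllib.parse import urljoin, urlparse


def package_path(absolute_url: str) -> str:
    """Map an absolute media URL to a path inside the export package."""
    # "media/" is an arbitrary prefix chosen for this sketch
    return posixpath.join("media", urlparse(absolute_url).path.lstrip("/"))


def rewrite_references(
    entry_url: str, source_text: str, media_refs: Iterable[str]
) -> Tuple[str, Dict[str, str]]:
    """Point same-host media references at package-relative paths.

    Returns the rewritten text and a manifest mapping each package path
    to the original URL it was exported from.
    """
    entry_host = urlparse(entry_url).hostname
    manifest: Dict[str, str] = {}
    for ref in media_refs:
        absolute = urljoin(entry_url, ref)
        if urlparse(absolute).hostname != entry_host:
            continue  # separately hosted media is left untouched
        local = package_path(absolute)
        manifest[local] = absolute
        # Simplification: only rewrite references that appear as
        # double-quoted attribute values, e.g. src="..."
        source_text = source_text.replace(f'"{ref}"', f'"{local}"')
    return source_text, manifest
```

The importer then only needs the manifest and the package layout to place
the files and fix up the links for its own URL structure.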

- James
