media posts and interop testing

rob yates Tue, 07 Feb 2006 07:23:33 -0800

All,

I work within product development at Lotus / IBM. We are implementing the Atom Publishing Protocol atop something akin to a shared blog. As we moved to interop testing between our client and server, we ran into some problems with media posts. We describe these in gory detail below and would like to know whether anyone else is seeing similar problems. We have addressed them by adopting PaceCompoundEntries.

Within the shared blog we are implementing, there are several different types of entries, including comments, tasks, links and files. Each type of entry can be tagged, has an author, can be responded to, etc. Each blog (and each entry) has associated access controls. The major issues are outlined below and fall into two categories:

· Supporting embedded images in posts

· Media objects as first class objects

Supporting Embedded Images in Posts

The entries in our system may contain rich markup content, including embedded images. When such images are uploaded to our shared blog, they are subject to the same access controls as the entry within which they appear. Currently, the access control is very simple and, in blogging terms, works as follows: any user with access to the blog can access its associated media. Further, when the entry is deleted or updated, the image resources must also be deleted or updated. The ACL for the entry (and thereby for the embedded media within the entry) is set via extensions on the pub:control element.

To implement this in conformance with the current spec, our server and client implementations must correlate multiple discrete posts to multiple collections: maintain client and server state across multiple posts, match up access controls, manage lifecycles, and deal with failure situations. For example, if the media post succeeds but the subsequent entry post fails, we need to roll back the media post(s).
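The bookkeeping described above can be sketched roughly as follows. This is only an illustration, not our actual client: FakeServer is a hypothetical in-memory stand-in for an APP server, and the collection names "media" and "entries" and the function publish_with_media are assumptions made for the example.

```python
class FakeServer:
    """Hypothetical in-memory stand-in for an APP server (illustration only)."""

    def __init__(self, fail_entry_post=False):
        self.resources = {}
        self.fail_entry_post = fail_entry_post
        self._next_id = 0

    def post(self, collection, body):
        # Simulate the failure case: media posts succeed, the entry post fails.
        if collection == "entries" and self.fail_entry_post:
            raise IOError("entry post failed")
        self._next_id += 1
        uri = f"/{collection}/{self._next_id}"
        self.resources[uri] = body
        return uri

    def delete(self, uri):
        del self.resources[uri]


def publish_with_media(server, entry_template, images):
    """Post each image, then the entry; undo the media posts if anything fails."""
    posted = []  # client-side state that has to survive across requests
    try:
        for image in images:
            posted.append(server.post("media", image))
        # The entry can only be written once every media URI is known.
        entry = entry_template.format(*posted)
        return server.post("entries", entry)
    except IOError:
        # Manual rollback; against a real server each delete could also fail.
        for uri in reversed(posted):
            server.delete(uri)
        raise
```

Even in this toy form, the client has to remember every media URI and undo the earlier posts by hand, and the rollback is itself a series of requests that may not complete.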

What we want, instead, is to allow the client to make a single post containing the entry and all embedded media resources in a single compound package, such that the creation of the entry and its embedded resources can be performed in a single atomic operation.


Media Objects as First Class Objects

In addition to supporting media resources embedded within entry content, the server we are building requires that some media objects be treated as first class objects, i.e. they can be categorized, have authors, etc. One example we have is posting a Word document to a shared space. We feel that there are many other use cases in this category that the Atom Publishing Protocol should support, e.g. posting pictures to Flickr or MP3 files to a podcast. Note that these aren't resources that we're uploading and just happen to be referencing from a blog entry; the media resources themselves are the primary content of the posts.

There are several ways that this can be achieved with the current draft. The specification allows the initial posting of the media to either an entry collection or a media collection. Following this initial post, a second POST (or PUT) must be made that updates the metadata for the resource.

Unfortunately, these existing solutions suffer from the same fundamental multiple-operation correlation problem seen in the embedded-media scenario discussed previously.

Further, given that there are currently multiple approaches to posting media objects as first class objects, how does a client determine which one is supported by the server it is posting to? Will this lead to incompatible clients? Does the user making the post make the decision? For example, how should Google's Picasa implement this so that it can post to both Flickr and Snapfish? And if it implements multiple approaches, how does it negotiate with the server to pick one?

Having worked to implement support for first class media resources, we feel that there is presently a gap in the specification's treatment of media resources that makes reliable, interoperable implementation difficult for anything beyond the cat-blog scenario.

Solution

We feel that the problems we are seeing are not limited to just our implementation. Consider the steps necessary to publish a weblog post like http://www.tbray.org/ongoing/When/200x/2004/12/12/BMS using the current specification. Five discrete HTTP operations are required to update each of the necessary resources every time the post is updated. To delete the post and all associated resources, five discrete operations are required, and even then you'd have to maintain a certain amount of client state between operations to keep track of which resources have or have not been deleted. That is five different URIs for five different resources, all of which have to be tracked just to update and manage a single blog post.

We have implemented the use of MIME multipart messages to address these challenges. We have followed the approach outlined in PaceCompoundEntries and it is working well for us. The approach is simply to wrap an Atom entry and its associated media resources in a multipart/related MIME document that is POSTed to an entry collection. This solution addresses both of our primary use cases without requiring us to perform complicated and problematic correlation and state management across multiple POST and PUT operations. The client can make a single atomic post that includes both the media resources and the metadata about those resources (including access controls, etc.).
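To make the shape of such a package concrete, here is a minimal sketch using Python's standard email package. It is not our implementation and not the exact wire format from PaceCompoundEntries: the function name build_compound_entry, the Content-ID values, and the use of cid: references from the entry to the image parts are assumptions for illustration.

```python
from email.mime.application import MIMEApplication
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart


def build_compound_entry(entry_xml, images):
    """Wrap an Atom entry and its images in one multipart/related message.

    images is a list of (content_id, raw_bytes, image_subtype) tuples; the
    entry is assumed to refer to each image as cid:<content_id>.
    """
    # The "type" parameter names the media type of the root part (the entry).
    package = MIMEMultipart("related", type="application/atom+xml")

    # Root part: the Atom entry itself.
    package.attach(MIMEApplication(entry_xml.encode("utf-8"), "atom+xml"))

    # One further part per embedded media resource.
    for content_id, data, subtype in images:
        part = MIMEImage(data, _subtype=subtype)
        part.add_header("Content-ID", f"<{content_id}>")
        package.attach(part)
    return package


entry = (
    '<entry xmlns="http://www.w3.org/2005/Atom">'
    "<title>Example</title>"
    '<content type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml">'
    '<img src="cid:photo1"/></div></content>'
    "</entry>"
)
message = build_compound_entry(entry, [("photo1", b"\x89PNG...", "png")])
```

Serializing message (for example with message.as_bytes()) yields a single body that could be sent in one POST, so the server either accepts the entry and all of its media together or rejects them together.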

We are very interested in your thoughts.

Thanks,

Rob
