[Sword-TAP] Fwd: Key Changes and Justifications

Stuart Lewis Fri, 21 Jan 2011 10:57:46 -0800

---------- Forwarded message ----------
From: Tim Brody <t...@ecs.soton.ac.uk>
Date: 22 January 2011 03:54
Subject: Re: Key Changes and Justifications
To: Richard Jones <rich...@oneoverzero.com>
Cc: techadvisorypa...@swordapp.org



On Fri, 21 Jan 2011 12:57:19 +0000, Richard Jones <rich...@oneoverzero.com>
wrote:
> Hi Tim,
>
>>>> While this is all lovely...
>>>>
>>>> Why is it that Google docs API and CMIS both use THE SAME solution to
>>>> returning an ATOM entry which has a link rel to a feed which outlines
>>>> the resources which are part of this object?
>>>>
>>>
>>> Wouldn't this require an extra URI?
>>>
>>> In the original proposal we had a Deposit Receipt as an Atom Entry,
>>> and a Statement as a separate document (which you could content
>>> negotiate for, so rdf or an atom feed would have been fine), but when
>>> we discussed it you were against this approach. It was, in fact, you
>>> who convinced me that the Statement should become part of the Deposit
>>> Receipt rather than a document in its own right!
>>
>> The root feed in SWORD contains a list of atom entries that (I think) we
>> all agree should be the top level of the 'work'.
>
> Do you mean the service document?  Each entry in there is a Collection,
> in line with the Atom definition.

I mean the root collection that the SWORD client interacts with i.e. where
a dumb AtomPub client would start POSTing entries to.

>> The workflow state is
>> the state of the 'work' so lives at this level. It isn't overly
>> controversial to have this as inline or as a link-rel.
>
> During the original feedback to the white paper, it was felt that doing
> this inline was insufficient, as the state information could be
> extensive depending on your implementation decisions.  Would your
> atom:link go to another document for describing the state, as opposed to
> the Statement (which describes the object and the state)?

Yes, I think. Use *two* <link-rels>: one to the contents and one to the
state.

>> What's more
>> important is the mechanism to change that state - do you PUT to the
>> <atom:entry>, do a pseudo-move (see CMIS/GData) between collections or
>> use some new RPC (POST?args)?
>
> We are not planning to include any semantics to allow the depositor to
> change the state in this way.  SWORD is a deposit tool only, and the
> idea of relating the state back to the client is for informational
> purposes only.  I think it's a step too far to attempt to include
> workflow controls into SWORD a) this early (before CRUD is even settled
> in) or b) possibly even at all.

Well anything you define I would define "in the light" of the mechanisms
used in CMIS/GData: the ability to PUT to an ACL to change permissions or
to POST to a collection with a src to move entries between collections. So
you could control state by PUTting a modified Statement (as long as it
lives at a URI).

>> What Dave is talking about is how the media is represented (which
>> relates to 'packaging'). What we've discussed at Soton and decided,
>> before looking at CMIS & GData, is that the simplest representation of
>> the *contents of the work* is a link-reled <feed> that aggregates the 0
>> or more media resources.
>
> I agree with this approach almost entirely:  In the original proposal
> the contents of the work were to be retrievable via the Statement
> (located from a link-rel), for which we had proposed ORE as the format.
> Nonetheless, the business case also stated that this format would be
> content negotiable, so an application/atom+xml;type=feed content type
> would be acceptable if you wanted to implement one.  After extensive
> discussion with Dave, he convinced me that the Statement should be
> embedded in the Deposit Receipt, not available under a separate URI -
> hence my confusion at the latest feedback.
>
> [snip]

Don't combine contents and statement in one:

atom:entry # work
 |
 |- RDF:statement # workflow status
 |
 |- atom:feed # contents

That is the conceptual model. Linked is more flexible in terms of CRUD but
it may be useful to inline where that will help dumb AtomPub clients.
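
A minimal sketch of the linked form of that model, in Python, assuming two
made-up link relations (REL_CONTENTS and REL_STATE are placeholders, as is the
/eprint/1/state URI; nothing in this thread fixes those names). It omits the
atom:id, atom:title and atom:updated elements a real entry needs:

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical link relations: the thread does not settle on names for these.
REL_CONTENTS = "http://example.org/rel/contents"
REL_STATE = "http://example.org/rel/state"


def build_work_entry(contents_uri, state_uri):
    """Build an atom:entry for a work with one link per concern."""
    ET.register_namespace("", ATOM_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    # Link to the feed that aggregates the 0 or more media resources.
    ET.SubElement(entry, f"{{{ATOM_NS}}}link", {
        "rel": REL_CONTENTS,
        "href": contents_uri,
        "type": "application/atom+xml;type=feed",
    })
    # Separate link to the workflow state document.
    ET.SubElement(entry, f"{{{ATOM_NS}}}link", {
        "rel": REL_STATE,
        "href": state_uri,
        "type": "application/rdf+xml",
    })
    return entry


def link_targets(entry):
    """Map each link rel on the entry to its href."""
    return {link.get("rel"): link.get("href")
            for link in entry.findall(f"{{{ATOM_NS}}}link")}


entry = build_work_entry("/eprint/1/contents", "/eprint/1/state")
print(ET.tostring(entry, encoding="unicode"))
```

Because contents and state each live at their own URI, a client can fetch or
update one without touching the other.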

>> As Scott has previously suggested, creating a
>> complex object involves multiple POSTs to the link-reled feed. CMIS &
>> GData use this mechanism to support folders.
>
> So the CMIS and GData approaches allow you to create a collection on the
> server by POST?  I had not proposed this approach because it is not part
> of the AtomPub spec.  Wouldn't it also make quite a big
> back-compatibility issue, to change the deposit process in this way?
> (not that there aren't such issues already, but at the moment updating
> SWORD 1 code to SWORD 2 for POST only should be relatively minor - this
> change would require more engineering).

They support collection creation by POSTing an <atom:entry> with special
tags. IIRC CMIS uses a <cmis:> tag while GData uses <atom:category>.

As the <category> that triggers this behaviour is an extension it doesn't
preclude any other valid AtomPub behaviour (including, if you wish,
unpacking a SWORD Package in response to a X-Package HTTP header). You
always return an <atom:entry> but you have to understand the linked <feed>
to retrieve the thing you just uploaded i.e. the <atom:entry> has no
edit-media link.

To build a complex object using just AtomPub XML:

POST /workarea/contents
<entry><category>swap:work</></>

<entry><link rel="#swap:work" href="/eprint/1/contents"/></>

POST /eprint/1/contents?metadata=1 # extract metadata from this file
%PDF-1.4

<entry><link rel="edit-media" href="/doc/1/foo.pdf"/></>

... repeat for each file

PUT /eprint/1
Content-type: application/x-sword; namespace=METS # whatever
<METS:X></>
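
The same sequence can be sketched as a request plan in Python. This is only an
illustration under assumptions: PlannedRequest and plan_deposit are
hypothetical names, the /eprint/1 path is hard-coded where a live client would
read it from the link in the server's response, the category is written with
Atom's term attribute, and the METS body is a placeholder. Nothing is sent
over the network:

```python
from dataclasses import dataclass, field

ATOM_ENTRY_TYPE = "application/atom+xml;type=entry"


@dataclass
class PlannedRequest:
    method: str
    path: str
    headers: dict = field(default_factory=dict)
    body: bytes = b""


def plan_deposit(files, metadata_body, work_path="/eprint/1"):
    """Return the ordered requests for building one complex object.

    files is a list of (content_type, data, extract_metadata) tuples.
    """
    work_entry = (
        b'<entry xmlns="http://www.w3.org/2005/Atom">'
        b'<category term="swap:work"/></entry>'
    )
    # Step 1: create the work by POSTing a categorised entry.
    plan = [PlannedRequest("POST", "/workarea/contents",
                           {"Content-Type": ATOM_ENTRY_TYPE}, work_entry)]
    # Step 2: POST each file to the work's contents feed.
    contents_path = work_path + "/contents"
    for content_type, data, extract_metadata in files:
        path = contents_path + ("?metadata=1" if extract_metadata else "")
        plan.append(PlannedRequest("POST", path,
                                   {"Content-Type": content_type}, data))
    # Step 3: PUT the metadata record to the work itself.
    plan.append(PlannedRequest(
        "PUT", work_path,
        {"Content-Type": "application/x-sword; namespace=METS"},
        metadata_body))
    return plan


plan = plan_deposit(
    [("application/pdf", b"%PDF-1.4", True), ("image/png", b"\x89PNG", False)],
    b"<METS:X/>",  # placeholder body, as in the listing above
)
for request in plan:
    print(request.method, request.path)
```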

>> My previous attempt to explain this approach fell on deaf ears, so let
>> me try to headline this:
>> 1) Get rid of all mentions of "packaging"
>> 2) Get rid of OAI-ORE
>> 3) Use <atom:entry> with an <atom:category> of 'sword:work' (or
>> similar), with a link-rel to an <atom:feed>
>
> I promise that it didn't fall on deaf ears, but it did fall on the ears
> of someone who hasn't had the chance to properly reply (until now).
>
> I don't believe we can get rid of mentions of packaging, no matter how
> much we would like to.  Packaging is used extensively across all
> repository types and if SWORD does not help with this it will not be
> successful.

Packaging is used by each repository type in a mutually incompatible way,
so the current usage is pretty unhelpful. As I say there's no reason to
forbid packages: the <acceptPackage> becomes optional in SWORD v2.
Repositories that don't understand it will store a .zip file.

>> Anyone who wants to use 'packages' to bundle metadata and files into a
>> single .zip should document it and mint a MIME-type. For instance,
>> BibTeX is plain text but has a commonly accepted MIME-type. If your
>> repository supports "BibTeX" then it can do
>> <accept>application/x-bibtex</accept>.
>> OAI-ORE can be supported through a <link-rel> of the work's
>> <atom:entry>.
>
> This is not practical, unfortunately.  We have to give people something
> that they can work with now, without having to care about registering
> mime-types.  If I want to use SWORD within my organisation to move data
> around, do I really have to register a mime type just to announce my
> (possibly custom) packaging formats to my other servers?  Way too high
> an entry barrier.

I don't think bundling things into a .zip saves you work because it just
moves the complexity into a different domain. You will get 90% of the way
to moving objects from one system to another with low loss by using Atom +
Dublin Core + a feed of media resources. If you require more complexity but
in a limited scope use a private MIME type.
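
As a rough sketch of how a client might check for such a type, the Python
below reads the <accept> elements of an AtomPub collection and tests a MIME
type against them. The collection XML and the private type
application/x-example-package+zip are invented for illustration, and the
sample omits the atom:title a real collection carries:

```python
import xml.etree.ElementTree as ET

APP_NS = "http://www.w3.org/2007/app"

# Invented example collection; the private type is a placeholder.
COLLECTION_XML = """
<collection xmlns="http://www.w3.org/2007/app" href="/workarea/contents">
  <accept>application/atom+xml;type=entry</accept>
  <accept>application/x-bibtex</accept>
  <accept>application/x-example-package+zip</accept>
</collection>
"""


def accepted_types(collection_xml):
    """List the bare media ranges a collection advertises."""
    root = ET.fromstring(collection_xml)
    types = []
    for accept in root.findall(f"{{{APP_NS}}}accept"):
        # Each accept element may hold a comma-separated list of ranges.
        for part in (accept.text or "").split(","):
            media = part.split(";")[0].strip().lower()  # drop parameters
            if media:
                types.append(media)
    return types


def accepts(collection_xml, mime_type):
    """Return True if mime_type matches one of the advertised ranges."""
    wanted = mime_type.split(";")[0].strip().lower()
    for media in accepted_types(collection_xml):
        if media in ("*/*", wanted):
            return True
        if media.endswith("/*") and wanted.startswith(media[:-1]):
            return True
    return False


print(accepts(COLLECTION_XML, "application/x-bibtex"))
print(accepts(COLLECTION_XML, "application/zip"))
```

Matching here ignores media type parameters, which is a simplification.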

>> (Ideally I would like SWORD to be compatible with Google Docs, so we can
>> leverage any tools built for that API with SWORD and vice-versa - now
>> how cool would that be!?)
>
> That would be very cool.  What applications of the tech did you have in
> mind?  Could be interesting to register them in the SWORD use cases?

Well I was particularly thinking of this:
http://www.gladinet.com/
(Which I've only looked at the advertising blurb for!)

I want to make it easy for people like that to talk to a repository. If we
can make it as easy as substituting the Google base URL for ours then all
the better.

Conversely, wouldn't it be great if Google Docs could be just another
repository for tools like Repository Junction (without too much coding
overhead)?

--
All the best,
Tim.
