[Sword-TAP] the scope of sword

2011-03-21 Thread Richard Jones
Hi Folks,

There's been some great discussion on the list this past week or two, 
and I thought it might be time for a summary of what looks to me to be a 
key sticking point: the scope of sword.

There are two distinct sides to this argument as it's been articulated 
on this list:

a) That we should adopt the approach of a content management API like CMIS 
or, more likely, GData

b) That SWORD should not say anything about what happens to the 
content once it is sent to the server.

In general, I am against (a) for a number of reasons.  First, I am 
concerned that the idioms that are associated with GData are not 
/necessarily/ appropriate.  The hierarchical file system is a common 
idiom but an idiom nonetheless, and it wouldn't be SWORD's place to 
therefore build itself over the top of it.  CMIS I have a harder time 
refuting or accepting, so am open to persuasion either way.  Secondly, I 
don't see a reason to re-create a content management standard, since 
they already exist.  SWORD should, instead, provide support for the 
things that these standards don't provide for our sector/use cases, 
while not preventing the use of them.

From a purist's perspective of (b), the main thing that SWORD offers, 
then, is support for Packaging (with a capital P).  This is a valuable 
addition to the community since it is both common in our sector and 
expressly not covered at least by GData and I believe not by CMIS 
(though again, open to correction).  The support for packaging, though, 
needs to extend to a full CRUD implementation of AtomPub, which is a 
large part of what the profile attempts to do.  I think we have had some 
good technical discussion which will allow the next draft of the 
profile to do better at that.

In the meantime, there are some grey-area parts of the profile, 
particularly In Progress and Suppress Metadata, which are more content 
management than they are deposit.  I, personally, think these are 
important; they are light touch, the profile doesn't mandate the server 
to obey them, and they help fulfill known use cases.  Likewise the 
Statement could be viewed as more content management than not, although 
we have tried to pitch that as more an informational resource rather 
than an operational one (i.e. read but not write).

What I'm going to suggest for the next draft is as follows:  we'll put 
some more time into analysing the appropriate ways of updating and 
overwriting deposit packages using the feedback on this list.  And we 
will extend the profile to cover how the SWORD headers could be used 
in content management operations /if that's what your 
implementation wants/ (e.g. how you might use Suppress Metadata or In 
Progress with GData).  There will, obviously, be plenty of time for comment.

In conclusion: we must constrain the scope of sword to something which 
doesn't tread on anyone's toes and is of value to the community.  Too 
far one way or the other and we'll either be superseded or of no value.

Cheers,

Richard




--
Colocation vs. Managed Hosting
A question and answer guide to determining the best fit
for your organization - today and in the future.
http://p.sf.net/sfu/internap-sfd2d
___
Sword-app-techadvisorypanel mailing list
Sword-app-techadvisorypanel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sword-app-techadvisorypanel


Re: [Sword-TAP] My Thoughts

2011-03-21 Thread Richard Jones
Hi Scott,

>> 6.6. Adding Content to a Resource

 I'll get onto this and point, just as a pointer however look at the 
 'Creating a folder' section in the GDocs API.
>>>
>>> This whole area of SWORD (secs 6.4-6.6) is effectively doing the same job 
>>> as CMIS. Is there a good reason for creating a new spec here?
>>>
>>> If these kinds of operations (remotely manipulating the content of virtual 
>>> packages) are part of the core functions of SWORD, should it be a profile 
>>> of CMIS rather than AtomPub?
>>
>> I've tried /quite/ hard to get to grips with CMIS without a huge amount of 
>> success.  It seems extremely large and full of a lot of stuff which is way 
>> way way over the top for what SWORD is trying to achieve, as far as I can 
>> tell.
>
> It is indeed rather more complex than SWORD as it is handling content 
> management rather than just deposit -->
>
>>
>> Also, we tried in the business case/technical architecture document to 
>> position SWORD firmly in the "deposit" space rather than the "content 
>> management space" (which I may or may not have succeeded in doing). 
>> Certainly I can see the case that enabling CRUD means that you are doing 
>> some content management, so we're striving to keep the details of what we 
>> say based entirely on the aspects of transferring data point to point rather 
>> than mandating what happens to that data at either end (this is, for 
>> example, part of the reticence to formally incorporate GData).
>
>
> -->   and so we really have to make sure the scope is correct. It has veered 
> into CM with all these operations on the content of packages.

So, I think it's inevitable that there will be some operations which 
could be considered content management by the very nature of allowing 
CRUD.  But CMIS applies a domain model to the server end, which SWORD 
doesn't do - I think this is a key distinction.

I'd be interested in your comments on the other email that I sent just 
now about the scope of sword.

>> So, profiling CMIS seems like a massive undertaking and a large burden on 
>> the implementers just to make sense of what they would or wouldn't have to 
>> implement.  Could you comment on that, do you think?
>
> Actually I don't think profiling CMIS would be at all necessary. If SWORD is 
> about deposit, and CMIS is about CM, then implementers can choose the specs 
> they want to support based on the functionality they want to expose.
>
> So some implementations may just want to support deposit with some packaging 
> format hints, while others may want to look at implementing CMIS (e.g. using 
> CMIS libraries such as Apache Chemistry) - either instead of or as well as 
> SWORD.
>
> At the moment there seems to be an assumption that a solution halfway between 
> SWORD 1.x and CMIS is desirable.

I think that even half way to CMIS is a pretty long way  :)

>> What does seem possible is that we could adopt terms from the CMIS namespace 
>> to use in SWORD, rather than minting new terms.  I'm thinking, for example, 
>> of cmis:createdBy being used instead of sword:depositedBy, although there's 
>> some argument to be had over whether those two are really the same thing.  
>> Could you possibly suggest some similarities in terms in CMIS that might be 
>> appropriate for reuse in SWORD?
>
> See above ...
>
>>
>> Also, do you know of any good introductions to CMIS that are a bit more 
>> penetrable than the specification that I could look at?
>
> Playing with the Apache Chemistry libs is probably more rewarding than 
> reading the spec as you can see what it's doing.
>
> http://chemistry.apache.org/

Thanks, that sent me off on a pretty useful direction.  For anyone else 
interested in CMIS, I found this basic introduction pretty instructive:

http://www.oldschooltechie.com/blog/2009/11/23/introduction-cmis

Cheers,

Richard





Re: [Sword-TAP] My Thoughts

2011-03-21 Thread Richard Jones
Hi Dave,

>>> 6.4. Editing the Content of a Resource
>>>
>>> The client MAY provide an In-Progress header with a value of true or false 
>>> [SWORD001]
>>
>>> 6.5. Deleting the Content of a Resource
>>>
>>> Should return the 200 header representing what has happened.
>>>
>>> I'm against the delete operation on the content URI returning the receipt 
>>> of the edit-URI.
>>> No one asked for this, the client asked for the thing to be deleted, not a 
>>> statement or
>>> receipt! I've made this point before... it got ignored.
>
> Both of these are in fact about which URI you are interacting with not what 
> is returned.
> If you are CRUD'ing to one URI the server should never return data which is 
> not related to
> that URI without some sort of 300 redirect.

I see what you're saying.  It looks like this is also legitimate if a 
Content-Location header is provided, if I am interpreting the HTTP spec 
correctly (which is maybe not the case).
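
As a rough illustration of the Content-Location reading above (a sketch 
only, not something the profile specifies): a client could treat the 
response body as describing the request URI unless a Content-Location 
header points elsewhere.  The function name and the example.org URIs are 
made up for illustration.

```python
from urllib.parse import urljoin


def payload_subject(request_uri, response_headers):
    """Return the URI that the response body claims to represent.

    Assumption: with no Content-Location header the body is about the
    request URI itself; otherwise it is about the (possibly relative)
    URI given in that header.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in response_headers.items()}
    location = lowered.get("content-location")
    if location is None:
        return request_uri
    return urljoin(request_uri, location)


# Hypothetical DELETE on a media resource whose response body is the
# receipt held at the entry's URI.
media_uri = "http://example.org/col/1/media"
print(payload_subject(media_uri, {"Content-Location": "/col/1/entry"}))
```

Whether this is how a DELETE response /should/ behave is exactly the 
open question; the sketch only shows how a client could tell the two 
cases apart.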

> So the response of the DELETE should always be relevant to the URI you have 
> just deleted.

So, I guess I'd want to clarify what "relevant" means here.  I can't 
find anything definitive in the HTTP spec which tells you how what you 
get back from the DELETE request should be related to the resource 
deleted.  Any references you have that I could look at?

> The In-Progress header should refer only to the URI which it is related to, 
> since In-Progress
> (or status) is in fact a piece of metadata, I'd recommend moving it to the 
> atom metadata and
> turning it into a category. Thus you move the item through a workflow by 
> changing the metadata.
>
> Neither of these are EPrints specific, just general good practice of a web 
> server :)

I agree that the In Progress header should only be supplied on the 
container.  In the next version of the profile I'll make that change.

In the meantime, though, once again: this exists as a header because 
when you are POSTing a ZIP file without an Atom Entry before it, there 
is no Atom Entry for a sword:inprogress element to be placed in.  Also, 
it's not the business of SWORD to be moving things through the workflow 
- that would be either a content management operation or some other 
class of administrative interface, which would be out of legitimate 
scope.  In Progress is intended purely to hint to the server that it 
might expect more content to be coming before the client considers the 
full deposit process finished.
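
To make the ZIP case concrete, here is a minimal sketch of how a client 
might assemble such a deposit.  It only builds the request, it doesn't 
send it.  The collection IRI, file names and the DepositRequest type are 
hypothetical, and the exact header set is an assumption based on the 
draft profile rather than a statement of it.

```python
import io
import zipfile
from typing import NamedTuple


class DepositRequest(NamedTuple):
    """Hypothetical container for the parts of an HTTP request."""
    method: str
    iri: str
    headers: dict
    body: bytes


def make_zip(files):
    """Build an in-memory ZIP from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def build_deposit(collection_iri, package, filename, in_progress):
    """Describe a POST of a bare ZIP; there is no Atom Entry to carry the
    in-progress hint, so it travels as a header instead."""
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"attachment; filename={filename}",
        # A hint only: the server is not obliged to act on it.
        "In-Progress": "true" if in_progress else "false",
    }
    return DepositRequest("POST", collection_iri, headers, package)


first = build_deposit(
    "http://example.org/sword/collection",  # assumed collection IRI
    make_zip({"chapter1.pdf": b"placeholder"}),
    "part1.zip",
    in_progress=True,  # more content is expected to follow
)
print(first.method, first.headers["In-Progress"])
```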

>>> 6.6. Adding Content to a Resource
>
> I'll get onto this and point, just as a pointer however look at the 'Creating 
> a folder' section in the GDocs API.

I've been through the GDocs API, and believe that we've included enough 
in the SWORD spec to ensure that its use is not prevented (to the point 
of being explicitly mentioned).  Also, I looked in the GData 3.0 spec 
for information on how to retrieve the feed of an object, rather than an 
entry, and it didn't say.  I presumed content negotiation (hence section 
6.8 of the sword profile), but perhaps you know where there's a formal 
definition of how this bit works - the GData documentation did seem a 
little bit all over the place?

I'll include some more details in the next version as to how to use 
SWORD extensions on any old URI, and that should hopefully be enough to 
combine it fully with GData.

See my other email about the scope of sword, and let me know what you think.

Cheers,

Richard





Re: [Sword-TAP] An example Implementation (VIDEO)

2011-03-21 Thread Richard Jones
Hi Ian,

On 18/03/11 09:11, Ian Stuart wrote:
> On 17/03/11 19:37, Richard Jones wrote:
>>> In this second, expanded, view there are three things one needs to define
>>> within the discussion
>>>
>>> 1) What the singular Thing is: a (zip|xml|csv|xyz) file
>>> 2) What "standard" the manifest file is written to (METS, EPrintsXML,
>>> etc)
>>> 3) What "standard" the metadata file is written to (DC, QDC, MODS,
>>> EPrintsXML, BibTeX, etc...)
>>>
>>> Now, one could define a new identifier for each combination of manifest
>>> & metadata, or one can describe them separately
>>> One has great flexibility & expandability, the other uses fewer header
>>> fields.
>>
>> I think at the moment we're going for a URI which describes the
>> combination. This is partly because /at the moment/ repositories tend to
>> support a finite set of Packages which consist of their favourite
>> metadata formats and their favourite structural manifests. So not all
>> possible permutations of structural and descriptive data are currently
>> likely. This may, of course, change.
>
> This is not a problem... we just need to be up-front about it.
>
> This means, for example, that "METSDSpaceSIP" actually means "Zip file,
> with binary files included (subdirectories allowed). Manifest called
> 'mets.xml', which contains the meta-data in epdcx format."
>
> I'm happy with that. but it does mean that this will spawn a LARGE
> number of distinct variations, each of which needs a unique IRI.
> (Again, not a problem: the package I develop for the OA-RJ broker has an
> IRI in my opendepot.org name-space)

Yup, that's exactly what it means.  Otherwise we're into media feature 
sets, which we sort of discussed earlier on the list - seems way too 
complicated for the moment.  I'm happy that people will mint IRIs for 
the formats that they use regularly.
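
For illustration, a server-side lookup along these lines might be as 
simple as the sketch below: one IRI per combination, with no attempt to 
negotiate container, manifest and metadata separately.  The IRIs and the 
second format are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PackageFormat:
    """What a single packaging IRI stands for."""
    container: str   # the singular Thing, e.g. a zip file
    manifest: str    # the structural manifest standard
    metadata: str    # the descriptive metadata standard


# Hypothetical IRIs: each names one fixed combination.
SUPPORTED = {
    "http://example.org/packages/METSDSpaceSIP": PackageFormat(
        container="zip", manifest="METS (mets.xml)", metadata="epdcx"
    ),
    "http://example.org/packages/EPrintsXMLZip": PackageFormat(
        container="zip", manifest="EPrintsXML", metadata="EPrintsXML"
    ),
}


def describe(packaging_iri: str) -> Optional[PackageFormat]:
    """Return the combination behind an IRI, or None if unsupported."""
    return SUPPORTED.get(packaging_iri)


print(describe("http://example.org/packages/METSDSpaceSIP"))
```

The trade-off is the one Ian describes: every new combination needs its 
own entry (and its own IRI), but the client only ever sends one value.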

Cheers,

Richard






Re: [Sword-TAP] My Thoughts

2011-03-21 Thread Richard Jones
Hi Tim,

I believe that the SWORD spec already meets these requirements of yours:

> What I've been pushing for with SWORD is:
> 1) Re-use sensible paradigms from existing AtomPub profiles (CMIS/GData)

Well, it explicitly doesn't get in the way of you using them, I don't 
believe.  Although if you could verify that, that would be really 
helpful, as I'm not an expert on either CMIS or GData.

> 2) Behave like an AtomPub implementation (i.e. don't MUST something that
> isn't MUST in AtomPub)

This was an original aim of SWORD 2.0.  See, for example:

http://sword2depositlifecycle.jiscpress.org/identifiers/

I believe that the profile as published in first draft meets this 
requirement, but again it would be good to have that verified in case I 
missed anything.

> 3) Modularise the spec. and features to enable flexibility

Also an original aim of SWORD 2.0, which is what led us towards the 
structure presented on the website:

http://swordapp.org/sword-v2/sword-v2-specifications/

There are 4 Internet Draft style documents which break the various parts 
of sword out into re-usable specifications, and a profile which draws 
them together into SWORD 2.0.

Is this sufficiently modular, or did you have further modularisation in 
mind?  I was thinking about modularising the profile itself, but it 
seemed correct to keep the whole CRUD stuff together.  I imagined, for 
example, breaking authNZ out into a separate profile at some 
undetermined time in the future.

> I just echo what Dave Tarrant has said about talking in real-time with
> this. I can see Richard has injected ideas from various sources into the
> spec. but these ideas need to be thrashed out between multiple brains. I
> was initially thinking we want to add behavioural controls through headers
> whereas I'm more convinced now that these should be IRI parameters, which
> will make it more obvious that this IRI is going to behave differently to
> the base IRI.

Can you give us a few examples of what you imagined?

Cheers,

Richard





Re: [Sword-TAP] My Thoughts

2011-03-21 Thread Richard Jones
Hi Scott,

>> I think someone
>> earlier pointed out our methodology is RFC-orientated i.e. minimal,
>> well-defined and, where relevant, re-using existing Internet RFCs. CMIS, by
>> comparison, defines a full query language, ACLs, CMS-orientated APIs ...
>
> Exactly, which is why this new "package content editing" aspect of the spec 
> concerns me.

Do you think you could expand on that a bit?

In my mind, we're not doing any package content editing, we're just 
either adding new/more packages to the server (possibly to the same 
container) or overwriting old ones, not giving the client a way to edit 
the contents of the packages.  The Statement, which might give this 
impression, is supposed to be informative rather than operational, 
except in cases where it is implemented as part of, say, GData.

If some part of the profile is suggesting that packages can be edited by 
SWORD that should definitely be clarified.

It certainly feels to me that the scope of SWORD is becoming clearer 
through this discussion - keep it up!

Cheers,

Richard


> I think to do this "right" you probably do need something like CMIS. I'm not 
> convinced it's actually needed in SWORD at all, and goes against the 
> principles of simplicity that made it successful.
>
> Remember why we are even talking about packages - not because people want to 
> edit them, but because we have interop issues due to too many conflicting 
> sector-specific package+metadata content types.
>
>> What I've been pushing for with SWORD is:
>> 1) Re-use sensible paradigms from existing AtomPub profiles (CMIS/GData)
>> 2) Behave like an AtomPub implementation (i.e. don't MUST something that
>> isn't MUST in AtomPub)
>> 3) Modularise the spec. and features to enable flexibility
>
> +1
>
>>
>> I just echo what Dave Tarrant has said about talking in real-time with
>> this. I can see Richard has injected ideas from various sources into the
>> spec. but these ideas need to be thrashed out between multiple brains. I
>> was initially thinking we want to add behavioural controls through headers
>> whereas I'm more convinced now that these should be IRI parameters, which
>> will make it more obvious that this IRI is going to behave differently to
>> the base IRI.
>>
>> --
>> All the best,
>> Tim.





[Sword-TAP] planned changes to next spec draft

2011-03-21 Thread Richard Jones
Hi Folks,

Just a quick summary of changes to be made to the profile in the next 
version.  Did I miss anything?

1/ 6.6.2 and 6.6.3 are incorrect in their description of the usage of 
POST on the URIs.  This needs to be clarified.

2/ Add a section describing how to use SWORD headers/techniques on any 
old URI (such as those provided by the Statement), so that it is clear 
how to integrate SWORD with GData (and CMIS).  This section will be 
speculative, and will be a first attempt to articulate such an approach.

3/ Drop the 202 (Accepted) response code

4/ URI -> IRI

5/ Clearly articulate the need/use cases for Suppress-Metadata

6/ Clarify and correct the usage of the In-Progress header: it should 
only be applied to requests on the Edit-URI.  Evaluate the need for any 
further In-Progress information returned in response to operations on 
other URIs or in the Deposit Receipt and propose any appropriate changes.

7/ Enumerate and demonstrate more atom:link@rel values for the 
extensions SWORD makes to AtomPub, rather than relying on clients to 
know when they are dealing with a pure AtomPub server and when a SWORD 
server.
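
As a sketch of what item 7 is aiming at: a client could inspect the 
atom:link@rel values on an entry to decide whether it is talking to a 
SWORD server.  The rel IRI below is a placeholder (the actual values are 
yet to be enumerated), as are the example.org hrefs.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Placeholder rel value; the real SWORD rel IRIs are not defined here.
SWORD_RELS = {"http://example.org/sword/terms/statement"}

ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <link rel="edit" href="http://example.org/col/1/entry"/>
  <link rel="edit-media" href="http://example.org/col/1/media"/>
  <link rel="http://example.org/sword/terms/statement"
        href="http://example.org/col/1/statement"/>
</entry>"""


def links_by_rel(entry_xml):
    """Group the hrefs of an Atom entry's links by their rel value."""
    root = ET.fromstring(entry_xml)
    found = {}
    for link in root.iter(ATOM + "link"):
        # Atom treats a missing rel attribute as "alternate".
        rel = link.get("rel", "alternate")
        found.setdefault(rel, []).append(link.get("href"))
    return found


def looks_like_sword(entry_xml):
    """True if the entry advertises any of the (assumed) SWORD rels."""
    return any(rel in SWORD_RELS for rel in links_by_rel(entry_xml))


print(sorted(links_by_rel(ENTRY)))
print(looks_like_sword(ENTRY))
```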

Cheers,

Richard


