[Sword-TAP] the scope of sword
Hi Folks,

There's been some great discussion on the list this past week or two, and I thought it might be time for a summary of what looks to me to be a key sticking point: the scope of sword.

There are two distinct sides to this argument as it's been articulated on this list:

a) That we should adopt the approach of a content management API like CMIS or, more likely, GData

b) That SWORD should not say anything about what happens to the content once it is sent to the server.

In general, I am against (a) for a number of reasons. First, I am concerned that the idioms that are associated with GData are not /necessarily/ appropriate. The hierarchical file system is a common idiom but an idiom nonetheless, and it wouldn't be SWORD's place to build itself over the top of it. CMIS I have a harder time refuting or accepting, so am open to persuasion either way. Secondly, I don't see a reason to re-create a content management standard, since they already exist. SWORD should, instead, provide support for the things that these standards don't provide for our sector/use cases, while not preventing the use of them.

From a purist's perspective on (b), the main thing that SWORD offers, then, is support for Packaging (with a capital P). This is a valuable addition to the community since it is both common in our sector and expressly not covered by GData, and I believe not by CMIS (though again, open to correction). The support for packaging, though, needs to extend to a full CRUD implementation of AtomPub, which is a large part of what the profile attempts to do. I think we have had some good technical discussion which will allow the next draft of the profile to do better at that.

In the meantime, there are some grey areas in the profile, particularly In Progress and Suppress Metadata, which are more content management than they are deposit. I, personally, think these are important; they are light touch, the profile doesn't mandate the server to obey them, and they help fulfill known use cases. Likewise the Statement could be viewed as more content management than not, although we have tried to pitch that as an informational resource rather than an operational one (i.e. read but not write).

What I'm going to suggest for the next draft is as follows: we'll put some more time into analysing the appropriate ways of updating and overwriting deposit packages using the feedback on this list. And we will extend the profile to cover how the SWORD headers can be used in content management operations /if that's what your implementation wants/ (e.g. how you might use Suppress Metadata or In Progress with GData). There will, obviously, be plenty of time for comment.

In conclusion: we must constrain the scope of sword to something which doesn't tread on anyone's toes and is of value to the community. Too far one way or the other and we'll either be superseded or of no value.

Cheers,

Richard

--
Colocation vs. Managed Hosting
A question and answer guide to determining the best fit
for your organization - today and in the future.
http://p.sf.net/sfu/internap-sfd2d
___
Sword-app-techadvisorypanel mailing list
Sword-app-techadvisorypanel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sword-app-techadvisorypanel
Re: [Sword-TAP] My Thoughts
Hi Scott,

>> 6.6. Adding Content to a Resource
>> I'll get onto this and point, just as a pointer however look at the 'Creating a folder' section in the GDocs API.
>>>
>>> This whole area of SWORD (secs 6.4-6.6) is effectively doing the same job as CMIS. Is there a good reason for creating a new spec here?
>>>
>>> If these kinds of operations (remotely manipulating the content of virtual packages) are part of the core functions of SWORD, should it be a profile of CMIS rather than AtomPub?
>>
>> I've tried /quite/ hard to get to grips with CMIS without a huge amount of success. It seems extremely large and full of a lot of stuff which is way way way over the top for what SWORD is trying to achieve, as far as I can tell.
>
> It is indeed rather more complex than SWORD as it is handling content management rather than just deposit -->
>
>>
>> Also, we tried in the business case/technical architecture document to position SWORD firmly in the "deposit" space rather than the "content management space" (which I may or may not have succeeded in doing). Certainly I can see the case that enabling CRUD means that you are doing some content management, so we're striving to keep the details of what we say based entirely on the aspects of transferring data point to point rather than mandating what happens to that data at either end (this is, for example, part of the reticence to formally incorporate GData).
>
> --> and so we really have to make sure the scope is correct. It has veered into CM with all these operations on the content of packages.

So, I think it's inevitable that there will be some operations which could be considered content management by the very nature of allowing CRUD. But CMIS applies a domain model to the server end, which SWORD doesn't do - I think this is a key distinction. I'd be interested in your comments on the other email that I sent just now about the scope of sword.

>> So, profiling CMIS seems like a massive undertaking and a large burden on the implementers just to make sense of what they would or wouldn't have to implement. Could you comment on that, do you think?
>
> Actually I don't think profiling CMIS would be at all necessary. If SWORD is about deposit, and CMIS is about CM, then implementers can choose the specs they want to support based on the functionality they want to expose.
>
> So some implementations may just want to support deposit with some packaging format hints, while others may want to look at implementing CMIS (e.g. using CMIS libraries such as Apache Chemistry) - either instead of or as well as SWORD.
>
> At the moment there seems to be an assumption that a solution halfway between SWORD 1.x and CMIS is desirable.

I think that even half way to CMIS is a pretty long way :)

>> What does seem possible is that we could adopt terms from the CMIS namespace to use in SWORD, rather than minting new terms. I'm thinking, for example, of cmis:createdBy being used instead of sword:depositedBy, although there's some argument to be had over whether those two are really the same thing. Could you possibly suggest some similarities in terms in CMIS that might be appropriate for reuse in SWORD?
>
> See above ...
>
>>
>> Also, do you know of any good introductions to CMIS that are a bit more penetrable than the specification that I could look at?
>
> Playing with the Apache Chemistry libs is probably more rewarding than reading the spec as you can see what it's doing.
>
> http://chemistry.apache.org/

Thanks, that sent me off on a pretty useful direction. For anyone else interested in CMIS, I found this basic introduction pretty instructive:

http://www.oldschooltechie.com/blog/2009/11/23/introduction-cmis

Cheers,

Richard
Re: [Sword-TAP] My Thoughts
Hi Dave,

>>> 6.4. Editing the Content of a Resource
>>>
>>> The client MAY provide an In-Progress header with a value of true or false [SWORD001]
>>
>>> 6.5. Deleting the Content of a Resource
>>>
>>> Should return the 200 header representing what has happened.
>>>
>>> I'm against the delete operation on the content URI returning the receipt of the edit-URI. No one asked for this, the client asked for the thing to be deleted, not a statement or receipt! I've made this point before... it got ignored.
>
> Both of these are in fact about which URI you are interacting with, not what is returned. If you are CRUD'ing to one URI the server should never return data which is not related to that URI without some sort of 300 redirect.

I see what you're saying. It looks like this is also legitimate if a Content-Location header is provided, if I am interpreting the HTTP spec correctly (which is maybe not the case).

> So the response of the DELETE should always be relevant to the URI you have just deleted.

So, I guess I'd want to clarify what "relevant" means here. I can't find anything definitive in the HTTP spec which tells you how the response to a DELETE request should relate to the resource deleted. Any references you have that I could look at?

> The In-Progress header should refer only to the URI which it is related to; since In-Progress (or status) is in fact a piece of metadata, I'd recommend moving it to the atom metadata and turning it into a category. Thus you move the item through a workflow by changing the metadata.
>
> Neither of these are EPrints specific, just general good practice of a web server :)

I agree that the In Progress header should only be supplied on the container. In the next version of the profile I'll make that change. In the meantime, though, once again: this exists as a header because when you are POSTing a ZIP file without an Atom Entry before it, there is no Atom Entry for a sword:inprogress element to be placed in. Also, it's not the business of SWORD to be moving things through the workflow - that would be either a content management operation or some other class of administrative interface, which would be out of legitimate scope. In Progress is intended purely to hint to the server that it might expect more content to be coming before the client considers the full deposit process finished.

>>> 6.6. Adding Content to a Resource
>
> I'll get onto this and point, just as a pointer however look at the 'Creating a folder' section in the GDocs API.

I've been through the GDocs API, and believe that we've included enough in the SWORD spec to ensure that its use is not prevented (to the point of being explicitly mentioned). Also, I looked in the GData 3.0 spec for information on how to retrieve the feed of an object, rather than an entry, and it didn't say. I presumed content negotiation (hence section 6.8 of the sword profile), but perhaps you know where there's a formal definition of how this bit works - the GData documentation did seem a little bit all over the place?

I'll include some more details in the next version as to how to use SWORD extensions on any old URI, and that should hopefully be enough to combine it fully with GData. See my other email about the scope of sword, and let me know what you think.

Cheers,

Richard
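To make the point about POSTing a bare ZIP file concrete, here is a minimal sketch of such a request, in which the In-Progress hint can only travel as an HTTP header because there is no Atom Entry to carry it. The header name follows the wording used in this thread and may differ in the published profile; the collection IRI, the helper names and the placeholder payload are all hypothetical. The request is built but never sent.

```python
import urllib.request


def deposit_headers(in_progress: bool) -> dict:
    """Headers for a binary package deposit with the In-Progress hint."""
    return {
        "Content-Type": "application/zip",
        # Header name as discussed on the list (an assumption, not the final spec).
        "In-Progress": "true" if in_progress else "false",
    }


def build_deposit_request(collection_iri: str, package: bytes,
                          in_progress: bool) -> urllib.request.Request:
    """Build (but do not send) a POST of a ZIP package to a collection."""
    return urllib.request.Request(
        collection_iri,
        data=package,
        headers=deposit_headers(in_progress),
        method="POST",
    )


# Placeholder payload: the 22-byte end-of-central-directory record of an empty ZIP.
EMPTY_ZIP = b"PK\x05\x06" + b"\x00" * 18

req = build_deposit_request(
    "http://example.org/sword/collection",  # hypothetical collection IRI
    EMPTY_ZIP,
    in_progress=True,
)
```

Because the hint is only a header on this one request, the server stays free to ignore it, which matches the "light touch" intent described above.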
Re: [Sword-TAP] An example Implementation (VIDEO)
Hi Ian,

On 18/03/11 09:11, Ian Stuart wrote:
> On 17/03/11 19:37, Richard Jones wrote:
>>> In this second, expanded, view there are three things one needs to define within the discussion
>>>
>>> 1) What the singular Thing is: a (zip|xml|csv|xyz) file
>>> 2) What "standard" the manifest file is written to (METS, EPrintsXML, etc)
>>> 3) What "standard" the metadata file is written to (DC, QDC, MODS, EPrintsXML, BibTeX, etc...)
>>>
>>> Now, one could define a new identifier for each combination of manifest & metadata, or one can describe them separately
>>> One has great flexibility & expandability, the other uses fewer header fields.
>>
>> I think at the moment we're going for a URI which describes the combination. This is partly because /at the moment/ repositories tend to support a finite set of Packages which consist of their favourite metadata formats and their favourite structural manifests. So not all possible permutations of structural and descriptive data are currently likely. This may, of course, change.
>
> This is not a problem... we just need to be up-front about it.
>
> This means, for example, that "METSDSpaceSIP" actually means "Zip file, with binary files included (subdirectories allowed). Manifest called 'mets.xml', which contains the meta-data in epcdx format."
>
> I'm happy with that, but it does mean that this will spawn a LARGE number of distinct variations, each of which needs a unique IRI.
> (Again, not a problem: the package I develop for the OA-RJ broker has an IRI in my opendepot.org name-space)

Yup, that's exactly what it means. Otherwise we're into media feature sets, which we sort of discussed earlier on the list - seems way too complicated for the moment. I'm happy that people will mint IRIs for the formats that they use regularly.

Cheers,

Richard
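The "one IRI per combination" approach agreed in the message above can be pictured as a simple lookup table that a server or client keeps for the packages it understands. This is only an illustrative sketch: the IRIs, the `PackageFormat` type and the second registry entry are hypothetical, and only the METSDSpaceSIP description is taken from the thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PackageFormat:
    container: str  # the singular Thing, e.g. a zip file
    manifest: str   # standard the manifest is written to
    metadata: str   # standard the descriptive metadata is written to


# One IRI identifies one fixed combination of container, manifest and metadata.
# Both IRIs below are placeholders, not registered identifiers.
KNOWN_PACKAGES = {
    "http://example.org/package/METSDSpaceSIP":
        PackageFormat("zip", "METS (mets.xml)", "epcdx"),
    "http://example.org/package/EPrintsXMLZip":
        PackageFormat("zip", "EPrintsXML", "EPrintsXML"),
}


def describe(packaging_iri: str) -> Optional[PackageFormat]:
    """Return the combination an IRI stands for, or None if it is unknown."""
    return KNOWN_PACKAGES.get(packaging_iri)
```

The trade-off discussed above shows up directly: every new combination needs a new key in the table, but a receiver only ever has to compare one opaque identifier.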
Re: [Sword-TAP] My Thoughts
Hi Tim,

I believe that the SWORD spec already meets these requirements of yours:

> What I've been pushing for with SWORD is:
> 1) Re-use sensible paradigms from existing AtomPub profiles (CMIS/GData)

Well, I believe it explicitly doesn't get in the way of you using them. If you could verify that, though, it would be really helpful, as I'm not an expert on either CMIS or GData.

> 2) Behave like an AtomPub implementation (i.e. don't MUST something that isn't MUST in AtomPub)

This was an original aim of SWORD 2.0. See, for example:

http://sword2depositlifecycle.jiscpress.org/identifiers/

I believe that the profile as published in first draft meets this requirement, but again it would be good to have that verified in case I missed anything.

> 3) Modularise the spec. and features to enable flexibility

Also an original aim of SWORD 2.0, which is what led us towards the structure presented on the website:

http://swordapp.org/sword-v2/sword-v2-specifications/

There are 4 Internet Draft style documents which break the various parts of sword out into re-usable specifications, and a profile which draws them together into SWORD 2.0. Is this sufficiently modular, or did you have further modularisation in mind? I was thinking about modularising the profile itself, but it seemed correct to keep the whole CRUD stuff together. I imagined, for example, breaking authNZ out into a separate profile at some undetermined time in the future.

> I just echo what Dave Tarrant has said about talking in real-time with this. I can see Richard has injected ideas from various sources into the spec. but these ideas need to be thrashed out between multiple brains. I was initially thinking we want to add behavioural controls through headers whereas I'm more convinced now that these should be IRI parameters, which will make it more obvious that this IRI is going to behave differently to the base IRI.

Can you give us a few examples of what you imagined?

Cheers,

Richard
Re: [Sword-TAP] My Thoughts
Hi Scott,

>> I think someone earlier pointed out our methodology is RFC-orientated i.e. minimal, well-defined and, where relevant, re-using existing Internet RFCs. CMIS, by comparison, defines a full query language, ACLs, CMS-orientated APIs ...
>
> Exactly, which is why this new "package content editing" aspect of the spec concerns me.

Do you think you could expand on that a bit? In my mind, we're not doing any package content editing; we're just either adding new/more packages to the server (possibly to the same container) or overwriting old ones, not giving the client a way to edit the contents of the packages. The Statement, which might give this impression, is supposed to be informative rather than operational, except in cases where it is implemented as part of, say, GData. If some part of the profile is suggesting that packages can be edited by SWORD, that should definitely be clarified.

It certainly feels to me that the scope of SWORD is becoming clearer through this discussion - keep it up!

Cheers,

Richard

> I think to do this "right" you probably do need something like CMIS. I'm not convinced it's actually needed in SWORD at all, and goes against the principles of simplicity that made it successful.
>
> Remember why we are even talking about packages - not because people want to edit them, but because we have interop issues due to too many conflicting sector-specific package+metadata content types.
>
>> What I've been pushing for with SWORD is:
>> 1) Re-use sensible paradigms from existing AtomPub profiles (CMIS/GData)
>> 2) Behave like an AtomPub implementation (i.e. don't MUST something that isn't MUST in AtomPub)
>> 3) Modularise the spec. and features to enable flexibility
>
> +1
>
>>
>> I just echo what Dave Tarrant has said about talking in real-time with this. I can see Richard has injected ideas from various sources into the spec. but these ideas need to be thrashed out between multiple brains. I was initially thinking we want to add behavioural controls through headers whereas I'm more convinced now that these should be IRI parameters, which will make it more obvious that this IRI is going to behave differently to the base IRI.
>>
>> --
>> All the best,
>> Tim.
[Sword-TAP] planned changes to next spec draft
Hi Folks,

Just a quick summary of changes to be made to the profile in the next version. Did I miss anything?

1/ 6.6.2 and 6.6.3 are incorrect in their description of the usage of POST on the URIs. This needs to be clarified.

2/ Add a section describing how to use SWORD headers/techniques on any old URI (such as those provided by the Statement), so that it is clear how to integrate SWORD with GData (and CMIS). This section will be speculative, and will be a first attempt to articulate such an approach.

3/ Drop the 202 (Accepted) response code

4/ URI -> IRI

5/ Clearly articulate the need/use cases for Suppress-Metadata

6/ Clarify and correct the usage of the In-Progress header: it should only be applied to requests on the Edit-URI. Evaluate the need for any further In-Progress information returned in response to operations on other URIs or in the Deposit Receipt and propose any appropriate changes.

7/ Enumerate and demonstrate more atom:link@rel values for the extensions SWORD makes to AtomPub, rather than relying on clients to know when they are dealing with a pure AtomPub server and when a SWORD server.

Cheers,

Richard
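As a rough illustration of item 7 above, a client could inspect the atom:link rel values in an entry to find out which extensions a server advertises, instead of assuming it is talking to a SWORD server. The sample entry, the example.org rel IRI and the function names are hypothetical; the actual rel values are exactly what the next draft is meant to enumerate.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical entry; the statement rel IRI is a placeholder.
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <link rel="edit" href="http://example.org/sword/container/1"/>
  <link rel="edit-media" href="http://example.org/sword/container/1/media"/>
  <link rel="http://example.org/sword/terms/statement"
        href="http://example.org/sword/container/1/statement"/>
</entry>"""


def links_by_rel(entry_xml: str) -> dict:
    """Map each atom:link rel value to the list of hrefs that carry it."""
    root = ET.fromstring(entry_xml)
    links = {}
    for link in root.findall(ATOM_NS + "link"):
        # Atom treats a missing rel attribute as rel="alternate".
        rel = link.get("rel", "alternate")
        links.setdefault(rel, []).append(link.get("href"))
    return links


def advertised_extensions(entry_xml: str, known_rels: set) -> list:
    """Return the known extension rels that this entry actually advertises."""
    return sorted(set(links_by_rel(entry_xml)) & known_rels)
```

A plain AtomPub server would simply yield an empty list here, so the client can fall back to basic AtomPub behaviour without any special-case knowledge of the server.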