[Sword-TAP] SWORD v2 - final stages of the project

2012-03-17 Thread Stuart Lewis
Hi all,

We're now bringing the SWORD v2 project to a close - thanks to your
help, and the help of the whole advisory group, we have been able to
successfully develop the SWORD v2 standard.  This has now been
published, and implemented in several different repository platforms,
along with complementary client libraries and exemplars.

We hope to write a full press release in the coming days, part of
which will include a big nod towards this group and the role it
played.

Due to the administrative overhead of constantly having to decline
spam emails which are sent to this list, I am now going to close the
list.  We will move all further discussion of SWORD to the normal
sword-app-tech email list
(https://lists.sourceforge.net/lists/listinfo/sword-app-tech).

Many thanks and best wishes,


Stuart

___
Sword-app-techadvisorypanel mailing list
Sword-app-techadvisorypanel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sword-app-techadvisorypanel


[Sword-TAP] Fwd: content negotiating for package formats

2011-01-21 Thread Stuart Lewis
-- Forwarded message --
From: Ian Stuart ian.stu...@ed.ac.uk
Date: 11 January 2011 03:04
Subject: Re: content negotiating for package formats
To: techadvisorypa...@swordapp.org


We're looking at two things here, are we not?

1) we want the data returned in a specific media type (zip file, xml,
json, atom+xml, etc)
2) we want the content of that data to be encoded in a particular way
(METSDSpaceSIP, METSOARJ, ORE, RDFa, etc)

I read your email as wanting to combine the two of them in one http
header field.

The alternative is to use pragma fields in which case, you can
do what you like :D

On 07/01/11 17:36, Richard Jones wrote:

 Hi Folks,

 I'd be really interested in people's input on the following problem that
 I've come across in creating the first draft of the spec. It's to do
 with how one can content negotiate with a server for a particular
 package format.

 Allowing the Media Resource URI to abstractly refer to the contents of
 the resource on the sword server (as per the business case/technical
 design document) means that in order to specify what you want to get
 back from the server when requesting that resource may require content
 negotiation. Content negotiation uses the HTTP Accept- headers, and the
 main Accept header itself allows you to list mimetypes and your
 preferences for receiving them, but package formats aren't represented
 by mimetypes (for the most part).

 There are two ways that we might go about content negotiating for a
 format (such as the SWORD example format of METSDSpaceSIP) that I can
 see, and I'd like to solicit feedback:

 1/ Use the Accept-Encoding header in some way. This header allows you to
 do things like:

 Accept-Encoding: compress, gzip

 which seems to suggest that we could put in the package format like:

 Accept-Encoding: METSDSpaceSIP

 Does anyone have any experience with this header and could tell us
 whether this seems like a reasonable usage of it?

 2/ Define an extension to the application/zip mimetype which allows us
 to specify the package format as a parameter. Parameters are used in
 mimetypes to further refine their definition, as in:

 application/atom+xml;type=entry

 This is a valid mimetype, and the Atom spec defines the parameter type
 with possible values entry and feed so that you can more accurately
 identify the content of the thing you are getting back. Content
 negotiation explicitly allows for the use of parameters (although some
 of the details are a little unclear with regard to wildcards).

 So we could, for example, specify a parameter swordpackage which can
 take the URI of a package format, and construct mimetypes like

 application/zip;swordpackage=uri:METSDSpaceSIP

 (see http://tools.ietf.org/html/draft-ietf-atompub-typeparam-00)

 The questions here are: is this a legitimate extension/approach, would
 this break anything else on the web in general, and is it naive to
 assume that all packages have the top level mimetype of application/zip?


 There has also been some discussion about the OASIS CMIS standard, and I
 wonder if anyone is familiar enough with it to tell us how that
 community handles this kind of issue (if at all?).

 Cheers,

 Richard
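Editorial note on option 2/ above: it relies on a media type's parameters being separable from the base type. A minimal Python sketch of that split follows. parse_media_type is a hypothetical helper written for this note, not part of any SWORD library, and it assumes no ";" appears inside a quoted parameter value.

```python
def parse_media_type(value):
    """Split 'type/subtype;name=value' into (media_type, params)."""
    parts = [p.strip() for p in value.split(";")]
    media_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if not part:
            continue  # tolerate a trailing ';'
        name, _, val = part.partition("=")
        params[name.strip().lower()] = val.strip().strip('"')
    return media_type, params


# The registered Atom parameter, then the proposed (unregistered) one.
print(parse_media_type("application/atom+xml;type=entry"))
print(parse_media_type("application/zip;swordpackage=uri:METSDSpaceSIP"))
```

The parser treats both parameters alike; whether swordpackage is a legitimate parameter for application/zip is a registration question, not a parsing one.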





--

Ian Stuart.
Developer: Open Access Repository Junction and OpenDepot.org
Bibliographics and Multimedia Service Delivery team,
EDINA,
The University of Edinburgh.

http://edina.ac.uk/

This email was sent via the University of Edinburgh.

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



[Sword-TAP] Fwd: content negotiating for package formats

2011-01-21 Thread Stuart Lewis
-- Forwarded message --
From: David Tarrant d...@ecs.soton.ac.uk
Date: 11 January 2011 03:20
Subject: Re: content negotiating for package formats
To: Ian Stuart ian.stu...@ed.ac.uk
Cc: techadvisorypa...@swordapp.org


I agree with Ian: why can't we just use the existing x-packaging header,
cos that's how point (2) works in the current sword?

Dave T

On 10 Jan 2011, at 14:04, Ian Stuart wrote:

 We're looking at two things here, are we not?

 1) we want the data returned in a specific media type (zip file, xml, json,
 atom+xml, etc)
 2) we want the content of that data to be encoded in a particular way 
 (METSDSpaceSIP, METSOARJ, ORE, RDFa, etc)

 I read your email as wanting to combine the two of them in one http header 
 field.

 The alternative is to use pragma fields in which case, you can do what 
 you like :D

 On 07/01/11 17:36, Richard Jones wrote:
 Hi Folks,

 I'd be really interested in people's input on the following problem that
 I've come across in creating the first draft of the spec. It's to do
 with how one can content negotiate with a server for a particular
 package format.

 Allowing the Media Resource URI to abstractly refer to the contents of
 the resource on the sword server (as per the business case/technical
 design document) means that in order to specify what you want to get
 back from the server when requesting that resource may require content
 negotiation. Content negotiation uses the HTTP Accept- headers, and the
 main Accept header itself allows you to list mimetypes and your
 preferences for receiving them, but package formats aren't represented
 by mimetypes (for the most part).

 There are two ways that we might go about content negotiating for a
 format (such as the SWORD example format of METSDSpaceSIP) that I can
 see, and I'd like to solicit feedback:

 1/ Use the Accept-Encoding header in some way. This header allows you to
 do things like:

 Accept-Encoding: compress, gzip

 which seems to suggest that we could put in the package format like:

 Accept-Encoding: METSDSpaceSIP

 Does anyone have any experience with this header and could tell us
 whether this seems like a reasonable usage of it?

 2/ Define an extension to the application/zip mimetype which allows us
 to specify the package format as a parameter. Parameters are used in
 mimetypes to further refine their definition, as in:

 application/atom+xml;type=entry

 This is a valid mimetype, and the Atom spec defines the parameter type
 with possible values entry and feed so that you can more accurately
 identify the content of the thing you are getting back. Content
 negotiation explicitly allows for the use of parameters (although some
 of the details are a little unclear with regard to wildcards).

 So we could, for example, specify a parameter swordpackage which can
 take the URI of a package format, and construct mimetypes like

 application/zip;swordpackage=uri:METSDSpaceSIP

 (see http://tools.ietf.org/html/draft-ietf-atompub-typeparam-00)

 The questions here are: is this a legitimate extension/approach, would
 this break anything else on the web in general, and is it naive to
 assume that all packages have the top level mimetype of application/zip?


 There has also been some discussion about the OASIS CMIS standard, and I
 wonder if anyone is familiar enough with it to tell us how that
 community handles this kind of issue (if at all?).

 Cheers,

 Richard





 --

 Ian Stuart.
 Developer: Open Access Repository Junction and OpenDepot.org
 Bibliographics and Multimedia Service Delivery team,
 EDINA,
 The University of Edinburgh.

 http://edina.ac.uk/

 This email was sent via the University of Edinburgh.

 The University of Edinburgh is a charitable body, registered in
 Scotland, with registration number SC005336.




[Sword-TAP] Fwd: content negotiating for package formats

2011-01-21 Thread Stuart Lewis
-- Forwarded message --
From: Scott Wilson scott.bradley.wil...@gmail.com
Date: 11 January 2011 03:55
Subject: Re: content negotiating for package formats
To: techadvisorypa...@swordapp.org


To answer the CMIS question - AFAIK CMIS doesn't explicitly support
external packaging formats (in its scope it declares that compound and
virtual objects are extended features); instead it directly uses
Atom's collection handling. So a CMIS client would create the Folder
object and then POST each enclosed item to it, rather than POST a zip
file and rely on the repository to unpackage and store it as some sort
of composite object.

There is a line in the CMIS charter setting it as a secondary
priority, so it may become part of CMIS in the future.

Packaging is also used for supporting alternative renditions of a
resource - and CMIS supports this explicitly - see renditions in the
OASIS CMIS spec:

http://docs.oasis-open.org/cmis/CMIS/v1.0/os/cmis-spec-v1.0.html#_Toc243905395

... however support is currently limited to retrieving renditions, and
the spec doesn't specify how to create a document with multiple
renditions.

This could be a good topic on which to link up with the OASIS CMIS TC.

S


On 10 Jan 2011, at 14:20, David Tarrant wrote:

 I agree with Ian: why can't we just use the existing x-packaging header, cos
 that's how point (2) works in the current sword?

 Dave T

 On 10 Jan 2011, at 14:04, Ian Stuart wrote:

 We're looking at two things here, are we not?

 1) we want the data returned in a specific media type (zip file, xml, json,
 atom+xml, etc)
 2) we want the content of that data to be encoded in a particular way 
 (METSDSpaceSIP, METSOARJ, ORE, RDFa, etc)

 I read your email as wanting to combine the two of them in one http header 
 field.

 The alternative is to use pragma fields in which case, you can do what 
 you like :D

 On 07/01/11 17:36, Richard Jones wrote:
 Hi Folks,

 I'd be really interested in people's input on the following problem that
 I've come across in creating the first draft of the spec. It's to do
 with how one can content negotiate with a server for a particular
 package format.

 Allowing the Media Resource URI to abstractly refer to the contents of
 the resource on the sword server (as per the business case/technical
 design document) means that in order to specify what you want to get
 back from the server when requesting that resource may require content
 negotiation. Content negotiation uses the HTTP Accept- headers, and the
 main Accept header itself allows you to list mimetypes and your
 preferences for receiving them, but package formats aren't represented
 by mimetypes (for the most part).

 There are two ways that we might go about content negotiating for a
 format (such as the SWORD example format of METSDSpaceSIP) that I can
 see, and I'd like to solicit feedback:

 1/ Use the Accept-Encoding header in some way. This header allows you to
 do things like:

 Accept-Encoding: compress, gzip

 which seems to suggest that we could put in the package format like:

 Accept-Encoding: METSDSpaceSIP

 Does anyone have any experience with this header and could tell us
 whether this seems like a reasonable usage of it?

 2/ Define an extension to the application/zip mimetype which allows us
 to specify the package format as a parameter. Parameters are used in
 mimetypes to further refine their definition, as in:

 application/atom+xml;type=entry

 This is a valid mimetype, and the Atom spec defines the parameter type
 with possible values entry and feed so that you can more accurately
 identify the content of the thing you are getting back. Content
 negotiation explicitly allows for the use of parameters (although some
 of the details are a little unclear with regard to wildcards).

 So we could, for example, specify a parameter swordpackage which can
 take the URI of a package format, and construct mimetypes like

 application/zip;swordpackage=uri:METSDSpaceSIP

 (see http://tools.ietf.org/html/draft-ietf-atompub-typeparam-00)

 The questions here are: is this a legitimate extension/approach, would
 this break anything else on the web in general, and is it naive to
 assume that all packages have the top level mimetype of application/zip?


 There has also been some discussion about the OASIS CMIS standard, and I
 wonder if anyone is familiar enough with it to tell us how that
 community handles this kind of issue (if at all?).

 Cheers,

 Richard





 --

 Ian Stuart.
 Developer: Open Access Repository Junction and OpenDepot.org
 Bibliographics and Multimedia Service Delivery team,
 EDINA,
 The University of Edinburgh.

 http://edina.ac.uk/

 This email was sent via the University of Edinburgh.

 The University of Edinburgh is a charitable body, registered in
 Scotland, with registration number SC005336.




[Sword-TAP] Fwd: content negotiating for package formats

2011-01-21 Thread Stuart Lewis
-- Forwarded message --
From: Ian Stuart ian.stu...@ed.ac.uk
Date: 11 January 2011 04:01
Subject: Re: content negotiating for package formats
To: techadvisorypa...@swordapp.org


More specifically, the Open Access Repository Junction Discovery
APIs use the Accept header to determine the content type of the
returned data, and it would be sensible to remain consistent:

Accept defines the mime-type (application/xml, text/plain, etc...)
Pragma:X-Packaging then defines the package format for the content
(METSDSpaceSIP, METSOARJ, ORE, RDFa, etc)

 I suggest Pragma:X-Packaging, as one currently uses an
'X-Packaging' element in the http request header object when *posting*
a SWORD deposit to define the packaging type of the content.
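
Editorial note: a small Python sketch of the split suggested above, with Accept carrying the media type and a second header carrying the package format. It reads "Pragma:X-Packaging" as a separate request header named X-Packaging, which is an assumption about the intent; retrieval_headers is a hypothetical helper, not an existing client API.

```python
def retrieval_headers(media_type, packaging):
    """Build request headers that keep the two concerns separate."""
    return {
        "Accept": media_type,       # wire format, e.g. application/zip
        "X-Packaging": packaging,   # package format, e.g. METSDSpaceSIP
    }


headers = retrieval_headers("application/zip", "METSDSpaceSIP")
for name, value in headers.items():
    print(f"{name}: {value}")
```

The resulting dict could be handed to whichever HTTP client is in use; nothing here depends on a particular library.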

--

Ian Stuart.
Developer: Open Access Repository Junction and OpenDepot.org
Bibliographics and Multimedia Service Delivery team,
EDINA,
The University of Edinburgh.

http://edina.ac.uk/

This email was sent via the University of Edinburgh.

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



[Sword-TAP] Fwd: content negotiating for package formats

2011-01-21 Thread Stuart Lewis
-- Forwarded message --
From: Richard Jones rich...@oneoverzero.com
Date: 12 January 2011 09:52
Subject: Re: content negotiating for package formats
To: Graham Klyne g...@ninebynine.org
Cc: techadvisorypa...@swordapp.org


Hi Graham,

On 11/01/11 19:05, Graham Klyne wrote:

 Richard - a small thing: rather than using the X- header convention,
 just pick a suitable name and request provisional registration via IANA,
 per [1]. This avoids messing around if the new header goes standards
 track. And, anyway, X- headers don't have the same reserved for
 experimentation status that applies for email headers.

 [1] http://tools.ietf.org/html/rfc3864

That's an interesting idea.  Shouldn't we do this with all SWORD
headers, then, though?

Probably this is a decision that we need to make as a project as to
whether we go right away towards something which can be put onto a
real standards track in the future, or whether we stick with X-
headers.

I'm in favour of being able to go standards track, but there are a few
things which nag at me, including:

1/ the amount of time required to do the work to even get to provisional
registration of the headers

2/ backwards compatibility with SWORD 1.0.  If we drop, say,
X-On-Behalf-Of and go for just On-Behalf-Of we'd be breaking the back
compat or placing the onus on the server to interpret both headers.
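
Editorial note: the "onus on the server to interpret both headers" in point 2/ could look roughly like the Python sketch below. get_sword_header is a hypothetical helper, and preferring the unprefixed name when both forms are present is an assumption, not something the thread decided.

```python
def get_sword_header(headers, name):
    """Return the value of `name`, falling back to the legacy 'X-' form."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {k.lower(): v for k, v in headers.items()}
    for candidate in (name, "X-" + name):
        value = lowered.get(candidate.lower())
        if value is not None:
            return value
    return None


# A SWORD 1.0 style request and a request using the unprefixed name.
print(get_sword_header({"X-On-Behalf-Of": "alice"}, "On-Behalf-Of"))
print(get_sword_header({"On-Behalf-Of": "bob"}, "On-Behalf-Of"))
```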

What do you think?

 I'm also wondering about your combination of content-type and internal
 packaging format. The media feature description framework [2] was
 intended to capture this kind of combination of features in a more
 structured fashion. Thus, I would imagine something like:

 Accept-media-features:
 (| ( type=application/zip package=METSDSpaceSIP);q=1.0
 ( type=application/atom+xml atomtype=entry package=AtomSIP);q=0.8 )

 This would require IANA registration of the new header field, and new
 media features called package and atomtype, per [4]. Feature type
 is already registered [3].

 [2] http://tools.ietf.org/html/rfc2533

 [3] http://tools.ietf.org/html/rfc2913

 [4] http://tools.ietf.org/html/rfc2506

This seems like the proper way to do it, and just comes with the
same caveats as above.  What is the status of this RFC?  Are we safe
to go ahead and use it?  (incidentally, we wouldn't need to register
atomtype, as the param type=entry is part of the mimetype already).

I'd be interested in the rest of the advisory panel's and project
team's opinions on this, as this may well define the way that I work
for the duration of the project.

Cheers,

Richard


 Richard Jones wrote:

 Hi Folks,

 Thanks, this is really great stuff.

 On 10/01/11 16:05, Robert D. Sanderson wrote:

 On 7 January 2011 17:36, Richard Jones rich...@oneoverzero.com wrote:

 2/ Define an extension to the application/zip mimetype which allows us
 to specify the package format as a parameter. So we could, for example,
 specify a parameter swordpackage which can take the URI of a package
 format, and construct mimetypes like

 application/zip;swordpackage=uri:METSDSpaceSIP
 (see http://tools.ietf.org/html/draft-ietf-atompub-typeparam-00)

 The questions here are: is this a legitimate extension/approach, would
 this break anything else on the web in general, and is it naive to
 assume that all packages have the top level mimetype of application/zip?


 First of all, no, it's not legitimate to invent new parameters for
 existing mime types.

 RFC 2048, Section 2.2.3
 ... the names, values, and meanings of any parameters must
 be fully specified when a media type is registered in the IETF tree ...

 http://www.faqs.org/rfcs/rfc2048.html

 So it's not legal to create a parameter swordpackage and attach it to
 the existing application/zip.

 Ok, it sounds like this option then is simply out altogether.

 More generally, the HTTP specification defines the accept headers as:
 http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html

 Note that the extension parameter here is for the header, not the
 mimetype. The BNF allows the accept-extension rule ONLY after the
 mandatory q value in accept-params.

 I have to confess to having overlooked the accept-extension rule, as
 there wasn't an example of usage in that document. Can you give us an
 example as to how that is used?

 Which means using Accept-Encoding in this way is potentially
 problematic, but Accept does have provision that would make this use
 legitimate.

 Like mime types, content encodings also have a registry.
 See: http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html section 3.5

 Basically, there are two routes to avoid breaking the rules, neither
 easy:

 1. Register new Mime Types for every packaging format.

 2. Use an x- header and eventually write an RFC to standardize it.

 We looked at this in both SRU (eg what it would take to have a wrapper
 format and an internal format: SRU vs Atom, wrapping Simple DC vs METS)
 and in conjunction with the digital format registry for preservation
 purposes.

 So my 

[Sword-TAP] Fwd: content negotiating for package formats

2011-01-21 Thread Stuart Lewis
-- Forwarded message --
From: Ian Stuart ian.stu...@ed.ac.uk
Date: 19 January 2011 01:11
Subject: Re: content negotiating for package formats
To: techadvisorypa...@swordapp.org


On 10/01/11 18:49, Richard Jones wrote:

 It's looking like a separate header is the way to do this, with the
 following couple of options immediately standing out:

 Accept-Features (or X-Accept-Features if it isn't sufficiently official)
 X-Packaging
 X-Accept-Packaging (which I just made up for the purposes of this
 discussion)

 Some comments on these:

 Accept-Features
 Having looked at the document [1] (thanks Graham (K)) it looks like it
 would give us the leeway that we need to describe requirements while
 ensuring that Graham (T)'s concerns (which I share) about matching up
 package format requirements with mimetypes would be dealt with. On the
 other hand, this document is 12/13 years old and the header has not made
 it into the HTTP content negotiation documentation and is significantly
 different in format to all the other Accept- headers. It could also be a
 substantial effort for servers to implement the full requirements of
 this header.

 X-Packaging
 I'm against using this in this way as it is already used to alert the
 server during POST as to the package format that is being supplied. The
 format of the header for content negotiation would have to be totally
 different to this usage: a list of package formats and q values for
 example, rather than a single definitive URI. I see scope for confusion.

 X-Accept-Packaging
 Given my concerns about X-Packaging and the comments above about
 Accept-Feature, perhaps there is a middle ground that we can define
 which does something more minimal with just mimetypes, package formats
 and q values in a way similar to having a mimetype that has added
 parameters.

 For example:
 Accept: application/zip; q=1.0, application/atom+xml;type=entry;q=0.8

 X-Accept-Packaging: application/zip;{package=METSDSpaceSIP};q=1.0,
 application/atom+xml;type=entry;{package=AtomSIP};q=0.8

 Or some other suitably neat and unambiguous serialisation which is in
 line with how the other Accept- headers work and also gives us the
 information we want in a totally definitive mimetype-package format
 way. This could be supplied alongside the usual Accept header so that
 clients which can't generate the X-Accept-Packaging header can fall back
 easily to the usual content negotiation route.
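
Editorial note: the X-Accept-Packaging serialisation in the example above was only a proposal in this thread, so the Python sketch below is one possible reading of it rather than a specified format. parse_accept_packaging is a hypothetical helper. It assumes entries are comma-separated, fields are semicolon-separated, the package sits in a {package=...} field, and a missing q defaults to 1.0.

```python
def parse_accept_packaging(header):
    """Return [(media_type, package, q), ...] ordered by descending q."""
    prefs = []
    for item in header.split(","):
        fields = [f.strip() for f in item.split(";") if f.strip()]
        if not fields:
            continue
        media_type, package, q = fields[0], None, 1.0
        for field in fields[1:]:
            if field.startswith("{") and field.endswith("}"):
                name, _, value = field[1:-1].partition("=")
                if name.strip() == "package":
                    package = value.strip()
            elif field.startswith("q="):
                q = float(field[2:])
            else:
                # Ordinary media type parameter, e.g. type=entry.
                media_type += ";" + field
        prefs.append((media_type, package, q))
    # sorted() is stable, so entries with equal q keep their listed order.
    return sorted(prefs, key=lambda p: p[2], reverse=True)


header = ("application/zip;{package=METSDSpaceSIP};q=1.0,\n"
          " application/atom+xml;type=entry;{package=AtomSIP};q=0.8")
for pref in parse_accept_packaging(header):
    print(pref)
```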

I'm still unclear why there is a need to combine the content type
(application/zip; q=1.0) with the data encoding (METSDSpaceSIP;
q=1.0)

Can't you say (1) I only deal in .tgz content, and (2) you can
package whatevers within that content as 'Foo', 'Bar', or even
'Acme::WhiteSpaceEncoded'

--

Ian Stuart.
Developer: Open Access Repository Junction and OpenDepot.org
Bibliographics and Multimedia Service Delivery team,
EDINA,
The University of Edinburgh.

http://edina.ac.uk/

This email was sent via the University of Edinburgh.

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



[Sword-TAP] Fwd: content negotiating for package formats

2011-01-21 Thread Stuart Lewis
-- Forwarded message --
From: Richard Jones rich...@oneoverzero.com
Date: 19 January 2011 21:06
Subject: Re: content negotiating for package formats
To: Ian Stuart ian.stu...@ed.ac.uk
Cc: techadvisorypa...@swordapp.org


Hi Ian,

On 18/01/11 12:11, Ian Stuart wrote:

 On 10/01/11 18:49, Richard Jones wrote:

 It's looking like a separate header is the way to do this, with the
 following couple of options immediately standing out:

 Accept-Features (or X-Accept-Features if it isn't sufficiently official)
 X-Packaging
 X-Accept-Packaging (which I just made up for the purposes of this
 discussion)

 Some comments on these:

 Accept-Features
 Having looked at the document [1] (thanks Graham (K)) it looks like it
 would give us the leeway that we need to describe requirements while
 ensuring that Graham (T)'s concerns (which I share) about matching up
 package format requirements with mimetypes would be dealt with. On the
 other hand, this document is 12/13 years old and the header has not made
 it into the HTTP content negotiation documentation and is significantly
 different in format to all the other Accept- headers. It could also be a
 substantial effort for servers to implement the full requirements of
 this header.

 X-Packaging
 I'm against using this in this way as it is already used to alert the
 server during POST as to the package format that is being supplied. The
 format of the header for content negotiation would have to be totally
 different to this usage: a list of package formats and q values for
 example, rather than a single definitive URI. I see scope for confusion.

 X-Accept-Packaging
 Given my concerns about X-Packaging and the comments above about
 Accept-Feature, perhaps there is a middle ground that we can define
 which does something more minimal with just mimetypes, package formats
 and q values in a way similar to having a mimetype that has added
 parameters.

 For example:
 Accept: application/zip; q=1.0, application/atom+xml;type=entry;q=0.8

 X-Accept-Packaging: application/zip;{package=METSDSpaceSIP};q=1.0,
 application/atom+xml;type=entry;{package=AtomSIP};q=0.8

 Or some other suitably neat and unambiguous serialisation which is in
 line with how the other Accept- headers work and also gives us the
 information we want in a totally definitive mimetype-package format
 way. This could be supplied alongside the usual Accept header so that
 clients which can't generate the X-Accept-Packaging header can fall back
 easily to the usual content negotiation route.

 I'm still unclear why there is a need to combine the content type
 (application/zip; q=1.0) with the data encoding (METSDSpaceSIP; q=1.0)

 Can't you say (1) I only deal in .tgz content, and (2) you can package
 whatevers within that content as 'Foo', 'Bar', or even
 'Acme::WhiteSpaceEncoded'

I think that the problem is that you can't guarantee that the list of
content types and the list of packaging types are combinable in a
meaningful way; Graham T's email had an example.

So suppose a server can give you content type A with packaging formats
X and Y, or content type B with packaging format Z:

A + (X or Y)
B + Z

and your content negotiation header says:

Accept: A; q=1.0, B; q=0.8
Accept-Packaging: Z; q=1.0, X; q=1.0

Which combination do you return?

On the other hand, this is a general problem, and even the Media
Feature syntax that Graham K describes in his RFC acknowledges this: it
effectively limits the use of q values to top-level feature sets.
So, you would be limited to content negotiating for:

Accept-Media-Feature: A(X), B(Z), A(Y)

for example; i.e. explicitly declaring your preference of the
combination of content-type and packaging format.
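
Editorial note: with explicit combinations, server-side selection reduces to "first listed combination that is available". The Python sketch below uses the placeholder names A, B, X, Y, Z from this message. The "A(X)" notation is shorthand from the thread, not real Media Feature syntax, and parse_combinations and choose are hypothetical helpers.

```python
import re


def parse_combinations(header):
    """Parse 'A(X), B(Z)' into [('A', 'X'), ('B', 'Z')], keeping order."""
    return re.findall(r"([^,()\s]+)\(([^()]*)\)", header)


def choose(header, available):
    """Return the first requested (type, packaging) pair the server offers."""
    for combo in parse_combinations(header):
        if combo in available:
            return combo
    return None


# The server from the example above: A + (X or Y), B + Z.
available = {("A", "X"), ("A", "Y"), ("B", "Z")}
print(choose("A(X), B(Z), A(Y)", available))
```

Because the client states each pairing itself, the server never has to guess how two independent preference lists combine.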

I've spent the last 3 or 4 days looking at the Media Feature stuff in
detail, and I have to confess it does feel like a sledgehammer to
crack a nut.  At the moment I'm playing with specifying a restricted
version of it to see if we can get the effect that we want without the
huge overhead of a full implementation.

As a consequence, I'm still open to Ian's suggested approach here,
provided that we can decide a) what the new HTTP header should be
called, and b) what the rules for resolving content negotiation
ambiguities as shown above should be.

Cheers,

Richard



[Sword-TAP] Fwd: content negotiating for package formats

2011-01-21 Thread Stuart Lewis
-- Forwarded message --
From: Ian Stuart ian.stu...@ed.ac.uk
Date: 19 January 2011 21:44
Subject: Re: content negotiating for package formats
To: techadvisorypa...@swordapp.org


On 19/01/11 08:06, Richard Jones wrote:

 and your content negotiation header says:

 Accept: A; q=1.0, B; q=0.8
 Accept-Packaging: Z; q=1.0, X; q=1.0

 Which combination do you return?

I agree it's not a clear-cut case; however, it also has to be said
that the content negotiation header isn't helping by declaring two
options to be of equal value.

I would say:
A+Z, then A+X, then B+Z, then B+X

 FIFO-style

(after all, if you have a preference for order, try:

 Accept-Packaging: X; q=1.0, Z; q=1.0

instead)
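
Editorial note: the ordering suggested above can be written down as a rule, sketched here in Python. Content type is the outer loop and so outranks packaging; within each header, higher q wins and ties keep the order in which they were listed. ranked and negotiate are hypothetical helpers, and the availability set is Richard's example server (A with X or Y, B with Z).

```python
from itertools import product


def ranked(prefs):
    """Order names by descending q; ties keep the order they were listed in."""
    return [name for name, q in sorted(prefs, key=lambda p: -p[1])]


def negotiate(accept, packaging, available):
    """Return the first (type, packaging) pair, in preference order, on offer."""
    # product() varies the packaging fastest, so content type outranks it.
    for combo in product(ranked(accept), ranked(packaging)):
        if combo in available:
            return combo
    return None


accept = [("A", 1.0), ("B", 0.8)]      # Accept: A; q=1.0, B; q=0.8
packaging = [("Z", 1.0), ("X", 1.0)]   # Accept-Packaging: Z; q=1.0, X; q=1.0
available = {("A", "X"), ("A", "Y"), ("B", "Z")}

print(list(product(ranked(accept), ranked(packaging))))
print(negotiate(accept, packaging, available))
```

Under this rule the server skips A+Z, which it cannot produce, and returns A+X even though Z was listed first among the packaging formats.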


Even under the concept of a combined Accept model, what do you do when
you receive

 Accept-Media-Feature: A(Z); q=1.0, A(X); q=1.0, B(Z); q=0.8, B(X); q=0.8


--

Ian Stuart.
Developer: Open Access Repository Junction and OpenDepot.org
Bibliographics and Multimedia Service Delivery team,
EDINA,
The University of Edinburgh.

http://edina.ac.uk/

This email was sent via the University of Edinburgh.

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



[Sword-TAP] Fwd: content negotiating for package formats

2011-01-21 Thread Stuart Lewis
-- Forwarded message --
From: Ian Stuart ian.stu...@ed.ac.uk
Date: 19 January 2011 23:33
Subject: Re: content negotiating for package formats
To: Scott Wilson scott.bradley.wil...@gmail.com
Cc: techadvisorypa...@swordapp.org


On 19/01/11 10:16, Scott Wilson wrote:

 I have an excellent content package that will either work with
 binary data directly included or passed-by-reference, and I am
 working on Importers for DSpace & EPrints as we speak... as outputs
 of the OA-RJ Broker work.

 I suggest we standardise on a really crappy packaging format with
 almost zero features, combined with a totally inadequate metadata
 schema. At least that way it might actually work.*

Oh, so you've seen my work then ;-)


--

Ian Stuart.
Developer: Open Access Repository Junction and OpenDepot.org
Bibliographics and Multimedia Service Delivery team,
EDINA,
The University of Edinburgh.

http://edina.ac.uk/

This email was sent via the University of Edinburgh.

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



[Sword-TAP] Fwd: content negotiating for package formats

2011-01-21 Thread Stuart Lewis
-- Forwarded message --
From: Robert D. Sanderson rsander...@lanl.gov
Date: 20 January 2011 04:53
Subject: Re: content negotiating for package formats
To: techadvisorypa...@swordapp.org


 I don't think you can ever get away from a degree of content negotiation,
 but it doesn't necessarily need to be as complex as the scenarios outlined
 depending on what agreements you can have for common formats in common
 cases.

I agree with Graham and Scott.  Please can we step back and clearly and
completely define the scope of the issue.

My understanding is as follows:

There are two dimensions of content type -- packaging (eg zip, bagit, tar,
mets,...) and metadata (dc, mets, ore, ead, ...) and it is not feasible to
specify all of the combinations as unique keys.
These combinations are used to both deposit and retrieve content packages.


Client to Server (Deposit):

* There is a recommended packaging format (Zip + DC in Atom)
* There is a header to inform the server which packaging format is
actually being sent, as well as the metadata format
* There is information available as to the types of format the server will
accept (in the Svc Desc)

Server to Client (Retrieve):

* There SHOULD be the same recommended packaging format.  If a client can
construct it, then it can likely read it back again.  Requiring a
different format just doubles the work.
* There is a header to inform the client which packaging format is being
sent, as well as the metadata format
* There is information available as to the types of format the server can
send (in Svc Desc) which may not be the same as the set of formats it will
accept for deposit.
* There SHOULD be a header to request a particular format.


Now, call me a heretic, but I think Packaging is the wrong way round.  The
outermost layer is the top level content type, and hence should be in the
Accept and Content-Type headers.  If you download a zip file containing 5
plain text files, Content-Type is application/zip.

So what is really needed is Accept-Metadata and Content-Metadata, to
request the format of the metadata in the response package.  If the server
can't deliver the combination, then it won't do that.  The same way that
if you ask for Accept-Language: fr and the server can't deliver French
then it won't.
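
A minimal sketch of how a server might treat this pair of headers, in
Python. The AVAILABLE table, the function name and the choice to answer
406 when no combination matches are all assumptions made for the example;
a server could equally fall back to a default representation, as is
common with Accept-Language.

```python
# Hypothetical table of what one server can deliver:
# (outer content type, metadata format inside the package).
AVAILABLE = {
    ("application/zip", "dc"),
    ("application/zip", "mets"),
    ("application/x-tar", "dc"),
}


def negotiate(accept, accept_metadata, available=AVAILABLE):
    """Pick a response for a retrieve request.

    `accept` and `accept_metadata` are lists of tokens, most preferred
    first. Returns (status, headers).
    """
    for content_type in accept:
        for metadata in accept_metadata:
            if (content_type, metadata) in available:
                return 200, {
                    "Content-Type": content_type,
                    "Content-Metadata": metadata,
                }
    # Assumption: no deliverable combination means 406 Not Acceptable.
    return 406, {}


print(negotiate(["application/zip"], ["ore", "mets"]))
print(negotiate(["application/x-tar"], ["mets"]))
```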

I disagree with Ed that there MUST be only a single format.  That's never
going to fly with current workflows, and would result in the
interoperability of OAI-PMH's use of simple DC ... to wit, that there is
syntactic interoperability, but the semantics are completely worthless due
to people stuffing random content into fields because they can't say what
they really mean.  The Linked Data effort has shown us that even in an
open world of infinite vocabularies, people *will* self-organize into
support of useful relationships, and the same should apply here.

Rob



[Sword-TAP] Fwd: content negotiating for package formats

2011-01-21 Thread Stuart Lewis
-- Forwarded message --
From: Richard Jones rich...@oneoverzero.com
Date: 20 January 2011 06:26
Subject: Re: content negotiating for package formats
To: Ed Summers e...@pobox.com
Cc: techadvisorypa...@swordapp.org


Hi Ed,

On 19/01/11 13:27, Ed Summers wrote:

 On Wed, Jan 19, 2011 at 6:28 AM, Richard Jones rich...@oneoverzero.com wrote:

 We've had a few discussions in the past about supporting some formats, and
 they always end up pretty divisive.  So SWORD is aiming to be totally
 agnostic on the point, but it does need to provide the client and server a
 mechanism to negotiate over what format they are interchanging.  If we can
 achieve that, that will be relatively useful in an interoperability setting,
 I think, particularly as many SWORD servers (particularly repositories) are
 able to create dissemination packages in a large number of formats (see
 EPrints export plugins for example).

 Would it be too restrictive to require SWORD collections to only
 support one package format? This would mean that there MUST be
 one and only one sword:acceptPackaging per app:collection in the
 service document. I think this would simplify matters significantly
 for implementors since:

 1) there would no longer be any need for the q attribute on
 sword:acceptPackaging, and the requirement to interpret them.
 2) X-Packaging header registration, and the need for clients to send
 it would go away
 3) a client could only retrieve a package in the format it was
 deposited in. Does anyone really have an appetite for dynamically
 rewriting package formats as part of an HTTP request cycle?

I think that would be overly restrictive, while the above gains are
relatively minimal in the scheme of things.

Also, as per my earlier comment about export plugins in EPrints, you
could easily imagine throwing a known package format into the
repository, and then asking it to give you it back in a variety of
formats that you don't know how to generate yourself.

 I guess a better question is: do we have many SWORD implementations
 that support POSTing multiple package flavors to the same collection?

 Also, I am -1 on SWORD requiring a standard package format. I think it
 would be fine to list some preferences, and light-weight, community
 driven mechanisms for identifying them, but that's as far as SWORD
 should go IMHO.

It's been a long standing complaint against SWORD that despite it
being an interoperability standard, you can't even deposit the same
package into DSpace, EPrints and Fedora, let alone other
implementations that weren't funded as part of the original project.
From both a practical point of view and a community perception point
of view this has to be addressed.

We have tried to work around the standard package format issue by
adopting an Atom Multipart [1] deposit of an Atom Entry with optional
embedded metadata plus a binary payload which may either be a single
file or if given the Content-Type of application/zip a plain old zip
file with no prescribed internal content.  This ought to be
satisfactory because it is not only about the most simple format you
could come up with, but it also leverages the existing semantics of
AtomPub, so adding it is of minimal effort.
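
For anyone who wants to see the shape of such a request body, here is a
small Python sketch using only the standard library. It assumes the
deposit is a multipart/related message whose first part is the Atom Entry
and whose second part is the zip payload. The entry contents and helper
names are placeholders, and details such as per-part headers are left out
because they are not settled in this thread.

```python
import io
import zipfile
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart

# Placeholder Atom Entry, for illustration only.
ATOM_ENTRY = """<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Example deposit</title>
  <id>urn:uuid:00000000-0000-0000-0000-000000000000</id>
  <updated>2011-01-20T00:00:00Z</updated>
  <author><name>Example Author</name></author>
  <summary>Placeholder entry.</summary>
</entry>
"""


def build_zip(files):
    """Build a plain zip (no prescribed internal layout) from {name: data}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def build_multipart_deposit(entry_xml, payload, payload_subtype="zip"):
    """Wrap an Atom Entry and a binary payload in one multipart/related body."""
    body = MIMEMultipart("related", type="application/atom+xml")
    # MIMEApplication base64-encodes its data by default.
    body.attach(MIMEApplication(entry_xml.encode("utf-8"), "atom+xml"))
    body.attach(MIMEApplication(payload, payload_subtype))
    return body


deposit = build_multipart_deposit(ATOM_ENTRY, build_zip({"a.txt": "hello"}))
print(deposit.get_content_type())
```

A single non-zip file would go in the second part in the same way, with
its own subtype passed as payload_subtype.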

[1] http://tools.ietf.org/html/draft-gregorio-atompub-multipart-04

Cheers,

Richard



[Sword-TAP] Fwd: content negotiating for package formats

2011-01-21 Thread Stuart Lewis
-- Forwarded message --
From: Ian Stuart ian.stu...@ed.ac.uk
Date: 20 January 2011 22:00
Subject: Re: content negotiating for package formats
To: techadvisorypa...@swordapp.org


On 19/01/11 17:26, Richard Jones wrote:

 Also, as per my earlier comment about export plugins in EPrints, you
 could easily imagine throwing a known package format into the
 repository, and then asking it to give you it back in a variety of
 formats that you don't know how to generate yourself.

Hmm... interesting!
I see a future development for the OA-RJ broker there!!


 It's been a long standing complaint against SWORD that despite it being
 an interoperability standard, you can't even deposit the same package
 into DSpace, EPrints and Fedora, let alone other implementations that
 weren't funded as part of the original project. From both a practical
 point of view and a community perception point of view this has to be
 addressed.

This is an issue that the OA-RJ Project is addressing with the Broker work.
Our initial work is to produce a single importable package, and the
importers that go with it, so that the Broker can pass on an Item to
a number of target repositories.

(There is a future development idea which would be to allow each
target repository to identify the package format it wanted, and the
Broker would transfer in that package but that is a *much* slower
mechanism when dealing with multi-institutional papers)

--

Ian Stuart.
Developer: Open Access Repository Junction and OpenDepot.org
Bibliographics and Multimedia Service Delivery team,
EDINA,
The University of Edinburgh.

http://edina.ac.uk/

This email was sent via the University of Edinburgh.

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



[Sword-TAP] Fwd: content negotiating for package formats

2011-01-21 Thread Stuart Lewis
-- Forwarded message --
From: Ian Stuart ian.stu...@ed.ac.uk
Date: 21 January 2011 21:46
Subject: Re: content negotiating for package formats
To: techadvisorypa...@swordapp.org


On 20/01/11 18:11, Julie Allinson wrote:

 I might be talking nonsense here, but is this something that could
 support 'graceful' behaviour ... one thing I noticed in testing was that
 EPrints will accept a METS package with epdcx but the deposit fails if
 there is any other metadata instead of or in addition to the epdcx
 embedded in the METS doc. I'm not criticising EPrints or advocating METS
 but it struck me that if there is a package that could be deposited
 knowing that it will succeed and that you could stuff all kinds of
 things into it which the repository will either know what to do with and
 do that (unpack etc.) or simply accept and store? ... so for the EPrints
 case, you might need a new export plug, but in the meantime you could
 still be making deposits.

EPrints does, indeed, expect epdcx in the xmlData section of the
METS data; however, I have also found that EPrints is much more relaxed
about the structure of METS/epdcx than DSpace is. I've yet to find
a Fedora volunteer to try imports with ;-)

It is certainly possible to write an Importer for EPrints that will
accept whatever format you care to specify... and my experience is
that this is easier in EPrints than DSpace (but then again, I'm a Perl
Monkey :chuckle: )

Whilst on the subject of epdcx: I am swinging away from it now - there
are just so many things it doesn't do well, or misses out
altogether. Perhaps this is an opportunity to get people from LT, Data,
Article, and various other data-store types together, and try to come
up with an extensible core schema that can be both cross-platform as
well as cross-type?

--

Ian Stuart.
Developer: Open Access Repository Junction and OpenDepot.org
Bibliographics and Multimedia Service Delivery team,
EDINA,
The University of Edinburgh.

http://edina.ac.uk/

This email was sent via the University of Edinburgh.

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



[Sword-TAP] Fwd: Key Changes and Justifications

2011-01-21 Thread Stuart Lewis
-- Forwarded message --
From: Richard Jones rich...@oneoverzero.com
Date: 20 January 2011 02:30
Subject: Re: Key Changes and Justifications
To: rsander...@lanl.gov
Cc: techadvisorypa...@swordapp.org


Hi Folks,

 * Content Negotiating for Package Formats

 RFC2533 seems massive overkill, and very different from HTTP content
 negotiation.

 Could you set out the requirements that cannot be fulfilled by accept
 headers?  My understanding is that the packaging format and the wrapped
 media format should be separately negotiable, but that can be handled with
 just a single new Accept- header that handles the wrapped data's format.
 [As the packaging is the outermost layer, it goes into Accept]

 If RFC2533 is the way you decide to go, then you should follow RFC2295
 Section 6, which discusses an Accept-Features header.  Note that RFC 2616
 (HTTP), defined after 2533/2295, doesn't mention Accept-Features.  However,
 2295 defines a different syntax than 2533, and 2533 doesn't appear to
 officially update 2295.  Transparent Content Negotiation from 2295 is very
 poorly implemented, and 2533 doesn't appear to be implemented at all.

 Basically, ... don't do it. Whatever the problem is, 2295 + 2533 is not
 the solution.

Regarding this, perhaps the easiest thing to do is share my first stab
at an internet draft for the various HTTP headers that look relevant
to SWORD 2.0.

http://sword-app.svn.sourceforge.net/viewvc/sword-app/spec/trunk/PackagedContentDelivery.txt?revision=226&view=markup

Feel free to mock my first attempt at writing anything of this nature
- I've hacked it together from a variety of example sources, and hope
that it's the right sort of thing, but any hints as to how to make it
better would be great.

The main point, though, is that it describes briefly the
Accept-Media-Formats header with its constrained contents, which will
hopefully clarify what we're trying to achieve.

I originally discarded Accept-Features, because the definition of it
seemed to concern the features of the request, rather than any content
negotiation (in Section 8.2 of RFC2295):

The Accept-Features request header can be used by a user agent to
give information about the presence or absence of certain features in
the feature set of the current request.

Must be I'm reading that wrong.

The more we discuss it the more I'm leaning towards a lazy approach of
having a separate Accept-Packaging header, and some clearly stated
rules as to the way in which servers should interpret the combination
of Accept and Accept-Packaging.  This issue must arise in other types
of content negotiation, for example with Accept-Language where not all
content-types are available in all languages, so perhaps there are
some resources on that that we can learn from.

Cheers,

Richard



[Sword-TAP] Fwd: Key Changes and Justifications

2011-01-21 Thread Stuart Lewis
-- Forwarded message --
From: Richard Jones rich...@oneoverzero.com
Date: 22 January 2011 01:57
Subject: Re: Key Changes and Justifications
To: Tim Brody t...@ecs.soton.ac.uk
Cc: techadvisorypa...@swordapp.org


Hi Tim,

 While this is all lovely...

 Why is it that Google docs API and CMIS both use THE SAME solution to
 returning an ATOM entry which has a link rel to a feed which outlined
 the resources which are part of this object?


 Wouldn't this require an extra URI?

 In the original proposal we had a Deposit Receipt as an Atom Entry,
 and a Statement as a separate document (which you could content
 negotiate for, so rdf or an atom feed would have been fine), but when
 we discussed it you were against this approach. It was, in fact, you
 who convinced me that the Statement should become part of the Deposit
 Receipt rather than a document in its own right!

 The root feed in SWORD contains a list of atom entries that (I think) we
 all agree should be the top level of the 'work'.

Do you mean the service document?  Each entry in there is a
Collection, in line with the Atom definition.

 The workflow state is
 the state of the 'work' so lives at this level. It isn't overly
 controversial to have this as inline or as a link-rel.

During the original feedback to the white paper, it was felt that
doing this inline was insufficient, as the state information could be
extensive depending on your implementation decisions.  Would your
atom:link go to another document for describing the state, as opposed
to the Statement (which describes the object and the state)?

 What's more
 important is the mechanism to change that state - do you PUT to the
 atom:entry, do a pseudo-move (see CMIS/GData) between collections or
 use some new RPC (POST?args)?

We are not planning to include any semantics to allow the depositor to
change the state in this way.  SWORD is a deposit tool only, and the
idea of relating the state back to the client is for informational
purposes only.  I think it's a step too far to attempt to include
workflow controls into SWORD a) this early (before CRUD is even
settled in) or b) possibly even at all.

 What Dave is talking about is how the media is represented (which
 relates to 'packaging'). What we've discussed at Soton and decided,
 before looking at CMIS & GData, is that the simplest representation of
 the *contents of the work* is a link-reled feed that aggregates the 0
 or more media resources.

I agree with this approach almost entirely:  In the original proposal
the contents of the work were to be retrievable via the Statement
(located from a link-rel), for which we had proposed ORE as the
format.  Nonetheless, the business case also stated that this format
would be content negotiable, so an application/atom+xml;type=feed
content type would be acceptable if you wanted to implement one.
After extensive discussion with Dave, he convinced me that the
Statement should be embedded in the Deposit Receipt, not available
under a separate URI - hence my confusion at the latest feedback.

Personally, I'd be happy to return to the original proposal with an
additional defined feature that the Statement be negotiable as an
atom:feed or an ore resource map.  I would also like to ensure that
the atom:feed can suitably hold all the information that we would like
to include in the Statement, such as the state information (which
could, of course, be embedded as foreign markup).
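
To make the atom:feed option a little more concrete, here is a rough
Python sketch of a Statement serialised as a feed, with the state carried
as foreign markup and one entry per media resource. The STATE_NS
namespace, the element name "state" and the function name are invented
for this example, and the feed leaves out several elements that a valid
Atom feed would need (entry titles and updated dates, for instance).

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Hypothetical namespace for the foreign markup; not from any SWORD draft.
STATE_NS = "http://example.org/sword-state#"

ET.register_namespace("", ATOM_NS)
ET.register_namespace("state", STATE_NS)


def build_statement_feed(work_uri, updated, state, resources):
    """Serialise a Statement as an Atom feed.

    `resources` is a list of (uri, media_type) pairs, one per media
    resource that is part of the work.
    """
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    ET.SubElement(feed, f"{{{ATOM_NS}}}id").text = work_uri
    ET.SubElement(feed, f"{{{ATOM_NS}}}title").text = "Statement"
    ET.SubElement(feed, f"{{{ATOM_NS}}}updated").text = updated
    # State information embedded as foreign (non-Atom) markup.
    ET.SubElement(feed, f"{{{STATE_NS}}}state").text = state
    for uri, media_type in resources:
        entry = ET.SubElement(feed, f"{{{ATOM_NS}}}entry")
        ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = uri
        ET.SubElement(
            entry, f"{{{ATOM_NS}}}content", {"type": media_type, "src": uri}
        )
    return ET.tostring(feed, encoding="unicode")


print(build_statement_feed(
    "http://example.org/work/1",
    "2011-01-22T00:00:00Z",
    "in-review",
    [("http://example.org/work/1/a.pdf", "application/pdf")],
))
```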

Folks - Perhaps we could have a brief show of hands with regard to the
notion that the Statement be separated from the Deposit Receipt?

It strikes me that there are some opportunities here to leverage the
Aggregation-URI in ORE.  At the moment it feels like a bit of an
appendix, existing only to be different from the other URIs that we
can't use for it.  Perhaps instead the Aggregation-URI can be our main
entry point for the Statement in its various forms, via content
negotiation?  This would at least stem any proliferation of
unnecessary URIs.

 As Scott has previously suggested creating a
 complex object involves multiple POSTs to the link-reled feed. CMIS &
 GData use this mechanism to support folders.

So the CMIS and GData approaches allow you to create a collection on
the server by POST?  I had not proposed this approach because it is
not part of the AtomPub spec.  Wouldn't it also make quite a big
back-compatibility issue, to change the deposit process in this way?
(not that there aren't such issues already, but at the moment updating
SWORD 1 code to SWORD 2 for POST only should be relatively minor -
this change would require more engineering).

 My previous attempt to explain this approach fell on deaf-ears, so let
 me try to headline this:
 1) Get rid of all mentions of packaging
 2) Get rid of OAI-ORE
 3) Use atom:entry with an atom:category of 'sword:work' (or
 similar), with a link-rel to an atom:feed

I promise that it didn't fall on deaf ears, but it did fall on the
ears of someone who hasn't had the 

[Sword-TAP] Welcome to the SWORD Technical Advisory Panel list archive

2011-01-20 Thread Stuart Lewis
Welcome to the SWORD Technical Advisory Panel list archive

http://swordapp.org/
