On Thu, Jul 11, 2013 at 11:09 AM, Richard Jones <rich...@cottagelabs.com> wrote:
> It is my plan and hope to back all of the multipart OUT of the sword
> spec for a future version (like a 2.1), so I strongly recommend not
> using multipart deposit.  Instead do a POST of an Atom Entry and a PUT
> of the Media Resource in two distinct HTTP requests.

Thank you very much for this explanation, Richard. I have many
follow-up questions for you or anyone on the list.

However, before I dive into POST vs. PUT I'd like to back up and
explain a little more at a high level what I'm trying to achieve.

Our first use case for SWORDv2 is to allow Open Journal Systems (OJS)
to deposit data into Dataverse Network (DVN).

http://projects.iq.harvard.edu/ojs-dvn has the details but just
imagine that an academic journal called the New England Journal of
Coffee is hosted in an installation of OJS. An author publishes a paper
called "Roasting at Home" and has some data in a file called

On the DVN side, we have already created a dataverse called "NEJC
Dataverse" dedicated to that journal. It's empty at first, but the
academic paper will eventually have a study within that dataverse.
That study is a container (with metadata such as author and
publication data) into which files will be deposited.

Continuing the example, the OJS plugin would use SWORDv2 to do the following:

1. Retrieve a service document that lists SWORDv2 collections for the
NEJC Dataverse.

2. Create a study within the NEJC Dataverse called "Roasting at Home"
(to match the academic paper published in OJS).

3. Upload example.zip to the "Roasting at Home" study in the NEJC Dataverse.

(I hope this makes sense. If you look at the "Textbooks and Test
Scores" study at http://hdl.handle.net/1902.1/11252 and click "Data &
Analysis" you can see a number of .zip files that have been uploaded
to the study.)

Now, back to SWORDv2... I want to make sure I'm understanding
correctly how to achieve the three steps above. As I understand it,
each step could be expressed as a curl command:

1. Retrieve a service document (GET):

curl -s http://sword:sword@localhost:8080/sd-uri

2. Create a resource from an atom entry (POST) using the "collection" IRI:

curl --http1.0 --data-binary "@atom-entry-study.xml" -H "Content-Type:

3. Upload example.zip (PUT) using the "edit media" IRI:

curl --upload-file example.zip -H "Content-Disposition:
filename=example.zip" -H "Content-Type: application/zip"
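
For anyone who would rather script this than use curl, here is a minimal Python sketch of the same three requests. It only builds the request objects and sends nothing. The host, port and sword:sword credentials come from the curl commands above; the function names and any collection or resource identifiers you pass in are placeholders of mine, not part of the spec or of any implementation.

```python
import base64
import urllib.request

# Credentials as used in the curl examples above.
USER, PASSWORD = "sword", "sword"


def _basic_auth(user: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def service_document_request(sd_url: str) -> urllib.request.Request:
    """Step 1: GET the service document."""
    return urllib.request.Request(
        sd_url,
        headers={"Authorization": _basic_auth(USER, PASSWORD)},
        method="GET",
    )


def create_entry_request(col_url: str, atom_entry: bytes) -> urllib.request.Request:
    """Step 2: POST an Atom Entry to the collection IRI."""
    return urllib.request.Request(
        col_url,
        data=atom_entry,
        headers={
            "Authorization": _basic_auth(USER, PASSWORD),
            "Content-Type": "application/atom+xml",
        },
        method="POST",
    )


def upload_media_request(
    em_url: str, payload: bytes, filename: str
) -> urllib.request.Request:
    """Step 3: PUT the file to the edit-media IRI."""
    return urllib.request.Request(
        em_url,
        data=payload,
        headers={
            "Authorization": _basic_auth(USER, PASSWORD),
            "Content-Type": "application/zip",
            # Same header value as in the curl command above.
            "Content-Disposition": f"filename={filename}",
        },
        method="PUT",
    )
```

Each returned object can be passed to `urllib.request.urlopen()` to actually send it. The point of the sketch is that steps 2 and 3 differ in both HTTP method and target IRI.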

With those curl commands I've switched to the URLs (sd-uri, col-uri,
em-uri) supported by the SWORDv2 reference implementation:

The reason I'm asking about this is that in the SWORDv2 implementation
I'm working on for DVN, I'm using col-uri for both the creation of the
study (creating a resource with an atom entry) and the upload of
example.zip. The latest code is here:

It works, and I'm pleased but... after your comment about POST vs. PUT
I wonder if I'm doing it wrong since the CollectionServletDefault
(used for the equivalent of col-uri) at
https://github.com/swordapp/JavaServer2.0 doesn't support PUT.
MediaResourceServletDefault on the other hand (the equivalent of
em-iri) does support both POST and PUT.

This morning I experimented with the curl commands above against the
reference implementation of SWORDv2 and captured the output in the
readme at https://github.com/dvn/swordpoc (I'll include the relevant
portion below.)

So, to sum up my questions...

1. Am I doing it wrong by using the col-uri to both create the study
and upload example.zip?

2. Should I be using col-uri to create the study and em-iri to upload



p.s. From the https://github.com/dvn/swordpoc readme:

### A more complicated SSS example: two step deposit

In the example above, "example.zip" is deposited into a SWORD
collection, but let's imagine a more complicated example.

Let's say we want to perform the following steps:

1. Create a resource with an atom entry (i.e. an XML file):
2. Upload example.zip to the newly created resource:

These steps translate into two distinct HTTP requests, as explained by
the SWORDv2 spec lead, Richard Jones...

> It is my plan and hope to back all of the multipart OUT of the sword
> spec for a future version (like a 2.1), so I strongly recommend not
> using multipart deposit.  Instead do a POST of an Atom Entry and a PUT
> of the Media Resource in two distinct HTTP requests.

... at 

#### Retrieve a service document (GET)

(For brevity, we only include 2 of the 10 collections in the Service Document)

    [root@logus sss-client]# curl -s http://sword:sword@localhost:8080/sd-uri
    <service xmlns:dcterms="http://purl.org/dc/terms/"
        <atom:title>Main Site</atom:title>
          <accept alternate="multipart-related">*/*</accept>
          <sword:collectionPolicy>Collection Policy</sword:collectionPolicy>
          <dcterms:abstract>Collection Description</dcterms:abstract>
          <sword:treatment>Treatment description</sword:treatment>
          <accept alternate="multipart-related">*/*</accept>
          <sword:collectionPolicy>Collection Policy</sword:collectionPolicy>
          <dcterms:abstract>Collection Description</dcterms:abstract>
          <sword:treatment>Treatment description</sword:treatment>
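
A client needs the collection IRIs out of this document before it can POST anything. As a rough sketch, assuming the standard AtomPub layout (`service` / `workspace` / `collection` with an `href` attribute), they could be pulled out in Python as shown here. The sample document, its collection titles and its `href` values are made up for illustration and are not the real output of the server.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by AtomPub (RFC 5023) and Atom (RFC 4287).
NS = {
    "app": "http://www.w3.org/2007/app",
    "atom": "http://www.w3.org/2005/Atom",
}

# Illustrative service document; titles and hrefs are placeholders.
SAMPLE_SERVICE_DOCUMENT = """\
<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Main Site</atom:title>
    <collection href="http://localhost:8080/col-uri/placeholder-1">
      <atom:title>Placeholder Collection 1</atom:title>
      <accept>*/*</accept>
    </collection>
    <collection href="http://localhost:8080/col-uri/placeholder-2">
      <atom:title>Placeholder Collection 2</atom:title>
      <accept>*/*</accept>
    </collection>
  </workspace>
</service>
"""


def list_collections(service_xml: str) -> list:
    """Return (title, href) pairs for every collection in the document."""
    root = ET.fromstring(service_xml)
    collections = []
    for col in root.iterfind("app:workspace/app:collection", NS):
        title = col.findtext("atom:title", default="", namespaces=NS)
        collections.append((title, col.get("href")))
    return collections


if __name__ == "__main__":
    for title, href in list_collections(SAMPLE_SERVICE_DOCUMENT):
        print(title, href)
```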

#### Create a resource from an atom entry (POST)

We POST (using --data-binary) our atom-entry-study.xml content...

    [root@logus sss-client]# cat atom-entry-study.xml
    <?xml version="1.0"?>
    <entry xmlns="http://www.w3.org/2005/Atom"
       <title>The first study for the New England Journal of Coffee
       <summary type="text">The abstract</summary>
       <dcterms:title>Roasting at Home</dcterms:title>
       <dcterms:creator>Peets, John</dcterms:creator>
       <dcterms:creator>Stumptown, Jane</dcterms:creator>
    [root@logus sss-client]#

... to the first collection we saw in the service document

    [root@logus sss-client]# curl --http1.0 --data-binary
"@atom-entry-study.xml" -H "Content-Type: application/atom+xml"
    <entry xmlns:dcterms="http://purl.org/dc/terms/"
      <title>The first study for the New England Journal of Coffee
      <summary type="text">The abstract</summary>
      <generator uri="http://www.swordapp.org/sss" version="1.0"/>
      <dcterms:abstract>The abstract</dcterms:abstract>
      <dcterms:title>The first study for the New England Journal of
Coffee dataverse</dcterms:title>
      <dcterms:title>Roasting at Home</dcterms:title>
      <dcterms:creator>Peets, John</dcterms:creator>
      <dcterms:creator>Stumptown, Jane</dcterms:creator>
      <sword:verboseDescription>SSS has done this, that and the other
to process the deposit</sword:verboseDescription>
      <sword:treatment>Treatment description</sword:treatment>
      <link rel="alternate"
      <content type="application/zip"
      <link rel="edit"
      <link rel="edit-media"
      <link rel="edit-media" type="application/atom+xml;type=feed"
      <link rel="http://purl.org/net/sword/terms/add"
      <link rel="http://purl.org/net/sword/terms/statement"
      <link rel="http://purl.org/net/sword/terms/statement"
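
The deposit receipt is where a client finds the IRI for the follow-up PUT. Here is a small Python sketch of picking out the edit-media link. The receipt above lists two `edit-media` links, one of them typed as an Atom feed. This sketch assumes the untyped one is the right target for a binary upload. That is my reading rather than something confirmed here, and the sample receipt and its `href` values are invented for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional

ATOM = "{http://www.w3.org/2005/Atom}"

# Illustrative receipt; every href below is a placeholder.
SAMPLE_RECEIPT = """\
<entry xmlns="http://www.w3.org/2005/Atom">
  <title>The first study for the New England Journal of Coffee</title>
  <link rel="alternate" href="http://localhost:8080/placeholder-alternate"/>
  <link rel="edit-media" href="http://localhost:8080/em-uri/placeholder"/>
  <link rel="edit-media" type="application/atom+xml;type=feed"
        href="http://localhost:8080/em-uri/placeholder.atom"/>
</entry>
"""


def edit_media_iri(receipt_xml: str) -> Optional[str]:
    """Return the href of the first untyped edit-media link, if any."""
    root = ET.fromstring(receipt_xml)
    for link in root.iterfind(f"{ATOM}link"):
        # Assumption: skip the feed-typed edit-media link.
        if link.get("rel") == "edit-media" and link.get("type") is None:
            return link.get("href")
    return None


if __name__ == "__main__":
    print(edit_media_iri(SAMPLE_RECEIPT))
```

A client following the POST-then-PUT pattern would feed the body of the POST response into `edit_media_iri()` and PUT the zip file to whatever comes back.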

This creates a number of files under the
"4e1f7a9a-f8d4-4795-b230-195acb6680c9" collection:

    [root@logus sss-client]# cd
    [root@logus 4e1f7a9a-f8d4-4795-b230-195acb6680c9]# ls -1d */*

We can think of "70566b02-76a9-496d-b1bb-356ba9acc7f2" as the unique
identifier for the resource we just created. It contains a number of
files:
- `atom.xml` matches exactly the `atom-entry-study.xml` file we used
to create the resource.
- `sss_deposit-receipt.xml` is the output we saw from the curl command
when we created the resource.
- `sss_metadata.xml` is a simpler version of `atom.xml` (i.e.
`<dcterms:creator>` becomes just `<creator>`).
- `sss_statement.atom.xml` shows the state of the resource, indicating
the following: "The work has passed through review and is now in the
- `sss_statement.xml` is similar to `sss_statement.atom.xml` in that
it shows the state, but is longer and in RDF format.

#### Upload example.zip (PUT)

Next we upload example.zip to the resource we created.

We use --upload-file for this, which does a PUT.

    [root@logus 4e1f7a9a-f8d4-4795-b230-195acb6680c9]# cd /tmp/sss-client
    [root@logus sss-client]# curl --upload-file example.zip -H
"Content-Disposition: filename=example.zip" -H "Content-Type:
    [root@logus sss-client]#

We don't see any output above but the console output indicates some activity...

    2013-07-17 16:14:23,311 - sss - INFO - Authentication required
    2013-07-17 16:14:23,311 - sss - INFO - Authentication details:
sword:sword; On Behalf Of: None
    2013-07-17 16:14:23,312 - sss - INFO - Received Binary deposit request
    2013-07-17 16:14:23,313 - sss - INFO - Replace request has file
content - updating
    2013-07-17 16:14:23,314 - sss - INFO - Content replaced - - [17/Jul/2013 16:14:23] "HTTP/1.1 PUT
- 204 No Content

... and the files in the directory for the resource have changed:

    [root@logus 70566b02-76a9-496d-b1bb-356ba9acc7f2]# git status
    # On branch master
    # Changed but not updated:
    #   (use "git add/rm <file>..." to update what will be committed)
    #   (use "git checkout -- <file>..." to discard changes in working
    #       deleted:    atom.xml
    #       modified:   sss_deposit-receipt.xml
    #       modified:   sss_statement.atom.xml
    #       modified:   sss_statement.xml
    # Untracked files:
    #   (use "git add <file>..." to include in what will be committed)
    #       2013-07-17T16:14:23Z_example.zip
    no changes added to commit (use "git add" and/or "git commit -a")
    [root@logus 70566b02-76a9-496d-b1bb-356ba9acc7f2]#

The `atom.xml` file, which was an exact copy of the
`atom-entry-study.xml` file we used to create the resource, has been
deleted. Some of the content from that file, such as our
`<dcterms:creator>` entries, is still preserved in
`sss_deposit-receipt.xml` and `sss_metadata.xml`.

Philip Durbin
Software Developer for http://thedata.org

sword-app-tech mailing list
