Re: [sword-app-tech] FW: Use of In-Progress header and the SE-IRI vs. the EM-IRI

2011-10-12 Thread Kathi Fletcher
 is viable (we did try
 doing this, btw), but also is inconsistent with the nature of REST.

  I can't figure out why we don't just use the header in all cases. What
  benefit accrues from not using the header on actions on the EM-IRI? What
  does leaving it off those actions save us? If we use the header in all
  cases, then the publish logic is always the same and super simple: check
  the header or its implied default.
 
 
 
  Thoughts?
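
The uniform "check the header or its implied default" logic proposed above
could be sketched roughly as follows. This is only an illustration: the
function names are hypothetical, the headers are assumed to arrive as a plain
dict, and the default of "false" when In-Progress is absent is an assumption
about the profile rather than something stated in this thread.

```python
def is_in_progress(headers):
    """Return True if the request asks for the deposit to stay in progress."""
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, value in headers.items():
        if name.lower() == "in-progress":
            return value.strip().lower() == "true"
    # Assumed default when the header is absent: not in progress.
    return False


def should_publish(headers):
    """Publish exactly when the deposit is not marked as in progress."""
    return not is_in_progress(headers)
```

With this shape the same check would apply regardless of which IRI the
request targets, which is the simplification being argued for.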

 Hopefully the above clarifies the decision process.  The crux,
 ultimately, is that In-Progress on the EM-IRI would not be RESTful,
 and the consequences of violating this principle bring uncertainty in
 the implementation and complexity in the profile.

 Does that make sense?

 Cheers,

 Richard




-- 
Kathi Fletcher
Email: kathi.fletc...@shuttleworthfoundation.org
Alternate Email: kathi.fletc...@gmail.com
Twitter: kefletcher http://www.twitter.com/kefletcher
Skype: kef-sky
Blog: kefletcher.blogspot.com
Phone: US 862-345-6178
--
sword-app-tech mailing list
sword-app-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sword-app-tech


[sword-app-tech] I think I have found a bug in the V2 spec.

2011-10-20 Thread Kathi Fletcher
Hi guys,

I thought I had the spec all figured out and logged a ticket against our new
implementation of V2 in Connexions when we were disallowing POST multipart
to the SE-IRI and allowing it on EM-IRI. That seemed opposite of my
understanding. For PUT, the case is exactly reversed. Multipart is supported
on SE-IRI and NOT on EM-IRI. The logic as I understand it would be that you
can only explicitly change the metadata using the SE-IRI because the
metadata applies to the container as a whole.

But then we looked back at the SWORD V2 spec and saw that it said that POST
of metadata and contents should use the EM-IRI. At first I thought this was
a mere typographical error because the example given just below shows POST
SE-IRI HTTP/1.1. But the spec has a line suggesting that it might have been
intentional and the example might be the typo. See relevant excerpts below.

*6.5.3 specifies PUT Multipart to SE-IRI:* Section 6.5.3
<http://sword-app.svn.sourceforge.net/viewvc/sword-app/spec/trunk/SWORDProfile.html?revision=HEAD#protocoloperations_editingcontent_multipart>
states, "The client can *replace both the metadata and content* of a
resource by performing an HTTP PUT on the *Edit-IRI* with a multipart mime
message, as per Section 6.3.2. Creating a Resource with a Multipart Deposit
<http://sword-app.svn.sourceforge.net/viewvc/sword-app/spec/trunk/SWORDProfile.html?revision=HEAD#protocoloperations_creatingresource_multipart>."

*6.7.3 specifies POST Multipart to EM-IRI:*

   - In Section 6.7.3
   <http://sword-app.svn.sourceforge.net/viewvc/sword-app/spec/trunk/SWORDProfile.html?revision=HEAD#protocoloperations_addingcontent>,
   it says, "The client can *add new content and update the metadata* attached
   to a resource by issuing an HTTP POST of an Atom Multipart [AtomMultipart
   <http://sword-app.svn.sourceforge.net/viewvc/sword-app/spec/trunk/SWORDProfile.html?revision=HEAD#atommultipart>]
   document to the *EM-IRI*."
   - Further, it has this puzzling (to me) line: "This operation is analogous
   to Section 6.3.2
   <http://sword-app.svn.sourceforge.net/viewvc/sword-app/spec/trunk/SWORDProfile.html?revision=HEAD#protocoloperations_creatingresource_multipart>
   except that the target IRI is the *EM-IRI* as the container already exists
   and may already contain content and metadata." Section 6.3.2 refers to the
   Col-IRI, which gives me hope that this use of EM-IRI is still just a
   typographical error, which would leave my understanding intact.
   - The example uses SE-IRI, however:

   POST SE-IRI HTTP/1.1
   Host: example.org
   Authorization: Basic ZGFmZnk6c2VjZXJldA==
   Content-Length: [content length]
   Content-Type: multipart/related; boundary1605871705==;
     type=application/atom+xml
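
For readers unfamiliar with the request shape under discussion, here is a
minimal sketch of assembling a multipart/related body with an Atom entry part
and a binary payload part, using only the Python standard library. The
function name is hypothetical, and the part names "atom" and "payload" and the
zip payload type are assumptions, not taken from this message. The sketch
deliberately does not choose between the SE-IRI and the EM-IRI, since that is
the open question.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def build_multipart_deposit(atom_xml, payload, filename):
    """Assemble a multipart/related message: Atom entry first, then payload."""
    # The extra keyword becomes a parameter on the Content-Type header,
    # giving: multipart/related; type="application/atom+xml"; boundary=...
    msg = MIMEMultipart("related", type="application/atom+xml")

    # Metadata part (part name "atom" is an assumption).
    atom_part = MIMEApplication(atom_xml, "atom+xml")
    atom_part.add_header("Content-Disposition", "attachment", name="atom")

    # Content part (part name "payload" and subtype "zip" are assumptions).
    payload_part = MIMEApplication(payload, "zip")
    payload_part.add_header(
        "Content-Disposition", "attachment", name="payload", filename=filename
    )

    msg.attach(atom_part)
    msg.attach(payload_part)
    return msg
```

Whichever IRI the spec settles on, only the request line would change; the
body built here would stay the same.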

Anyone care to weigh in? Either way, something needs a tweak in the spec --
either the wording or the example.

Cheers,
Kathi

-- 
Katherine Fletcher, kathi.fletc...@gmail.com
Twitter: kefletcher http://www.twitter.com/kefletcher
Blog: kefletcher.blogspot.com


Re: [sword-app-tech] How to send large fiels

2011-12-05 Thread Kathi Fletcher
Hi,

I have CC'd Rufus Pollock of CKAN in case he has ideas about some sort of
system where papers go in document repositories like DSpace and EPrints, and
data goes in data repositories like CKAN.

Kathi

-- Forwarded message --
From: David FLANDERS d.fland...@jisc.ac.uk
Date: 2011/12/5
Subject: Re: [sword-app-tech] How to send large fiels
To: Ben O'Steen bost...@gmail.com, Stuart Lewis s.le...@auckland.ac.nz
Cc: sword-app-tech@lists.sourceforge.net, Leggett, Pete
p.f.legg...@exeter.ac.uk


+1 

Why not use systems *built for* data instead of a system built for research
papers?  CKAN, Tardis, Kasabi, MongoDB, NoSQL store (triple, graph,
keyValue)...?

I’d like to hear a good reason not to use these systems and then
interoperate with repositories rather than build the same functionality
into repositories?  /dff

*From:* Ben O'Steen [mailto:bost...@gmail.com]
*Sent:* 05 December 2011 08:00
*To:* Stuart Lewis
*Cc:* sword-app-tech@lists.sourceforge.net; Leggett, Pete

*Subject:* Re: [sword-app-tech] How to send large fiels

While I think I understand the drive to put these files within a
repository, I would suggest caution. Just because it might be possible to
put a file into the care of a repository doesn't make it a practical or
useful thing to do.

- What do you feel you might gain by placing 500GB+ files into a
repository, compared with having them in an addressable filestore?

- Have people been able to download files of that size from DSpace, Fedora
or EPrints?

- Has the repository been allocated space on a suitable filesystem? XFS,
EBS, Thumper or similar?

- Once the file is ingested into DSpace or Fedora for example, is there any
other route to retrieve this, aside from HTTP? (Coding your own
servlet/addon is not a real answer to this.) Is it easily accessible via
Grid-FTP or HPN-SSH for example?

- Can the workflows you wish to utilise handle the data you are giving it?
Is any broad stroke tool aside from fixity useful here?

Again, I am advising caution here, not besmirching the name of
repositories. They do a good job with what we might currently term small
files, but were never developed with research data sizes in mind (3-500GB
is a decent rough guide; 1+TB sets are certainly not uncommon).

So, in short, weigh up the benefits against the downsides and not in
hypotheticals. Actually do it, and get real researchers to try and use it.
You'll soon have a metric to show what is useful and what isn't.


On Monday, 5 December 2011, Stuart Lewis wrote:

Hi Pete,

Thanks for the information.  I've attached a piece of code that we use
locally as part of the curation framework (in DSpace 1.7 or above), written
by a colleague: Kim Shepherd.  The curation framework allows small jobs to
be run on single items, collections, communities, or the whole repository.
 This particular job looks to see if there is a filename in a pre-described
metadata field, and if there is no matching bitstream, it will then ingest
the file from disk.

More details of the curation system can be seen at:

 - https://wiki.duraspace.org/display/DSPACE/CurationSystem
 - https://wiki.duraspace.org/display/DSPACE/Curation+Task+Cookbook

Some other curation tasks that Kim has written:

 - https://github.com/kshepherd/Curation

This can be used by depositing the metadata via SWORD, with the filename in
a metadata field.  Optionally the code could be changed to copy the file
from another source (e.g. FTP, HTTP, Grid, etc).
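
The behaviour described above (look for a filename in a metadata field and, if
no bitstream matches, ingest the file from disk) can be sketched as below.
This is not the attached DSpace curation task, which is not reproduced in
this message; it is a language-neutral illustration in Python in which the
item is a plain dict, the field name "dc.source.filename" is hypothetical, and
"ingest" simply means copying the file into a store directory.

```python
import shutil
from pathlib import Path


def ingest_missing_file(item, staging_dir, store_dir, field="dc.source.filename"):
    """Copy the file named in `field` into the store if the item lacks it.

    Returns True if a file was ingested, False otherwise.
    """
    filename = item["metadata"].get(field)
    if not filename:
        return False  # nothing was requested for this item

    # Keep only the final path component so the name cannot escape staging_dir.
    name = Path(filename).name
    if name in item["bitstreams"]:
        return False  # a matching bitstream already exists

    source = Path(staging_dir) / name
    if not source.is_file():
        return False  # the file has not been placed on disk yet

    Path(store_dir).mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, Path(store_dir) / name)
    item["bitstreams"].append(name)
    return True
```

Running the check a second time on the same item is a no-op, which suits a
task that may be scheduled repeatedly over a whole repository.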

Thanks,


Stuart Lewis
Digital Development Manager
Te Tumu Herenga The University of Auckland Library
Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
Ph: +64 (0)9 373 7599 x81928



On 29/11/2011, at 12:09 PM, Leggett, Pete wrote:

 Hi Stuart,

 You asked for more info. We are developing a Research Data Repository
 based on DSpace for storing the research data associated with Exeter
 University research publications.
 For some research fields such as Physics and Biology, this data can be very
 large (TBs, it seems!), hence the need to consider large ingests over what
 might be several days.
 The researcher has the data, and would, I am guessing, create the metadata,
 but maybe in collaboration with a data curator. Ideally the researcher
 would perform the deposit with, for large data sets, an offline ingest of
 the data itself. The data can be on the researcher's
 server/workstation/laptop/DVD/USB hard drive etc.

 There seem to be at least a couple of ways of approaching this, so what I
 was after was some references to what and how other people have done this
 to give me a better handle on the best way forward - having very little
 DSpace or repository experience myself. But given the size of larger data
 sets, I do think the best solution will involve as little copying of the
 data as possible - with the ultimate being just one copy process, 

Re: [sword-app-tech] SWORD community development and support

2013-08-01 Thread Kathi Fletcher
Just a quick update, since the last post was about the quietness of the
list. We are still actively using SWORD to deposit content to Connexions (
cnx.org). It just works, so we haven't had much traffic on the list!

Kathi


On Thu, Aug 1, 2013 at 8:02 AM, Philip Durbin philip_dur...@harvard.edu wrote:

 Thanks, Stuart and Richard. My reply is inline below.

 On Thu, Jun 20, 2013 at 4:41 AM, Richard Jones rich...@cottagelabs.com
 wrote:
  p.s. I'm growing concerned that this mailing list is so quiet (and
  only admins can see the number of people subscribed). Have people
  moved on from SWORD to some other standard? If so, which one?
 
  I just checked - there are 175 subscribers to this list.

 Thanks, Stuart. I'm glad to hear this number is as large as it is.
 Many mailing lists allow this number to be discoverable by the
 subscribers and if it's easy to do so, I would encourage making this
 change so subscribers don't have to ask.

  As far as I know, SWORD is the main contender in town when it comes to
 a standardized deposit interface to this type of repository.  I've also
 wondered about the quietness of this list.  I think there may be a few
 reasons: one, is that a lot of repository users are still grappling with
 their repositories, without yet getting as far as accepting remote
 deposits.  Second, SWORD doesn't yet really have an active community
 sharing deposit tools.  Partly this is because many uses of SWORD will be
 very specific point-to-point integrations, which might not be of interest
 to too many others.
 
  It would be good to hear a wider discussion about this, and how we
 share more about our individual uses of SWORD.
 
  I think one of the main issues is exactly where to ask about what.
  Because SWORD is a standard, but the technical questions are really
  about implementations, where is the best place to post about problems?
   For example, if the problems are specifically with the DSpace
  implementation of SWORD, it is /probably/ better to ask on dspace-dev.

 I'm sure as implementers begin their work many will have questions
 about the SWORD specification itself (I know I do), so I'm absolutely
 supportive of and thankful for this mailing list.

  Also, because we're in the early stages of community development with
  SWORD, Stuart and I are a bit of a bottleneck on this list - usually
  one of us is required to respond, and if we're unavailable for any
  length of time (e.g. I've been travelling for nearly 2 weeks now, and
  am emailing from the fourth row of a session at OAI8 right now :) ),
  then the list looks dead.
 
  We are hoping to have some discussions around sword sustainability
  with Jisc quite soon which community development and support like this
  is going to be a key part of.
 
  Very interested in people's thoughts as to how to make things better.

 I've mentioned this in passing but I'll repeat my offer to log an IRC
 channel on Freenode to discuss SWORD. I'm discussing implementation
 details that would not be of general interest at
 http://irclog.iq.harvard.edu/dvn/2013-07-30 for example, but a channel
 dedicated to the SWORD spec itself would be fantastic. I find chat to
 be a great way to get a quick pulse on an issue. I fear the walls of
 text I've been sending to this mailing list are simply too much at
 once. :)

 Anyway, I'm very interested in community development in general and
 happy to help in any way I can. (I don't mean to beat a dead horse
 about IRC.) I'm learning a lot about SWORD and AtomPub and working on
 our implementation is actually a lot of fun. :)

 Phil

 p.s. In other news, we (royal we, Peter Bull, actually) are starting
 to use Richard's https://github.com/swordapp/python-client-sword2 to
 cook up a test client
 https://github.com/dvn/swordpoc/tree/master/dvn_client . We're very
 thankful for all the libraries that have been published!

 --
 Philip Durbin
 Software Developer for http://thedata.org
 http://www.iq.harvard.edu/people/philip-durbin






-- 
Katherine Fletcher, kathi.fletc...@gmail.com
Twitter: kefletcher http://www.twitter.com/kefletcher
Blog: kefletcher.blogspot.com