The Cloud, the Researcher and the Repository
Leslie Carr

There's currently a lot of buzz about DuraSpace, the DSpace and Fedora 
project to incorporate cloud storage into repositories. I wasn't able to 
catch their webinar on Thursday, but I'm keeping my ear to the ground 
because it sounds like a very positive agenda for repositories in 
general to adopt. I hope this is a good opportunity to make a few 
remarks about the work that EPrints is doing that also might make cloud 
services accessible to repositories and users of repositories.

Moving your data into the cloud is a bit like moving your stuff into an 
unfurnished apartment. You get an awful lot of space to put things, once 
a month you have to pay the landlord, and you end up with absolutely 
nothing available to help you to organise and look after your things. 
You have to put your clothes, DVDs and crockery in a big pile on the 
floor unless you get some furniture in. But cloud 'furniture' comes as 
downloadable instructions on how to take three planks of wood and craft 
something that functions almost the same as a coffee table. In short, 
it's a great place for highly competent DIY enthusiasts with time on 
their hands. The EPrints team have been working on projects that might 
help researchers looking to take advantage of the cloud's benefits, 
without being put off by its lack of home comforts.

We've previously announced that Dave Tarrant has extended EPrints to use 
cloud storage services as part of JISC's PRESERV2 project 
(preserv.eprints.org). The new EPrints storage controller (debuting in 
EPrints v3.2) allows the repository to offload the storage of its files 
to any external service - cloud storage, local storage area networks or 
even national archiving services. The repository can mix and match these 
services according to the characteristics of each deposited object - 
even storing each item in several places for redundancy or performance 
improvement.
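
To make the "mix and match" idea concrete, here is a minimal sketch of 
policy-based routing. It is an illustration only, not the EPrints storage 
controller or its API (EPrints itself is written in Perl). The class names, 
the backend labels and the 1 GB threshold are all assumptions invented for 
this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DepositedObject:
    """A deposited file, described by the traits a policy might inspect."""
    filename: str
    size_bytes: int
    mime_type: str


@dataclass
class StorageRule:
    """Pairs a test on the object with the backends that should hold it."""
    description: str
    matches: Callable[[DepositedObject], bool]
    backends: List[str]


# Hypothetical policy: backend names and the size threshold are made up.
RULES = [
    StorageRule(
        "large video goes to the cloud plus a second copy for redundancy",
        lambda o: o.mime_type.startswith("video/") and o.size_bytes > 1_000_000_000,
        ["cloud", "national_archive"],
    ),
    StorageRule(
        "everything else stays on local storage",
        lambda o: True,
        ["local_san"],
    ),
]


def choose_backends(obj: DepositedObject, rules: List[StorageRule] = RULES) -> List[str]:
    """Return the backends named by the first rule that matches the object."""
    for rule in rules:
        if rule.matches(obj):
            return list(rule.backends)
    return []
```

The first matching rule wins, so more specific rules go first and a 
catch-all goes last. A rule that lists several backends is how an item 
would end up stored in more than one place.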

That tackles the technical part of the problem - how to join up 
repositories with the cloud - but it doesn't have much to say about how 
to better engage data-rich users with the cloud (or with the repository, 
come to that). As part of the JISC KULTUR project (kultur.eprints.org), 
Tim Brody has been looking at the problem of user deposit for lots of 
large media files. Not petabyte large, but gigabyte large. Even at that 
scale, the normal web infrastructure fails to deliver a reliable service 
- connections between a web browser and server just time out 
unexpectedly and silently - which makes it unpleasant for an artist who 
is trying to archive their career's-worth of video installations to the 
institutional repository. It's also really tedious to upload even 100 
small image files to the repository through the web deposit 
interface.

The solution that Tim has come up with is to allow the researcher's 
desktop environment to directly use EPrints as a file system - you can 
'mount' the repository as a network drive on your Windows/Mac/Linux 
desktop using services like WebDAV or FTP. As far as the user is 
concerned, they can just drag and drop a whole bunch of files from their 
documents folders, home directories or DVD-ROMs onto the repository 
disk, and EPrints will automatically deposit them into a new entry or 
entries. Of course, you can also do the reverse - copy documents from 
the repository back onto your desktop, open them directly in 
applications, or attach them to an email. And once you have opened a 
repository file directly in Microsoft Word (say) then why not save the 
changes back into the repository, with the repository either updating 
the original document or making a new version of it according to local 
policy? Or for UNIX admins, you can just set up a command-line FTP 
connection to the repository and relive the glory days of the pre-Web 
internet. And who knows, perhaps there will be demand for a gopher 
interface too?
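
For the command-line crowd, a batch deposit over FTP could be scripted 
roughly as follows. This is a sketch using Python's standard ftplib. The 
host name, credentials and remote directory in the usage note are 
placeholders, and how a particular EPrints repository lays out its FTP 
directories is an assumption here, not something described above.

```python
from pathlib import Path
from typing import Iterable, List


def upload_files(ftp, paths: Iterable[Path], remote_dir: str) -> List[str]:
    """Upload each file in binary mode; return the names that were sent.

    `ftp` is expected to be a logged-in ftplib.FTP connection (or any
    object offering the same cwd() and storbinary() methods).
    """
    ftp.cwd(remote_dir)  # hypothetical deposit folder on the repository
    uploaded = []
    for path in paths:
        with open(path, "rb") as handle:
            # STOR writes the file under its own name in the current directory
            ftp.storbinary(f"STOR {path.name}", handle)
        uploaded.append(path.name)
    return uploaded
```

With a real server this would be used along the lines of 
`ftp = ftplib.FTP("repository.example.org")`, then 
`ftp.login("researcher", "secret")`, then 
`upload_files(ftp, sorted(Path("videos").glob("*.mov")), "/deposits")`, 
where every one of those values is made up for illustration.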

Now perhaps if you put the desktop front-end together with the cloud 
back-end, the repository might be able to offer institutional 
researchers a realistic path to cloud storage. For the researcher who is 
tempted by the expansion capacity that the cloud's metaphorical 
unfurnished apartment offers them, the repository could offer a removal 
van, a concierge, a security guard, a cleaner and an expandable set of 
prefabricated cupboards and walk-in wardrobes. Not naked cloud storage, 
but storage that is mediated, managed and moderated on the researcher's 
behalf by the institution, so that they have the assurance that their 
data is not stranded and susceptible to the irregularities of cloud 
service provider SLAs. In other words, a cloud you can depend on!

The above paragraph sounds a bit hand-wavy, and to be honest we need to 
get some proper experience of this with real researchers before we can 
be confident that it is a viable approach. Desktop services have already 
been built on top of cloud storage - JungleDisk for example is a desktop 
backup and archiving service, but it still requires the user to have 
their own cloud account. Hopefully, a repository can remove the need 
for special accounts, passwords and storage management from the user 
and provide them with a whole host of extra, valuable services.

Perhaps that's where the challenge lies. Repositories need to commit to 
providing really useful services to all their users - cloud users (or 
potential cloud users) are not a new breed, even if they do have 
exacting requirements. So having taken care of the infrastructure that 
seamlessly connects repositories and clouds, let's make sure that we keep 
on innovating in the user space. Backup, archiving, preservation and 
access are a good foundation, but they are only the start.

There will be a demonstration of this work and other features of EPrints 
3.2 at Open Repositories 2009 in Atlanta, Georgia on May 18th-21st. Make 
sure you come along because it's going to be a really exciting 
conference, whether or not it is cloudy :-)

--

Source: 
http://repositoryman.blogspot.com/2009/02/cloud-researcher-and-repository.html 
