Re: [CODE4LIB] Digital collection backups

Chris Cormack Thu, 10 Jan 2013 18:41:28 -0800

Obnam http://liw.fi/obnam/ might do what you need with the minimum of fuss


Chris

On 11 January 2013 12:05, Fleming, Declan <[email protected]> wrote:
> Hi - you might look into Chronopolis (which can be front-ended by DuraCloud 
> or not): http://chronopolis.sdsc.edu/
>
> Declan
>
> -----Original Message-----
> From: Code for Libraries [mailto:[email protected]] On Behalf Of Roy 
> Tennant
> Sent: Thursday, January 10, 2013 2:56 PM
> To: [email protected]
> Subject: Re: [CODE4LIB] Digital collection backups
>
> I'd also take a look at Amazon Glacier. Recently I parked about 50GB of data 
> files in logical tar'd and gzip'd chunks and it's costing my employer less 
> than 50 cents/month. Glacier, however, is best for "park it and forget" kinds 
> of needs, as the real cost is in data flow.
> Storage is cheap, but must be considered "offline" or "near line" as you must 
> first request to retrieve a file, wait for about a day, and then retrieve the 
> file. And you're charged more for the download throughput than just about 
> anything.
>
> I'm using a Unix client to handle all of the heavy lifting of uploading and 
> downloading, as Glacier is meant to be used via an API rather than a web 
> client.[1] If anyone is interested, I have local documentation on usage that 
> I could probably genericize. And yes, I did round-trip a file to make sure it 
> functioned as advertised.
> Roy
>
> [1] https://github.com/vsespb/mt-aws-glacier
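
The "tar'd and gzip'd chunks" and the round-trip check described above could look roughly like the following. This is only a minimal sketch using Python's standard library, not the workflow actually used in the thread: the chunk name, the file name, and the helper functions are hypothetical, and the upload/download step (handled by a separate Glacier client in the message above) is left out entirely.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(block)
    return digest.hexdigest()


def make_chunk(source_dir, archive_path):
    """Pack source_dir into one gzip-compressed tar archive; return its digest."""
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source_dir, arcname=Path(source_dir).name)
    return sha256_of(archive_path)


def round_trip_ok(archive_path, expected_digest, restore_dir):
    """Compare the archive digest with the recorded one, then unpack it."""
    if sha256_of(archive_path) != expected_digest:
        return False
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(restore_dir)
    return True


with tempfile.TemporaryDirectory() as work:
    # Hypothetical chunk: one directory holding one small stand-in file.
    collection = Path(work) / "collection-001"
    collection.mkdir()
    (collection / "page-0001.tif").write_bytes(b"example image bytes")

    archive_path = Path(work) / "collection-001.tar.gz"
    digest = make_chunk(collection, archive_path)  # record before uploading

    # After retrieving the archive again, verify it and unpack it.
    restored = Path(work) / "restored"
    ok = round_trip_ok(archive_path, digest, restored)
    restored_file = restored / "collection-001" / "page-0001.tif"
    same = restored_file.read_bytes() == b"example image bytes"
    print(ok, same)
```

Recording the digest before upload and comparing it after retrieval is one simple way to confirm that an archive came back unchanged; any checksum tool would serve the same purpose.
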
>
> On Thu, Jan 10, 2013 at 2:29 PM,  <[email protected]> wrote:
>> We built our own solution for this by creating a plugin that works with our 
>> digital asset management system (ResourceSpace) to individually back up files 
>> to Amazon S3. Because S3 is replicated to multiple data centers, this 
>> provides a fairly high level of redundancy. And because it's an object-based 
>> web service, we can access any given object individually by using a URL 
>> related to the original storage URL within our system.
>>
>> This also allows us to take advantage of S3 for images on our website. All 
>> of the images in our online collections database are being served 
>> straight from S3, which diverts the load from our public web server. When we 
>> launch zoomable images later this year, all of the tiles will also be 
>> generated locally in the DAM and then served to the public via the mirrored 
>> copy in S3.
>>
>> The current pricing is around $0.08/GB/month for 1-50 TB, which I think is 
>> fairly reasonable for what we're getting. They just dropped the price 
>> substantially a few months ago.
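
For a rough sense of scale, the storage figures quoted in this thread can be compared with a few lines of arithmetic. This is a sketch under stated assumptions: it uses only the January 2013 numbers mentioned in these messages (not current pricing), treats 1 TB as 1000 GB, applies the quoted S3 rate to small volumes even though it was quoted for the 1-50 TB range, and ignores request and transfer charges, which the Glacier message says are where the real cost lies.

```python
# Figures quoted in this thread (January 2013); not current pricing.
S3_RATE_PER_GB_MONTH = 0.08      # "around $0.08/GB/month for 1-50 TB"
GLACIER_EXAMPLE_GB = 50          # "about 50GB of data files"
GLACIER_EXAMPLE_MONTHLY = 0.50   # "less than 50 cents/month", so an upper bound


def monthly_storage_cost(size_gb, rate_per_gb_month):
    """Storage-only monthly cost in dollars; excludes requests and transfer."""
    return size_gb * rate_per_gb_month


# Upper bound on the per-GB rate implied by the Glacier example.
glacier_rate_upper_bound = GLACIER_EXAMPLE_MONTHLY / GLACIER_EXAMPLE_GB

# The same 50 GB at the quoted S3 rate (assumption: the rate applies below 1 TB).
s3_cost_50gb = monthly_storage_cost(GLACIER_EXAMPLE_GB, S3_RATE_PER_GB_MONTH)

# One terabyte at the quoted S3 rate (assumption: 1 TB = 1000 GB).
s3_cost_1tb = monthly_storage_cost(1000, S3_RATE_PER_GB_MONTH)

print(f"Glacier, implied rate: at most ${glacier_rate_upper_bound:.3f}/GB/month")
print(f"S3, 50 GB: about ${s3_cost_50gb:.2f}/month")
print(f"S3, 1 TB:  about ${s3_cost_1tb:.2f}/month")
```
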
>>
>> DuraCloud http://www.duracloud.org/ supposedly offers a way to add another 
>> abstraction layer so you can build something like this that is portable 
>> between different cloud storage providers. But I haven't really looked into 
>> this as of yet.
>>
>> -David
>>
>>
>> __________
>>
>> David Dwiggins
>> Systems Librarian/Archivist, Historic New England
>> 141 Cambridge Street, Boston, MA 02114
>> (617) 994-5948
>> [email protected]
>> http://www.historicnewengland.org
>>>>> Joshua Welker <[email protected]> 1/10/2013 5:20 PM >>>
>> Hi everyone,
>>
>> We are starting a digitization project for some of our special collections, 
>> and we are having a hard time setting up a backup system that meets the 
>> long-term preservation needs of digital archives. The backup mechanisms 
>> currently used by campus IT are short-term full-server backups. What we are 
>> looking for is more granular, file-level backup over the very long term. 
>> Does anyone have any recommendations of software or some service or 
>> technique? We are looking into LOCKSS but haven't dug too deeply yet. Can 
>> anyone who uses LOCKSS tell me a bit of their experiences with it?
>>
>> Josh Welker
>> Electronic/Media Services Librarian
>> College Liaison
>> University Libraries
>> Southwest Baptist University
>> 417.328.1624
