Dear Fedora users,

I'm currently developing a repository system using Fedora, for the library
and archives at the Rock and Roll Hall of Fame.  The system will house a
variety of digital formats but will have a lot of video, in the range of
hundreds of terabytes over the next 3-5 years.  I'm sketching a plan for
building the storage systems that will deal with all of this data.  We're
starting completely from scratch with no data and no systems, which is
daunting to say the least, so I would very much appreciate any feedback or
recommendations you all might have.

The content of our data will mostly be large, high-quality or lossless a/v
files that are accessed very infrequently, and a much smaller percentage of
compressed a/v files that are accessed almost constantly.  My approach is to
use two kinds of storage: 1) a disk-based SAN for access data like
streaming audio and video derivatives, and 2) another disk array or tape
library for preservation master files like our lossless files.  From a
couple of email conversations I've had with some folks over at Indiana
University (where I used to work), the way this could work is to have an
instance of Fedora that runs a repository of metadata and access data, and
either manually move preservation data to a different data store or have a
second instance of Fedora that manages just the preservation data.
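
To make the "manually move" option a bit more concrete, here is a rough
sketch of what I imagine a scripted move would look like: copy a master file
to the preservation store, verify a checksum, and only then delete the
original.  Everything in it is an assumption for illustration: the function
names are made up, the preservation store is assumed to be an ordinary
mounted filesystem, and nothing here touches Fedora itself.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def move_to_preservation(source: Path, preservation_root: Path) -> Path:
    """Copy source into preservation_root, verify it, then remove source.

    Hypothetical helper: preservation_root stands in for whatever mount
    point the preservation store would expose.
    """
    preservation_root.mkdir(parents=True, exist_ok=True)
    target = preservation_root / source.name
    expected = sha256_of(source)

    shutil.copy2(source, target)  # copies the data and file timestamps

    if sha256_of(target) != expected:
        target.unlink()  # discard the bad copy and keep the original
        raise IOError(f"checksum mismatch copying {source} to {target}")

    source.unlink()  # safe to remove only after the copy is verified
    return target
```

One caveat I can already see: if the target is a tape library with a disk
cache in front of it, the read-back check probably only verifies the cached
copy, not what eventually lands on tape.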

The questions I'm trying to get answers to at this moment are:

1. For preservation data, disk or tape?
Probably too big a question to dig into here, but my current feeling is that
for the amount of data in question, disk would be too costly in the long run
to maintain, and tape systems are better equipped to deal with this kind of
situation.

2. What are the issues with using a tape library as the underlying storage
layer in Fedora?
I'm looking at tape library products that can present a filesystem to a
server which writes the data.  The tape library uses an internal disk cache
and spools data out to tape in 2 copies.  One copy stays in the machine, the
other gets sent off-site.  Two products I'm currently investigating are
Sony's Petasite and Dell's ML6000.  Similar products from IBM, I believe,
are also a possibility.

2a. If I use a second instance of Fedora for preservation data, can I really
just point the data storage for that instance to the tape library's exported
filesystem and be done with it, or am I missing pieces of the puzzle, like
Akubra, iRODS, or something else?

3. What are the advantages or disadvantages of using a second Fedora
instance to manage preservation data, over a manual or scripted approach?

4. Anything else?


Many thanks in advance!

...adam