Hi Scott,

On Thu, Mar 7, 2013 at 11:20 AM, Scott Prater <pra...@wisc.edu> wrote:

> Adam,
>
> How do you manage your inventory of videos outside of Fedora?


When a video is created, it starts out as an object in Fedora without any
attached content.  The workflow process then checks that the object exists
before processing the video.  This is all done in our Hydra application, so
we can't really add any videos without adding them to Fedora first.


>  As others
> have recently pointed out, there are some duplication of effort costs
> associated with using external datastreams:  you lose the checksumming,
> the audit paths and backups when a video changes, etc., all functions
> that Fedora provides for managed datastreams, but that need to be
> handled elsewhere when opting for external datastreams.


Yes, very true.  Checksumming is done via the bag.  The user will create a
video with a pid like "changeme:123" which will have a corresponding bag
folder of changeme_123.  The video files are added to changeme_123/data,
uploaded and then checksummed using BagIt utility software.  If the bag
passes, it's moved over to our HSM storage, ingested and then backed up via
Tivoli.
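
For illustration only, the pid-to-bag naming and the payload checksumming
described above might be sketched as follows.  This is not the actual
workflow code: the function names, the SHA-256 default and the 1 MiB chunk
size are assumptions, and only the "manifest-<algorithm>.txt" file name and
the "data" payload directory follow the BagIt convention.

```python
import hashlib
from pathlib import Path

# Assumption: read 1 MiB at a time so multi-hundred-GB files never sit in memory.
CHUNK_SIZE = 1024 * 1024


def pid_to_bag_name(pid: str) -> str:
    """Map a Fedora pid such as 'changeme:123' to a bag folder name."""
    return pid.replace(":", "_")


def file_digest(path: Path, algorithm: str = "sha256") -> str:
    """Checksum one file incrementally and return the hex digest."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(bag_dir: Path, algorithm: str = "sha256") -> Path:
    """Write manifest-<algorithm>.txt listing every file under data/."""
    lines = []
    for path in sorted((bag_dir / "data").rglob("*")):
        if path.is_file():
            relative = path.relative_to(bag_dir).as_posix()
            lines.append(f"{file_digest(path, algorithm)}  {relative}")
    manifest = bag_dir / f"manifest-{algorithm}.txt"
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest
```

Reading in chunks is what keeps memory use flat for files of this size; the
hours of wall-clock time come from disk throughput, which is why the
checksum step is kept separate from the upload.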

Yeah, it's cumbersome and complicated, but it was the only way I could come
up with for dealing with video files that are 500-700 GB each.  These are
uncompressed 10bit video files and uploading/checksumming takes hours, so
the steps need to be separated and performed individually.  At present, we
have about 160 TB of video that's moved on and off of a 10 TB disk
partition.

You are correct that we don't have audit paths if the video content should
change.  But in our case, the video should not change unless it's withdrawn
and deleted from the repository.  Fixity, however, is a problem.  We're not
running regular checksums on content yet.  We just got done ingesting the
majority of our video catalog going back to 1986, so I'm now in the process
of revisiting storage procedures and plan to begin some kind of fixity
check process.  This would be done by running checks on the bags stored on
the filesystem and wouldn't involve Fedora directly.
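
A filesystem-only fixity check of that kind could look roughly like the
sketch below.  It is a hypothetical illustration, not a planned
implementation: it assumes each bag is a directory directly under one
storage root and carries a BagIt-style "manifest-sha256.txt"; the function
names and the choice of SHA-256 are made up for the example.

```python
import hashlib
from pathlib import Path


def verify_bag(bag_dir: Path, algorithm: str = "sha256") -> list[str]:
    """Return payload paths that are missing or no longer match the manifest."""
    failures = []
    manifest = bag_dir / f"manifest-{algorithm}.txt"
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Each manifest line is "<checksum> <relative path>".
        expected, relative = line.split(maxsplit=1)
        target = bag_dir / relative
        if not target.is_file():
            failures.append(relative)
            continue
        digest = hashlib.new(algorithm)
        with target.open("rb") as handle:
            for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected.lower():
            failures.append(relative)
    return failures


def check_all_bags(storage_root: Path, algorithm: str = "sha256") -> dict[str, list[str]]:
    """Verify every bag under storage_root; an empty list means the bag passed."""
    results = {}
    for bag_dir in sorted(storage_root.iterdir()):
        if (bag_dir / f"manifest-{algorithm}.txt").is_file():
            results[bag_dir.name] = verify_bag(bag_dir, algorithm)
    return results
```

Nothing here talks to Fedora; the bags on disk are the only input.  On
HSM-backed storage each read would presumably trigger a recall, so a real
check would likely need to work through the bags in batches.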

> Not that I'm
> questioning your approach;  I'm just curious how you manage it.  We're
> embarking on a path of ingesting thousands of RAW TIFFs now, and we want
> to make sure we don't make wrong decisions concerning storage and Fedora
> datastream management, decisions that will come back to haunt us later.
>

I appreciate the questioning!  It helps me think through my processes and
get feedback on newer/better approaches.

We're also embarking on images and generic digital content.  For that, I
definitely will be using managed datastreams.  That will have a much
smaller storage footprint of a few terabytes, so at this point I'm
envisioning keeping it all on disk with backups to tape.  No HSM
involvement required.


> I am now of the mind that Fedora should do what it does best:  store,
> track, and serve up datastreams.  We've started moving towards handling
> transformations of our datastreams with third party webservices that
> simply ask Fedora for a datastream and then do the transformation
> themselves, and it looks to make our lives much, much simpler.
>

Yes, that's what I use Fedora for and also what I think it does best.  I'm
also going to involve things like a J2K server to do our image
transformations.

...adam
_______________________________________________
Fedora-commons-users mailing list
Fedora-commons-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
