What I've always hoped for is a solution (via Akubra or whatever) that
uses content model declarations to say "if the object is a preservation
master, store it in that (slow?) mapped storage over there; if it is a
delivery surrogate, store it in this (fast?) storage here; if it's a PDF
for an ETD, put it somewhere else", etc., with all the different stores
mapped as managed data.
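
To make the idea concrete, here is a minimal sketch of that kind of routing rule. It is not Akubra's API (Akubra is a Java library); the content model URIs, the store names, and the `choose_store` function are all hypothetical, invented only to illustrate "pick a store from the content model declaration".

```python
# Hypothetical content model URIs mapped to hypothetical store names.
# None of these identifiers come from Fedora or Akubra; they are assumptions.
STORE_FOR_MODEL = {
    "info:fedora/example:preservationMaster": "slow-archive",
    "info:fedora/example:deliverySurrogate": "fast-disk",
    "info:fedora/example:etdPdf": "etd-store",
}
DEFAULT_STORE = "default-store"


def choose_store(content_models):
    """Return the store name for the first content model that has a mapping.

    content_models is the list of content model URIs declared by an object.
    Objects with no recognised content model fall back to DEFAULT_STORE.
    """
    for model in content_models:
        if model in STORE_FOR_MODEL:
            return STORE_FOR_MODEL[model]
    return DEFAULT_STORE
```

For example, `choose_store(["info:fedora/example:etdPdf"])` picks the ETD store, and an object with no recognised model lands in the default store. Every store would still be treated as managed; only the physical location differs.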
I'm with Adam: coping with external storage (we did it in our first
solution, which was all external) is definitely a pain to be avoided
where possible. Our version 2 is all managed, but our biggest file
currently is about 600MB. External storage itself works just fine; the
issue is the workflow code you have to maintain to move the files around
and keep things synchronised.
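
As an illustration of the kind of synchronisation code meant here, the following is a rough sketch of a reconciliation check. It assumes you can export a mapping from datastream identifiers to file paths relative to one storage root; that mapping, the identifiers, and the `reconcile` function are hypothetical and not part of Fedora.

```python
from pathlib import Path


def reconcile(references, storage_root):
    """Compare referenced paths with the files actually under storage_root.

    references: dict mapping a (hypothetical) datastream ID to a path
    relative to storage_root, using forward slashes.
    Returns (missing, orphaned): referenced files that are absent from
    disk, and files on disk that no reference points to.
    """
    root = Path(storage_root)
    on_disk = {
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file()
    }
    referenced = set(references.values())
    missing = sorted(referenced - on_disk)
    orphaned = sorted(on_disk - referenced)
    return missing, orphaned
```

A check like this has to be run and maintained alongside every workflow that moves files, which is the overhead that managed storage avoids.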
Richard
___________________________________________________________________
Richard Green
Consultant to Library & Learning Innovation, University of Hull
managing the Hydra in Hull and Hydra (Hull) Projects
http://hydra.hull.ac.uk
http://projecthydra.org
From: Aaron Collie [mailto:w.a.col...@gmail.com]
Sent: 06 March 2013 4:26 PM
To: Support and info exchange list for Fedora users.
Subject: Re: [fcrepo-user] Managed vs. External Storage
There was another discussion on this list (I believe) about E vs M
storage regarding large files like preservation masters. It made me
wonder how many people are actually managing preservation masters in
Fedora Commons (vs simply managing derivative dissemination files). The
files I am looking at (for example) are 800MB TIFF images (e.g. full
color newspaper scans, maps). We are still developing our solution, but
it involves internally managing the serving files and externally
managing/referencing the preservation masters.
I would love to hear more use cases for this type of application.
Specifically, how people are managing preservation masters--which are
really kind of our bread and butter as digital preservation-minded
people. Maybe I am overlooking something?
-Aaron
On Wed, Mar 6, 2013 at 9:56 AM, aj...@virginia.edu <aj...@virginia.edu>
wrote:
The limits on upload depend radically on your network design. Uploading
from a single node to itself, the limits may be in the several GBs.
Uploading across a limited network, maybe not so much. Fedora has no
inherent limitations on the upload process, but many other pieces of the
system probably will. Fedora will require a little more storage than the
actual size of binary content. That additional storage is rarely a
problem or even interesting to budget, unless you are creating a huge
number of tiny objects.
500k objects is not a large number for a Fedora repository. What will
matter more is how large each is. How large are they, on average?
I want to emphasize what Justin said: "external datastreams are a...
workaround for when you have no other choice." To my mind that's exactly
right.
---
A. Soroka
The University of Virginia Library
On Mar 6, 2013, at 9:40 AM, James, Eric wrote:
> Have any benchmarks been done regarding file size thresholds for
managed datastreams? I.e., how many MB/GB would break the upload process
or just be too slow to be practical? And are there other issues involved
(network, storage, etc.)?
>
> I'm dealing with 500k images on disk, and am considering either
leaving them where they are and using an external datastream to point to
them, or transferring them into managed datastreams. There is also
another collection of large AV files in the GB range, for which external
seems like the way to go at this point.
>
> Thanks,
> Eric
> From: Justin Coyne [jus...@curationexperts.com]
> Sent: Monday, March 04, 2013 9:44 PM
> To: Support and info exchange list for Fedora users.
> Subject: Re: [fcrepo-user] Managed vs. External Storage
>
> I find that it's much easier to put all your objects within the Fedora
repository if you have the option. If you're storing externally, you
lose the ability to do automatic checksum validation and versioning.
Furthermore, you have to make certain that you maintain integrity
between your external store and the reference within Fedora. To me
external datastreams are a great workaround for when you have no other
choice.
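
If you do store externally, the fixity check has to be done by your own code. A minimal sketch, assuming you keep an expected hex digest for each external file somewhere outside Fedora (the function names and the choice of SHA-256 are assumptions for illustration):

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so that large masters are never fully in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(path, expected_hex, algorithm="sha256"):
    """Return True if the file's current digest matches the recorded one."""
    return file_digest(path, algorithm) == expected_hex.lower()
```

Running `verify` on a schedule over every external file gives a basic fixity audit, but it does not replace versioning, and the recorded digests themselves need safe keeping.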
>
> Best Regards,
> Justin Coyne
> Data Curation Experts
>
> On Mon, Mar 4, 2013 at 4:30 PM, Schmidt, Lisa (lschmidt)
<lschm...@msu.edu> wrote:
> At the Michigan State University Archives, we are wrestling with the
question of where to store digital objects/AIPs: within our Fedora
repository (managed), or externally.
>
> What are the issues associated with each approach?
>
> We have external storage available on an IX Systems storage device,
and have been planning to use it for archival storage of AIPs with
pointers in the Fedora repository; it would be synched to a second IX
storage device that would function as our dark archive. We want to do
our due diligence, however, to ensure that this is the right approach.
>
> Thank you,
> Lisa
>
> ____________________________________________________
>
> Lisa M. Schmidt
> Electronic Records Archivist
> University Archives & Historical Collections
> 888 Wilson Road
> Room 101 , Conrad Hall
> Michigan State University
> East Lansing, MI 48824
>
> lschm...@msu.edu
> 1-517-884-6441
>
> _______________________________________________
> Fedora-commons-users mailing list
> Fedora-commons-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
**************************************************
To view the terms under which this email is
distributed, please go to
http://www2.hull.ac.uk/legal/disclaimer.aspx
**************************************************