Hi Pete,

Managed content is convenient for cases where you're dealing with
relatively small files (measured in MB) and you don't want to worry
about allocating storage paths or hosting the content separately via
HTTP to provide access.  It's also convenient for use cases where you
need to update/delete data periodically because Fedora's APIs allow
you to do this through one interface.

If you're sending in a lot of managed content, the most efficient
(official) way to do it currently is to send it in by reference.
That is, make sure its control group is "M" and provide an HTTP URL
so Fedora can suck it in at ingest time.  This is more efficient than
the UploadServlet approach (which FedoraClient uses) for a couple of
reasons, the major one being that the UploadServlet has to store the
content in a temporary location first, then move it over to the
ultimate managed location later.  This gets really noticeable with
big files (both time- and space-wise).
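
To make "by reference" concrete, here is a rough sketch of what such a
datastream might look like when it is built for ingest.  It assumes FOXML
1.1 element and attribute names; the function name, datastream ID, and URL
are made up for illustration, so check the result against the schema your
Fedora version expects.

```python
import xml.etree.ElementTree as ET

# Assumed FOXML namespace; verify against your Fedora release.
FOXML_NS = "info:fedora/fedora-system:def/foxml#"


def datastream_by_reference(ds_id, url, mime_type, label, control_group="M"):
    """Return a FOXML datastream (as a string) whose content is given by URL.

    Hypothetical helper for illustration only, not part of any Fedora client.
    """
    if control_group not in ("M", "E", "R"):
        raise ValueError("by-reference content needs control group M, E or R")
    ET.register_namespace("foxml", FOXML_NS)
    ds = ET.Element(
        f"{{{FOXML_NS}}}datastream",
        {"ID": ds_id, "STATE": "A",
         "CONTROL_GROUP": control_group, "VERSIONABLE": "true"},
    )
    version = ET.SubElement(
        ds,
        f"{{{FOXML_NS}}}datastreamVersion",
        {"ID": ds_id + ".0", "MIMETYPE": mime_type, "LABEL": label},
    )
    # The content itself is not embedded; only its URL is recorded here.
    ET.SubElement(
        version,
        f"{{{FOXML_NS}}}contentLocation",
        {"TYPE": "URL", "REF": url},
    )
    return ET.tostring(ds, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical URL on a local web server.
    print(datastream_by_reference(
        "ISO", "http://localhost/landing/disk.iso",
        "application/octet-stream", "Disk image"))
```

With control group "M" the URL only needs to resolve at ingest time.
Passing control_group="E" instead would describe the same file as
Externally Referenced content.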

> I've been toying with the idea of having our file "landing zone" the Web
> root of a Web server (meaning just two copies) and pointing the Fedora
> ingest at that - anyone tried it?

Yes. You may also want to check out this approach:
http://fedora.fiz-karlsruhe.de/docs/Wiki.jsp?page=ManagedContentRetrieval
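
For the landing-zone idea quoted above, the only glue needed is a mapping
from a file's path under the web root to the URL that Fedora would be
given.  A minimal sketch, assuming POSIX-style paths; the function name,
host name, and directories are hypothetical:

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def landing_zone_url(base_url, web_root, file_path):
    """Map a file under the web root to the URL the web server exposes.

    Raises ValueError if file_path does not start with web_root.
    """
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(web_root))
    # quote() percent-encodes spaces and similar characters but keeps "/".
    return base_url.rstrip("/") + "/" + quote(relative.as_posix())


if __name__ == "__main__":
    print(landing_zone_url(
        "http://ingest.example.org/",
        "/srv/landing",
        "/srv/landing/batch 1/disk.iso"))
```

The resulting URL is what would be supplied as the datastream location at
ingest.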

If you're dealing with a lot of large files (measured in GB+), you
may ultimately find it more practical to use Externally Referenced
content because it doesn't require the movement of existing data.
This does require more up-front thought about how you'll store and
manage content outside of the Fedora APIs.

- Chris

On Mon, Feb 23, 2009 at 10:01 AM, Pete Cliff <[email protected]> wrote:
> Hello!
>
> My name is Pete Cliff and I've started working with Fedora in anger only
> recently at the University of Oxford. Regarding these messages, I'm also
> struggling to ingest non-Web accessible files and working my way into
> the source code at the moment to see if I can make that happen, but
> would welcome any pointers.
>
> The suggested use of "uploadFile()" works well for small objects, but if
> I want to upload a 20GB file (that will be one of our smaller objects)
> then it becomes an expensive operation, both in terms of time (a
> 700 MB ISO transfers over the network in 2 minutes 30 seconds; going via
> Fedora this goes up to 4 minutes 30 seconds, and larger files are worse)
> and disk space (we need three times the file size as temporary storage).
> There is also the security risk of having three copies of the object
> lying around at a time...
>
> I've been toying with the idea of having our file "landing zone" the Web
> root of a Web server (meaning just two copies) and pointing the Fedora
> ingest at that - anyone tried it?
>
> Failing this, could anyone give me a clearer idea of the difference
> between "Managed Content" and "Externally Referenced"? I've done a few
> things with Externally Referenced content - like verify checksums - and
> that seems to work fine, so why should I ever want "Managed content"?
> (Access control maybe?).
>
> Sorry for so many questions from a "noob"! :-)
>
> Pete Cliff
> Software Engineer, FutureArch
> http://futurearchives.blogspot.com/
>
> -----Original Message-----
> ------------------------------
> Message: 6
> Date: Thu, 19 Feb 2009 13:30:57 +0100
> From: Pierre-Yves JALLUD <[email protected]>
> Subject: [Fedora-commons-developers] Methods to add Datastreams
> To: fedora-commons-developers
>        <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi all,
> I'm using API-M to add datastreams to the fedora objects, but I have
> problems with the dsLocation argument: it must be a URL. And my files
> aren't accessible by the web...
> Are there other functions to add datastreams?... using local files for
> example.
>
> thanks
> Pierre-Yves
>
> ------------------------------
>
> Message: 7
> Date: Thu, 19 Feb 2009 14:39:18 +0100
> From: "arne anka" <[email protected]>
> Subject: Re: [Fedora-commons-developers] Methods to add Datastreams
> To: fedora-commons-developers
>        <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; format=flowed; delsp=yes; charset=utf-8
>
>> Are there other functions to add datastreams?... using local files for
>> example.
>
> use the FedoraClient class and uploadFile() -- the string returned
> serves as dsLocation.
>
>
>
> ------------------------------------------------------------------------------
> Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
> -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
> -Strategies to boost innovation and cut costs with open source participation
> -Receive a $600 discount off the registration fee with the source code: SFAD
> http://p.sf.net/sfu/XcvMzF8H
> _______________________________________________
> Fedora-commons-developers mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers
>
