Hi Matteo,

On Sun, Aug 16, 2009 at 5:42 AM, Matteo Boschini<[email protected]> wrote:
> First of all, thanks for the reply and the answer.
>
> On Fri, Aug 14, 2009 at 8:56 PM, Chris Wilper <[email protected]> wrote:
>>
>> Hi Matteo,
>>
>> When ingesting an object with managed content, you have a couple
>> options. One is to provide it in base64 format inside the FOXML
>> itself. (export an object with some managed content in the "archive"
>> context to see an example). The other option is to provide it by
>> reference. With this option, you give it an HTTP URL (you have to
>> have a webserver fronting the content) and Fedora copies the content
>> into the repository at ingest time.
>>
>> More detail:
>>
>> For managed content, the way content is included (or referenced)
>> within FOXML depends on a couple factors.
>>
>> If the FOXML is about to be ingested, managed datastreams can be
>> included as base64-encoded content right inside the XML. It can also
>> be referenced via URL (http). In the latter case, Fedora retrieves
>> the content automatically as part of the ingest operation.
>
> I fear I can not use such an option, since I have something like 1M objects,
> each with a few Mbytes datastream associated.

Yes, the inline base64 option is best for a few small files.

I think the better option for you is to temporarily put a webserver in
front of the content wherever it's sitting, and refer to it that way on
the way in (via a URL such as http://localhost/path/to/datastream).

There's an even better option under development, which is the ability to
use a file:/// URL to load the content if it's local. But we're not
sure when that capability will be ready. To follow its progress, see
https://fedora-commons.org/jira/browse/FCREPO-453

The fastest possible option (which I personally haven't tried, but maybe
others have?) is to pre-stage the content and FOXML in the
$FEDORA_HOME/data/ directory, then run the rebuilder. See below for more
on that.

>> If the object is inside the repository already, managed content is
>> referenced within the FOXML using a special kind of reference (e.g.
>> "changeme:6+DS1+DS1.0"). This kind of reference is only used inside
>> the repository to get the content from the appropriate place (the low
>> level file storage) when appropriate.
>
> So, in the FOXML I should have a line that reads "<foxml:contentLocation
> REF="changeme:6+DS1+DS1.0"....
> Is this correct ?

That is the form that Fedora changes it to once it's ingested, but you
can't ingest with such a reference because Fedora doesn't yet know which
path on disk it is associated with.

>> You typically don't use or
>> create these kind of references.
>
> why ? In my naiveness, I thought this might be useful, at least for my
> purposes: I have all datastreams already on server, I create a batch XML,
> all the objects XML and then ingest them, with the datastreams already in
> the server.

Actually, it's probably possible (haven't tried) to author your objects
like this, and manually put the FOXML and datastreams in low level
storage, then run a rebuild instead of trying to send each one through
via ingest.
As mentioned above, when you ingest through the API with such
references, Fedora doesn't know how to resolve them. But running a
rebuild would basically reconstruct the mapping of those IDs to
locations on disk in the Fedora tables.

This would certainly be the fastest approach to getting your content
bulk-loaded into the repository if it's already on the local disk. But
I can't guarantee it will work. If you try it, just make sure to:

- Put the already-constructed FOXML somewhere under the "objects"
  directory, and the managed content somewhere under the "datastreams"
  directory.
- Make sure the filenames of the above files are what Fedora expects;
  for FOXML, the filename should be the PID with "_" (underscore) in
  place of ":" (colon). For the managed datastreams, the filename
  should be the same as the ID used in the contentLocation (e.g.
  pid+dsId+dsVersionId), also with the ":" in the PID part replaced
  with "_".

After running fedora-rebuild and starting Fedora again, make sure you
can access the managed content via http (e.g.
http://localhost:8080/fedora/get/demo:PID/DSID)

If you try this option, I (and I'm sure others) would be interested in
seeing how it works out.

Good luck :)

- Chris

_______________________________________________
Fedora-commons-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
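
For anyone scripting the pre-staging approach, the filename convention
described in the message above can be sketched in Python as follows.
This is only an illustration of the naming rule as stated in the
message (":" in the PID replaced by "_", datastream files named
pid+dsId+dsVersionId). The function names are hypothetical and not part
of Fedora, and the sketch says nothing about which subdirectory under
"objects" or "datastreams" a file should go in.

```python
def foxml_filename(pid: str) -> str:
    """Filename for an object's FOXML: the PID with ':' replaced by '_'."""
    return pid.replace(":", "_")


def content_location_ref(pid: str, ds_id: str, ds_version_id: str) -> str:
    """Internal reference as it appears in contentLocation REF
    (pid+dsId+dsVersionId), with the PID left unchanged."""
    return "+".join([pid, ds_id, ds_version_id])


def datastream_filename(pid: str, ds_id: str, ds_version_id: str) -> str:
    """Filename for a managed datastream: same as the contentLocation ID,
    but with ':' in the PID part replaced by '_'."""
    return "+".join([foxml_filename(pid), ds_id, ds_version_id])


if __name__ == "__main__":
    # Example values taken from the reference quoted in the thread.
    print(foxml_filename("changeme:6"))
    print(content_location_ref("changeme:6", "DS1", "DS1.0"))
    print(datastream_filename("changeme:6", "DS1", "DS1.0"))
```

With the reference quoted in the thread ("changeme:6+DS1+DS1.0"), the
FOXML file would be named changeme_6 and the datastream file
changeme_6+DS1+DS1.0. Whether a rebuild accepts files staged this way is
untested, as the message itself notes.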
