You might like to get in touch with the AIMS project: http://www2.lib.virginia.edu/aims/
which is tackling exactly the kinds of issues to which you refer. Amongst their
intentions is to develop a workflow based on Hydra components that can handle
the kind of arrangement and description problem that you so rightly describe as
difficult to solve in Fedora by itself.

---
A. Soroka
Digital Research and Scholarship R & D and Online Library Environment
the University of Virginia Library

On Dec 21, 2010, at 10:52 AM, Peter Cliff wrote:

> I do not know about M. Jallud's domain, but this brings out something the
> futureArch project at the University of Oxford have wrestled with. Here we
> needed to ingest disk images. Each image is itself a nice, self-contained
> file system. Some are small floppy disks, others are larger hard drive images
> and the like. The smallest of these (so far) is 2GB and contains several
> thousand files (I don't have the exact figure to hand, but suffice to say
> more than the number of data streams it'd be sensible to attach to a Fedora
> object - though many of those files are probably OS junk, etc.).
>
> In an ideal world, each disk image would be appraised and the individual
> useful files extracted and put into a repository individually. In reality,
> sifting through a disk image of that number of files is about as onerous as
> sifting through a large number of boxes and so it can take time and staff and
> thus the disk image needs preserving until we get those resources to address
> it. Further, a bit-by-bit copy of the disk may contain useful research data
> in itself...
>
> In deciding if we use Fedora as repository for these disk images, the
> question was how to model the image and its files in Fedora and we thought of
> two ways:
>
> 1) Ingest the disk image and add a datastream per file, as per this thread.
> As you can imagine, that isn't a great way to use Fedora...
>
> 2) Break the image up into files and ingest each and create a contents list
> with associated file system metadata, etc. with each file.
> This seems doable, but it seems a large overhead just to use Fedora.
>
> Which led to the conclusion that Fedora probably wasn't the tool *for this
> particular job* (don't flame me - I'm well aware of the many good uses for
> Fedora!) but this has been bugging me ever since and perhaps we're just the
> victims of a "desire to map preexisting persistence architectures"... :-)
>
> Pete Cliff
> Bodleian Library
>
> On 21 Dec 2010, at 15:15, <aj...@virginia.edu> wrote:
>
>> That is the point at which I was getting-- I wonder if M. Jallud's domain is
>> being effectively and efficiently represented in Fedora.
>>
>> Something I see a great deal in early use of Fedora is the desire to map
>> preexisting persistence architectures directly onto the repository. E.g. the
>> expectation that a "directory of files" will become an "object of
>> datastreams".
>>
>> I don't know what M. Jallud is thinking and I don't mean to imply any
>> criticism, but I do wonder about any Fedora-based architecture featuring
>> objects with thousands of datastreams. It can be objectively said that such
>> an architecture is not at all idiomatic.
>>
>> ---
>> A. Soroka
>> Digital Research and Scholarship R & D and Online Library Environment
>> the University of Virginia Library
>>
>> On Dec 21, 2010, at 10:06 AM, Alex Rodriguez Lopez wrote:
>>
>>> Hi.
>>>
>>> Maybe I'm missing something here, but wouldn't it be a better approach to
>>> create new objects (each with 1 (or some, but not 100s) datastream) for
>>> each file and have them relate to the primary object
>>> https://wiki.duraspace.org/display/FCR30/Digital+Object+Relationships ?
>>>
>>> Instead of having 1 object with 1000s datastreams, you have 1 object
>>> linked to 1000s objects (each with one datastream).
>>>
>>> Unless you *REALLY* need all to reside in one big XML...
>>>
>>> Pierre-Yves JALLUD, 21-12-2010 14:52:
>>>> Thanks for your answers.
>>>> That confirms my impression that the objects I
>>>> wanted to store in FedoraCommons are not suited to this kind of
>>>> system. I'll require the users to split their archives into an
>>>> acceptable number of files. They used to have a maximum of 1000 or 2000
>>>> datastreams (exceptionally) and FC has acceptable response times. That will
>>>> be the limit of my system.
>>>> Thank you again and greetings
>>>>
>>>>> I am wondering a little about the data model in play here. I may have
>>>>> missed an earlier part of this conversation, but I wonder if you could
>>>>> describe your domain problem a little, M. Jallud?
>>>>> Perhaps we can find a more efficient and idiomatic way to use Fedora's
>>>>> CMA than is now obvious to you... to have more than a few dozen
>>>>> datastreams in a content model is very unusual and
>>>>> implies the possibility of useful refactoring.
>>>>>
>>>>> ---
>>>>> A. Soroka
>>>>> Digital Research and Scholarship R & D and Online Library Environment
>>>>> the University of Virginia Library
>>>>>
>>>>> On Dec 20, 2010, at 9:00 AM, Asger Askov Blekinge wrote:
>>>>>
>>>>>> Sounds about right, but this is not a hard limit.
>>>>>>
>>>>>> As you know, Fedora stores the datastreams in one big xml file.
>>>>>>
>>>>>> What is the maximum size of xml files? How many elements can there be in
>>>>>> an xml list? How long do you want to wait for fedora to parse this
>>>>>> object? Those are the relevant questions, and by answering them, you
>>>>>> will have answered your original question.
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> On Mon, 2010-12-20 at 14:54 +0100, Pierre-Yves JALLUD wrote:
>>>>>>> Hi everyone,
>>>>>>> I'm using 3.2.1 version of FedoraCommons. I wonder what is the maximum
>>>>>>> number of datastreams that we can add in a single object. My experiments
>>>>>>> seem to demonstrate that this number is around 32000 (32768?...). Is
>>>>>>> that true? Is that still true in the latest versions?
>>>>>>>
>>>>>>> Thanks for your answers.
>>>>>>> Pierre-Yves

_______________________________________________
Fedora-commons-users mailing list
Fedora-commons-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
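
The alternative Alex Rodriguez Lopez suggests in this thread (one object per file, each related to a primary object, instead of one object with thousands of datastreams) could be sketched roughly as follows. This is only an illustration under assumptions, not a recommended implementation: the PID scheme, the "CONTENT" datastream ID and the helper names are hypothetical, the choice of the isPartOf relationship is just one option from Fedora's relationship ontology, and nothing here talks to a real repository. It only plans the child objects and builds the RELS-EXT RDF each one would carry.

```python
from xml.sax.saxutils import quoteattr

# Namespace of the Fedora 3.x relationship ontology used in RELS-EXT.
FEDORA_REL_NS = "info:fedora/fedora-system:def/relations-external#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def child_pid(parent_pid, index):
    """Hypothetical PID scheme: parent 'demo:image1' -> child 'demo:image1-f0'."""
    return f"{parent_pid}-f{index}"


def rels_ext(pid, parent_pid):
    """Build RELS-EXT RDF/XML stating that `pid` isPartOf `parent_pid`."""
    about = quoteattr(f"info:fedora/{pid}")
    parent = quoteattr(f"info:fedora/{parent_pid}")
    return (
        f'<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:rel="{FEDORA_REL_NS}">\n'
        f"  <rdf:Description rdf:about={about}>\n"
        f"    <rel:isPartOf rdf:resource={parent}/>\n"
        f"  </rdf:Description>\n"
        f"</rdf:RDF>"
    )


def plan_objects(parent_pid, file_paths):
    """Plan one child object per file, each with a single content datastream."""
    plan = []
    for i, path in enumerate(file_paths):
        pid = child_pid(parent_pid, i)
        plan.append({
            "pid": pid,
            "datastreams": {"CONTENT": path},  # hypothetical datastream ID
            "rels_ext": rels_ext(pid, parent_pid),
        })
    return plan


if __name__ == "__main__":
    # Two files from a (hypothetical) disk image become two small objects.
    for obj in plan_objects("demo:image1", ["boot.ini", "docs/letter.doc"]):
        print(obj["pid"], obj["datastreams"])
        print(obj["rels_ext"])
```

With this shape the primary object stays small, and the number of files only affects how many objects exist, not how large any single object's XML becomes.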