Jamey,

I'm not sure I understand what you want to do, but you could have many objects in your Fedora repository (some representing files, and others representing hierarchical groupings of files) that are related by triples in their respective RELS-EXT datastreams.

Your last paragraph, though, makes it sound like you just want a document versioning system. Is that the case?
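To make the first option a bit more concrete, here is a rough sketch (not a definitive implementation) of generating RELS-EXT RDF/XML for one object per directory and one object per file, each pointing at its parent. The `demo` PID namespace, the dotted PID scheme in `pid_for`, and the choice of `isMemberOf` as the predicate are assumptions made for illustration; ingesting the objects into the repository is not shown.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
REL_NS = "info:fedora/fedora-system:def/relations-external#"


def pid_for(path, namespace="demo"):
    """Hypothetical PID scheme: 'type1/file1.csv' -> 'demo:type1.file1.csv'."""
    return namespace + ":" + ".".join(PurePosixPath(path).parts)


def rels_ext(pid, parent_pid=None):
    """Build RELS-EXT RDF/XML for one object, optionally linked to a parent."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("rel", REL_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": f"info:fedora/{pid}"},
    )
    if parent_pid is not None:
        # Assumption: isMemberOf expresses "lives in this directory/dataset".
        ET.SubElement(
            desc,
            f"{{{REL_NS}}}isMemberOf",
            {f"{{{RDF_NS}}}resource": f"info:fedora/{parent_pid}"},
        )
    return ET.tostring(root, encoding="unicode")


def model_dataset(dataset_pid, file_paths):
    """Return {pid: RELS-EXT XML} for the dataset, its directories, and files."""
    objects = {dataset_pid: rels_ext(dataset_pid)}
    for path in file_paths:
        parent = dataset_pid
        parts = PurePosixPath(path).parts
        # Walk down the path, creating one object per directory level and file.
        for depth in range(1, len(parts) + 1):
            pid = pid_for("/".join(parts[:depth]))
            if pid not in objects:
                objects[pid] = rels_ext(pid, parent)
            parent = pid
    return objects


if __name__ == "__main__":
    files = ["type1/subtype1/file1.csv", "type1/subtype1/file2.csv"]
    for pid, xml in model_dataset("demo:dataset1", files).items():
        print(pid)
        print(xml)
```

Each XML string would become the RELS-EXT datastream of the corresponding object, so no single object ends up carrying thousands of datastreams.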
- Ben

On 1/28/11, Wood, Jamey <jamey.w...@nrel.gov> wrote:
> Sorry to pester, but does anyone have thoughts on this?
>
> Thanks,
> Jamey
>
> From: Jamey Wood <jamey.w...@nrel.gov>
> Date: Tue, 25 Jan 2011 16:12:40 -0700
> To: "fedora-commons-users@lists.sourceforge.net"
> <fedora-commons-users@lists.sourceforge.net>
> Subject: Fedora Commons for Large Datasets with Thousands of Files
>
> Hello,
>
> I'm trying to understand how Fedora Commons might be applied to managing
> datasets that:
>
> * May each consist of several thousand individual files
> * May have files organized in some meaningful hierarchical directory
>   structure (e.g. "type1/subtype1/file1.csv")
> * Would benefit from some form of "whole-object" versioning (along the
>   lines used by the eSciDoc project [1])
>
> An example of one such dataset can be seen at [2] (with overview and
> documentation materials at [3]).
>
> At first, I was assuming that each such dataset would be a single Fedora
> Commons object that would have a separate datastream for each file belonging
> to the dataset. And then whole-object versioning could be implemented using
> a special datastream, as described in the eSciDoc paper.
>
> But after looking through this mailing list's archives, I found the "Max
> number of datastreams of a object" thread (from December) where multiple
> people noted that having Fedora Commons objects with hundreds or thousands
> of datastreams probably isn't a good idea (although there isn't necessarily
> a hard limit preventing it). So now I'm wondering how to best model these
> datasets in Fedora Commons? Or is Fedora Commons simply not the right tool
> for this usage scenario?
>
> One possibility I'm wondering about would be to just create some kind of
> top-level Fedora Commons object that has a pointer to the top-level data
> location (URL), but doesn't attempt to track individual files within the
> dataset. Then if a new revision of the dataset is published, that top-level
> URL pointer might be directed to some new location. Is this a reasonable
> approach? Or would it be considered bad practice?
>
> Any thoughts on this or pointers towards general best practices would be
> appreciated.
>
> Thanks,
> Jamey
>
> 1: https://www.escidoc.org/media/docs/ges-versioning-article.pdf
> 2: ftp://ftp.ncdc.noaa.gov/pub/data/nsrdb-solar/SUNY-gridded-data/
> 3: http://rredc.nrel.gov/solar/old_data/nsrdb/1991-2005/
>
> _______________________________________________
> Fedora-commons-users mailing list
> Fedora-commons-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users