Chris, I think transactionality was always one of the tough points in integrating DSpace and Fedora. Most of DSpace's structural/metadata relationships are stored in the DB, and the transactional window is used to cleanly abort any changes should something unexpected occur. Given that, having some form of transactional API in Fedora would make the mapping much easier if Fedora were to replace significant parts of DSpace's database-centric persistence tier.
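To make that concrete, here is a minimal sketch in Java of the kind of unit-of-work wrapper I have in mind. Everything in it is hypothetical: SketchStore and SketchTransaction are names made up for illustration, not part of Fedora, DSpace, or fcrepo-store, and an in-memory map stands in for real storage. It only shows staged changes staying invisible until commit and being dropped on rollback; it does not attempt isolation between concurrent transactions, or durability.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory store; a stand-in for real object storage. */
class SketchStore {
    private final Map<String, String> committed = new HashMap<>();

    /** Returns the committed content for a PID, or null if there is none. */
    String get(String pid) {
        return committed.get(pid);
    }

    /** Opens a unit of work whose changes stay invisible until commit(). */
    SketchTransaction begin() {
        return new SketchTransaction(committed);
    }
}

/** Hypothetical unit of work: buffers changes, then applies or drops them. */
class SketchTransaction {
    private final Map<String, String> target;
    private final Map<String, String> staged = new HashMap<>();
    private final Set<String> removed = new HashSet<>();
    private boolean open = true;

    SketchTransaction(Map<String, String> target) {
        this.target = target;
    }

    /** Stages an add or replace; nothing reaches the store yet. */
    void put(String pid, String content) {
        requireOpen();
        removed.remove(pid);
        staged.put(pid, content);
    }

    /** Stages a delete; nothing reaches the store yet. */
    void delete(String pid) {
        requireOpen();
        staged.remove(pid);
        removed.add(pid);
    }

    /** Applies all staged deletes and writes to the store, then closes. */
    void commit() {
        requireOpen();
        target.keySet().removeAll(removed);
        target.putAll(staged);
        open = false;
    }

    /** Drops all staged changes, leaving the store untouched, then closes. */
    void rollback() {
        staged.clear();
        removed.clear();
        open = false;
    }

    private void requireOpen() {
        if (!open) {
            throw new IllegalStateException("transaction already finished");
        }
    }
}

class TransactionSketch {
    public static void main(String[] args) {
        SketchStore store = new SketchStore();

        // Something unexpected occurs: abort, and the store is unchanged.
        SketchTransaction failed = store.begin();
        failed.put("demo:1", "<foxml/>");
        failed.rollback();
        System.out.println("after rollback: " + store.get("demo:1"));

        // Normal path: changes become visible only at commit.
        SketchTransaction ok = store.begin();
        ok.put("demo:2", "<foxml/>");
        ok.commit();
        System.out.println("after commit: " + store.get("demo:2"));
    }
}
```

The rollback path corresponds to the "cleanly abort any changes" case above. Managed content and asynchronous storage are exactly where a simple buffer like this stops being enough.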
Thinking along the lines of JCR Transaction support...

http://www.day.com/specs/jcr/2.0/21_Transactions.html
http://wiki.alfresco.com/wiki/Introducing_the_Alfresco_Java_Content_Repository_API#Transaction_Management

Cheers,
Mark

On Fri, Mar 9, 2012 at 10:25 AM, Chris Wilper <cwil...@duraspace.org> wrote:
> Hi all,
>
> While considering the design of a fairly low-level generic batch
> utility for Fedora, I started to put together a new interface called
> "FedoraStore" that looked similar to what we've been talking about for
> High Level Storage[1] with 4.0.
>
> http://cwilper.github.com/fcrepo-store/apidocs/com/github/cwilper/fcrepo/store/core/FedoraStore.html
> (the AkubraFedoraStore impl is compatible with Fedora 3.2+'s
> AkubraLowlevelStorage)
>
> The main purpose of this is to aid in the writing of a generic batch
> modify/migrate utility that works with current versions of Fedora. But
> being Actual Working Code (tm), I thought it could also serve as a
> good subject for discussion to get a better understanding of what we
> really want 4.0's storage abstraction to look like.
>
> So here's a provocative question to start: Assuming for a moment that
> the core Fedora object model (versioning warts and all) stays the same
> for 4.0, would something like this interface actually be compatible
> with the major objectives we've talked about with respect to High
> Level Storage?
>
> a) Clustered Fedora instances.
> Yes. As with the old LLStorage interface, this puts all of the storage
> of Fedora objects and managed datastreams behind a single interface.
> In either case, the actual underlying storage can be clustered itself
> (GlusterFS, etc) -- it's really higher level code (caching,
> locking…any kind of state sharing) that will have the final say as to
> whether clustering is doable. Side note: Hazelcast looks like it could
> be really nice for this.
>
> b) Asynchronous Reads & Writes.
> Potentially. In previous discussions we've talked about HLStorage
> having a Result return type for each storage method as a way to easily
> pass back some sort of token or other information to the caller so it
> can check on the status (or associate a future message) with a
> particular async read or write request. It seems likely to me that
> some other form of association could be done, but I haven't thought it
> through much.
>
> c) Transactions.
> Unsure. But I think it's worth stepping back and considering the
> cost/benefit of implementing true ACID transactions across Fedora's
> API for Fedora 4. I know the discussion of HLStorage has touched on
> the possibility of doing this in the past, but it's been very short on
> detail. Now, if we could assume that all Fedora state was persisted in
> a relational database, this would be a non-issue, but we have managed
> content. (I'm assuming for the moment, as previously discussed, that
> RISearch and FieldSearch are outside the "core" for 4.0 and therefore
> would not be updated as part of the transaction.) What's more, there
> continues to be demand for a Fedora that can cope with asynchronous
> reads and writes. As in, "the tape robot is going to take a minute to
> spin up for that content, please stand by". Or "okay, I'll write that
> to the storage cluster in a few minutes; it's super busy right now". It
> seems to me that the absolute easiest way to get transactions with 4.0
> would be to discontinue support of managed content and require a
> relational database for FOXML (Hey, it was worth mentioning). In any
> case, I'm not sure whether transaction semantics would actually need
> to be exposed in the storage API at all…I hope not.
>
> d) Storage Multiplexing.
> Yes. As discussed in the original HLStorage paper, having the object
> in context at the time that managed datastreams are being persisted
> would make it easier to provide the necessary info (e.g. akubra
> "hints") to the underlying impl.
>
> e) Lock-free concurrent updates
> No. I think some way of declaring the previously seen state would be
> necessary to achieve this. But again, I'm not sure that
> whole-Fedora-object-locking at a higher level is such a bad thing if
> it's done correctly and doesn't make the single-node-Fedora assumption
> that the locking in DOManager does today.
>
> f) Storing entire object in self-contained file archives
> Yes. Although fcrepo-store does split the storage of FedoraObjects and
> managed content, having them stored together (e.g. in AtomZIP) at the
> low level is still possible. It's a question of efficiency.
>
> - Chris
>
> [1] https://wiki.duraspace.org/display/FCREPO/High+Level+Storage
>
> _______________________________________________
> Fedora-commons-developers mailing list
> Fedora-commons-developers@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers

--
Mark Diggory
2888 Loker Avenue East, Suite 305, Carlsbad, CA. 92010
Esperantolaan 4, Heverlee 3001, Belgium
http://www.atmire.com