Hi guys, here are some comments on the storage ideas from my point of view:
> So here's a provocative question to start: Assuming for a moment that
> the core Fedora object model (versioning warts and all) stays the same
> for 4.0, would something like this interface actually be compatible
> with the major objectives we've talked about with respect to High
> Level Storage?

I think the interface is fine, and just like Akubra it should be straightforward to implement on e.g. HDFS/HBase. But shouldn't there be a locking mechanism at the digital object level? In a distributed environment, Fedora instance A could then lock object X while, for example, it is updating the object's datastreams.

It would be nice if the setContent() method could supply the size of the datastream, so that implementations could choose a storage layer based on the size of the stream. That would mitigate e.g. Hadoop's "small files problem": small files could be written to a SequenceFile or even an HBase table. Maybe some kind of hints, as they are present in the Akubra API, would make sense, so that arbitrary information can be passed down into the storage layer.

> c) Transactions.
> Unsure. But I think it's worth stepping back and considering the
> cost/benefit of implementing true ACID transactions across Fedora's
> API for Fedora 4. I know the discussion of HLStorage has touched on
> the possibility of doing this in the past, but it's been very short on
> detail.

IMHO transactions are a feature that users would very much like to see. Transactions seem to invoke a feeling of trust in users. I recently played a bit with implementing a custom PlatformTransactionManager from Spring, which lets you use those beautiful @Transactional annotations instead of handling each transaction programmatically. It's quite easy to implement, although there is still the hard part of rolling back unsuccessful transactions.

> e) Lock-free concurrent updates
> No.
> I think some way of declaring the previously seen state would be
> necessary to achieve this. But again, I'm not sure that
> whole-Fedora-object-locking at a higher level is such a bad thing if
> it's done correctly and doesn't make the single-node-Fedora assumption
> that the locking in DOManager does today.

Yes, as you might have guessed from my comments above, I think object locking would be hugely beneficial in the context of asynchronous writes or a federation of Fedora instances.

> f) Storing entire object in self-contained file archives
> Yes. Although fcrepo-store does split the storage of FedoraObjects and
> managed content, having them stored together (e.g. in AtomZIP) at the
> low level is still possible. It's a question of efficiency.

Hmm, I think that's quite an interesting idea: having each AIP on the filesystem as a single file, especially when thinking about integrating Fedora with some kind of execution service that requests or updates a lot of objects in Fedora. You could dramatically decrease load if the whole intellectual entity could be fetched from the repo in one request, with all its representations, instead of requesting an object first and then issuing one subsequent request per datastream.

--
*frank asseg*
softwareentwicklung
feichtmayrstr. 37
76646 bruchsal
tel.: ++49-7251-322-6073
fax.: ++49-7251-322-6078
mail: frank.as...@congrace.de
web: http://www.congrace.de/
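
A minimal sketch of the size-hint idea from the setContent() comment above, assuming the caller passes the content length down as a string-valued hint. Every name here (SizeAwareRouter, the "content.size" key, the 1 MiB cut-off, the backend labels) is hypothetical and not part of fcrepo-store or Akubra; it only illustrates how a storage layer might route small streams away from plain HDFS files.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from fcrepo-store or Akubra.
final class SizeAwareRouter {
    // Assumed cut-off between "small" and "large" content (1 MiB).
    static final long SMALL_FILE_LIMIT = 1L << 20;
    // Assumed hint key carrying the content length in bytes.
    static final String SIZE_HINT = "content.size";

    private final String smallBackend;
    private final String largeBackend;

    SizeAwareRouter(String smallBackend, String largeBackend) {
        this.smallBackend = smallBackend;
        this.largeBackend = largeBackend;
    }

    // Returns the backend name for the given hints. A missing, negative,
    // or unparseable size falls back to the large-file backend.
    String choose(Map<String, String> hints) {
        String raw = hints.get(SIZE_HINT);
        if (raw == null) {
            return largeBackend;
        }
        try {
            long size = Long.parseLong(raw);
            return (size >= 0 && size < SMALL_FILE_LIMIT) ? smallBackend : largeBackend;
        } catch (NumberFormatException e) {
            return largeBackend;
        }
    }
}

final class SizeHintDemo {
    public static void main(String[] args) {
        // Backend labels are placeholders for e.g. an HBase table and an HDFS file store.
        SizeAwareRouter router = new SizeAwareRouter("hbase-table", "hdfs-file");
        Map<String, String> hints = new HashMap<>();

        hints.put(SizeAwareRouter.SIZE_HINT, "4096");
        System.out.println(router.choose(hints)); // small stream

        hints.put(SizeAwareRouter.SIZE_HINT, String.valueOf(5L << 30));
        System.out.println(router.choose(hints)); // large stream
    }
}
```

Passing the size as one entry in a generic hints map, rather than as a dedicated parameter, would leave room for other arbitrary information to reach the storage layer without changing the method signature.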