Hi Aaron, I read your writeup again last night and it got me thinking about a couple topics in detail.
First, I like the idea of concentrating on what amounts to whole-object CRUD operations. I think where you started was with the notion that a running repository has a primary instance of this interface, through which all reads and writes of DigitalObject instances must pass. As Asger pointed out, the "write-oriented" side of the interface (or something close to it) could be leveraged to provide a common way to send object updates to various independent modules (alternate indexes, etc) within a running repository. The actual work of sending object updates to various registered modules could be handled by a given HighLevelStorage impl, but at least one of the "sinks" also needs to be capable of providing a DigitalObject getter.

Looking at the interface as you currently have defined on page 5:

> interface HighLevelStorage {
>   void add(DigitalObject object)

This is straightforward. I'm immediately thinking of memory constraints, though. In particular, since managed content needs to come through this door, it needs to be accessible as a stream. Not particular to this method, but since we'd want this interface to be pretty stable long-term, re-evaluating the DigitalObject interface's suitability for this purpose would be good.

>   void update(DigitalObject oldVers, DigitalObject newVers)

Yes, providing both versions to this method is important for the reasons you describe.

>   DigitalObject read(String pid)

How about org.fcrepo.common.PID instead of String?

>   void remove(String pid)
> }

Initially, I thought providing the PID was enough here, but for the same concurrency control reasons behind read(..), why not:

  void remove(DigitalObject oldVers)

This would provide an opportunity to veto a remove() if the object had recently been changed by another thread. Related: should veto-if-changed be part of the published contract of this interface?
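To make the veto-if-changed question a bit more concrete, here is a rough in-memory sketch of how update(oldVers, newVers) and remove(oldVers) might refuse to act on a stale object. Everything in it is an assumption for illustration only: ObjectSnapshot is a hypothetical stand-in for DigitalObject (just a PID and a version stamp), StaleObjectException and InMemoryHighLevelStorage are made-up names, and a real impl would compare whatever change marker DigitalObject actually carries.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for DigitalObject: only a PID and a version stamp.
final class ObjectSnapshot {
    final String pid;
    final long version; // assumed change marker, e.g. derived from a last-modified date

    ObjectSnapshot(String pid, long version) {
        this.pid = pid;
        this.version = version;
    }
}

// Hypothetical exception signalling that the caller's "old" version is stale.
class StaleObjectException extends RuntimeException {
    StaleObjectException(String pid) {
        super("Object changed concurrently: " + pid);
    }
}

// In-memory sketch of the four operations with veto-if-changed semantics.
final class InMemoryHighLevelStorage {
    private final Map<String, ObjectSnapshot> store = new ConcurrentHashMap<>();

    void add(ObjectSnapshot object) {
        if (store.putIfAbsent(object.pid, object) != null) {
            throw new IllegalStateException("Already exists: " + object.pid);
        }
    }

    ObjectSnapshot read(String pid) {
        return store.get(pid);
    }

    void update(ObjectSnapshot oldVers, ObjectSnapshot newVers) {
        if (!oldVers.pid.equals(newVers.pid)) {
            throw new IllegalArgumentException("PID mismatch");
        }
        // compute() runs atomically per key; throwing leaves the mapping untouched.
        store.compute(oldVers.pid, (pid, current) -> {
            if (current == null || current.version != oldVers.version) {
                throw new StaleObjectException(pid);
            }
            return newVers;
        });
    }

    void remove(ObjectSnapshot oldVers) {
        store.compute(oldVers.pid, (pid, current) -> {
            if (current == null || current.version != oldVers.version) {
                throw new StaleObjectException(pid);
            }
            return null; // returning null removes the mapping
        });
    }
}
```

The point of the sketch is only that remove(oldVers) can share the same check as update(..); whether the veto belongs in the published contract or is left to each impl is still the open question.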
RE: Multiplexing

I basically agree with what you say in section 3.1: that HighLevelStorage would be the right place to provide per-blob "hints" for underlying Akubra impls to make multiplexing decisions. This makes sense because Akubra is (by design) ignorant of Fedora semantics, so whatever is sending content in must be able to do whatever translation is needed.

Where it gets interesting, I think, is when you consider the kinds of hints that should be passed from one level of storage abstraction to the next. For example, do you simply pass in the id of the target Akubra store as a "hint", as you did in your writeup, or do you pass in things like mime type, size, etc, and delegate to whatever rule processing is in the underlying storage system to make the final decision about which store it actually goes to?

My intuition is that a combination may be best: If the multiplexing decision can be made on basic blob metadata like size, mimetype, etc, it's best to delegate it rather than making the decision in a layer that *must* be Fedora-aware. If it can't, it's best to either translate to hints that can be used at the next level, or failing that, provide the target store id based on some rules that were evaluated at the higher level.

RE: Non-blob storage

As you point out, this interface provides a clear plug-in point for non-blob-oriented persistence strategies to be implemented. One of the big strengths of Fedora has been this ability to "crawl the files" and reconstitute indexes in disaster recovery scenarios, and this obviously doesn't preclude doing that as an option. But it reminds me that having (at a minimum) the ability to iterate what's been stored is important to providing any kind of re-constitution of secondary indexes. That may apply here. In other words, should the HighLevelStorage interface be Iterable<DigitalObject>?
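Coming back to the multiplexing hints for a moment, the "combination" approach could look roughly like the sketch below: the lower layer first tries rules over plain blob metadata, and only falls back to a store id supplied by the Fedora-aware layer when no rule matches. None of this is Akubra API. BlobHints, StoreSelector, and the store ids in the usage are hypothetical names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical per-blob hints handed down by the Fedora-aware layer.
final class BlobHints {
    final String mimeType;
    final long sizeBytes;
    final String targetStoreId; // optional explicit choice from the higher level; may be null

    BlobHints(String mimeType, long sizeBytes, String targetStoreId) {
        this.mimeType = mimeType;
        this.sizeBytes = sizeBytes;
        this.targetStoreId = targetStoreId;
    }
}

// Hypothetical selector living in the Fedora-ignorant storage layer.
final class StoreSelector {
    private static final class Rule {
        final Predicate<BlobHints> matches;
        final String storeId;

        Rule(Predicate<BlobHints> matches, String storeId) {
            this.matches = matches;
            this.storeId = storeId;
        }
    }

    private final List<Rule> rules = new ArrayList<>();
    private final String defaultStoreId;

    StoreSelector(String defaultStoreId) {
        this.defaultStoreId = defaultStoreId;
    }

    StoreSelector addRule(Predicate<BlobHints> matches, String storeId) {
        rules.add(new Rule(matches, storeId));
        return this;
    }

    String select(BlobHints hints) {
        // 1. Prefer rules that need only basic blob metadata (size, mime type).
        for (Rule rule : rules) {
            if (rule.matches.test(hints)) {
                return rule.storeId;
            }
        }
        // 2. Otherwise honor a store id decided by rules at the higher level.
        if (hints.targetStoreId != null) {
            return hints.targetStoreId;
        }
        // 3. Nothing applies: use the configured default store.
        return defaultStoreId;
    }
}
```

With rules such as addRule(h -> h.sizeBytes > 1_000_000, "large-store"), a small text/xml blob carrying targetStoreId "rdf-store" would end up in "rdf-store", while a 5 MB video would go to "large-store" regardless of any explicit id. Whether the explicit id should instead win over the metadata rules is a design choice this sketch does not settle.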
- Chris