> So here's a provocative question to start: Assuming for a moment that
> the core Fedora object model (versioning warts and all) stays the same
> for 4.0, would something like this interface actually be compatible
> with the major objectives we've talked about with respect to High
> Level Storage?

Here's my perspective:

HighLevelStorage was designed as a data-oriented interface that
explicitly made the fedora object a fundamental and atomic unit of work
with respect to storage and associated "data-oriented" services that
might be plugged in.  This was a key simplification with clear
boundaries that would enable storage implementations the flexibility to
adopt a variety of locking, optimization, and/or communication
strategies within each unit of work - as it is guaranteed that each unit
of work is "complete" and fully defined with respect to a single fedora
object.  Transactions could later be laid on top of that,  but would not
change the fact that each individual operation within a transaction
would be a complete-object-version unit of work.

setContent() could be problematic in that light, though I'm not sure.
For example, one potential use case of HighLevelStorage is that the
storage impl might decide a managed datastream's physical storage
location based upon some property of the object (content model, for
example).  Do the semantics of setContent() allow a FedoraStore impl to
"make note that some content is available, hold onto a reference to the
InputStreams, but only act upon it in response to update(), possibly
making storage decisions based upon the content of the FedoraObject"?  
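
To make the question concrete, here is a rough sketch of the kind of
impl I have in mind.  Everything in it is hypothetical: the class name,
the method signatures, and the content-model rule are invented for
illustration and are not the actual FedoraStore API.  It only shows
setContent() recording a reference and update() doing the real work
once an object property is known.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical store, NOT the real FedoraStore API: setContent() only
 * records a reference; update() performs the actual write.
 */
final class DeferredContentStore {
    /** Content offered via setContent() but not yet written anywhere. */
    private final Map<String, InputStream> pending = new LinkedHashMap<>();

    /** Stand-in for physical storage: "location/pid/datastreamId" -> bytes. */
    final Map<String, byte[]> written = new LinkedHashMap<>();

    /** Makes note that content is available; nothing is read yet. */
    void setContent(String datastreamId, InputStream content) {
        pending.put(datastreamId, content);
    }

    /** Chooses a location from an object property, then drains the streams. */
    void update(String pid, String contentModel) {
        // Assumed placement rule, purely for illustration.
        String location = "demo:videoCModel".equals(contentModel) ? "bulk" : "default";
        for (Map.Entry<String, InputStream> entry : pending.entrySet()) {
            String key = location + "/" + pid + "/" + entry.getKey();
            written.put(key, drain(entry.getValue()));
        }
        pending.clear();
    }

    /** Reads the stream to the end and closes it. */
    private static byte[] drain(InputStream in) {
        try (InputStream source = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = source.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the semantics of setContent() require the bytes to be durable by the
time it returns, an impl like this would not be allowed, and the
placement decision could not depend on the object.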

While I don't consider lock-free concurrent updates to be fundamental to
HighLevelStorage per se, the interface was designed to explicitly
declare a handle to prior state in order to provide flexibility and
avoid the need for explicit locking and shared state.  Forcing the use
of internal or external locks and/or transactions limits the opportunity
to leverage certain kinds of horizontal scalability.  Indeed, the
initial motivation for HighLevelStorage for me was to horizontally-scale
fedora itself by eliminating shared state and locking between instances,
utilizing only the native capabilities of the storage impl (in this case
HBase).  With the FedoraStore interface as it stands right now, locking
(or single-object transactions) *must* be used in order to create a
fairly lengthy critical section, making such horizontal scaling more
complicated and less effective.
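
As a minimal sketch of what "a handle to prior state" buys you, assume
a store whose update takes both the prior and the replacement version.
The names ObjectVersion and CasStore are made up for this example, and
an in-memory ConcurrentMap stands in for whatever conditional-write
primitive the real storage impl offers.  The update succeeds only if
the prior version is still current, so no lock is held between read and
write.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical immutable snapshot of one complete object version. */
final class ObjectVersion {
    final String pid;
    final long revision;
    final String payload;

    ObjectVersion(String pid, long revision, String payload) {
        this.pid = pid;
        this.revision = revision;
        this.payload = payload;
    }

    /** Two versions are "the same prior state" if pid and revision match. */
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ObjectVersion)) {
            return false;
        }
        ObjectVersion other = (ObjectVersion) o;
        return pid.equals(other.pid) && revision == other.revision;
    }

    @Override
    public int hashCode() {
        return Objects.hash(pid, revision);
    }
}

/** Hypothetical store: updates are conditional on the declared prior version. */
final class CasStore {
    private final ConcurrentMap<String, ObjectVersion> current = new ConcurrentHashMap<>();

    /** Returns false if an object with this pid already exists. */
    boolean add(ObjectVersion created) {
        return current.putIfAbsent(created.pid, created) == null;
    }

    /** Atomic compare-and-set: fails if someone else replaced 'prior' first. */
    boolean update(ObjectVersion prior, ObjectVersion replacement) {
        return current.replace(prior.pid, prior, replacement);
    }

    ObjectVersion read(String pid) {
        return current.get(pid);
    }
}
```

A caller that gets false back from update() re-reads the object and
retries or reports a conflict.  The instances themselves share no state
beyond the store.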

When used in the same place as ILowlevelStorage, providing a reference to the
"to-be-replaced" version upon update is a fairly natural thing to do.
DOManager would need to retrieve the old version of an object anyway in
order to correctly populate the updated version, so there really is no
additional overhead in supplying a reference to it to the storage impl.
In fact, having a reference to both versions of the object may even make
certain implementations of HighLevelStorage plugins more efficient.
Consider a plugin that calculates the diff of triples to send off for
indexing.  It would be handy to have the metadata of the old version
right there in order to be able to dereference the proper datastream for
comparison, especially if that datastream is not versionable.
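
A sketch of that diff step, assuming the plugin has already extracted
the triples of each version into sets (triples are plain strings here
for brevity; TripleDiff is a name invented for this example, not an
existing Fedora class):

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical result of comparing the triples of two object versions. */
final class TripleDiff {
    /** Triples present only in the new version (to be indexed). */
    final Set<String> added;
    /** Triples present only in the old version (to be retracted). */
    final Set<String> removed;

    private TripleDiff(Set<String> added, Set<String> removed) {
        this.added = Collections.unmodifiableSet(added);
        this.removed = Collections.unmodifiableSet(removed);
    }

    /** Computes both set differences without modifying the inputs. */
    static TripleDiff between(Set<String> oldTriples, Set<String> newTriples) {
        Set<String> added = new HashSet<>(newTriples);
        added.removeAll(oldTriples);
        Set<String> removed = new HashSet<>(oldTriples);
        removed.removeAll(newTriples);
        return new TripleDiff(added, removed);
    }
}
```

With only the new version in hand, the plugin would have to look up the
old triples itself, which is exactly the extra round trip the
prior-version reference avoids.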

  -Aaron


_______________________________________________
Fedora-commons-developers mailing list
Fedora-commons-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers