Re: [fcrepo-dev] More food for 4.0 thought: fcrepo-store

frank Fri, 23 Mar 2012 01:04:07 -0700

Hola Guys!

On 03/22/12 16:42, Chris Wilper wrote:
> Forgetting transactions for a second, I was kind of wondering if some
> sort of LockProvider service would be useful (one implementation of
> which would be a hazelcast/cluster-capable one). Higher level code
> that works with fcrepo-store would do something like:
>
> Lock lock = lockProvider.getLock(pid);
> lock.lock();
> try {
>    store.addObject(fedoraObject);
> } finally {
>    lock.unlock();
> }
I think this would be very nice to have in a possibly distributed 
environment, since a service manipulating a couple of objects for 
preservation (which is planned in the SCAPE project) could lock the 
objects which it might manipulate. Since preservation tasks might be 
running for a long time (e.g. a couple of days), the chance of concurrent 
manipulation of those objects gets larger. If the task could lock the 
objects beforehand, such conflicts could be avoided.
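
To make the "lock those objects beforehand" idea a bit more concrete, here 
is a minimal sketch. Everything in it is an assumption for illustration: 
the LockProvider interface is only modelled on the getLock(pid) call quoted 
above, InMemoryLockProvider is a single-JVM stand-in for a cluster-capable 
implementation, and MultiObjectLock is a made-up helper. It acquires the 
locks in sorted PID order so that two tasks locking overlapping sets of 
objects cannot deadlock each other.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical interface, modelled on the getLock(pid) call quoted above.
interface LockProvider {
    Lock getLock(String pid);
}

// Single-JVM stand-in; a clustered implementation would hand out distributed locks.
class InMemoryLockProvider implements LockProvider {
    private final ConcurrentMap<String, Lock> locks = new ConcurrentHashMap<>();

    @Override
    public Lock getLock(String pid) {
        // The same PID always maps to the same lock instance.
        return locks.computeIfAbsent(pid, p -> new ReentrantLock());
    }
}

// Hypothetical helper: holds the locks of several objects for the duration of a task.
class MultiObjectLock implements AutoCloseable {
    private final List<Lock> held = new ArrayList<>();

    MultiObjectLock(LockProvider provider, Iterable<String> pids) {
        // Sort (and de-duplicate) the PIDs so every task acquires locks in the same order.
        SortedSet<String> ordered = new TreeSet<>();
        for (String pid : pids) {
            ordered.add(pid);
        }
        try {
            for (String pid : ordered) {
                Lock lock = provider.getLock(pid);
                lock.lock();
                held.add(lock);
            }
        } catch (RuntimeException e) {
            close(); // release whatever was acquired before the failure
            throw e;
        }
    }

    int size() {
        return held.size();
    }

    @Override
    public void close() {
        // Release in reverse acquisition order.
        for (int i = held.size() - 1; i >= 0; i--) {
            held.get(i).unlock();
        }
        held.clear();
    }
}

class PreservationTaskDemo {
    public static void main(String[] args) {
        LockProvider provider = new InMemoryLockProvider();
        try (MultiObjectLock locks = new MultiObjectLock(provider, Arrays.asList("demo:2", "demo:1"))) {
            // ... long-running preservation work on demo:1 and demo:2 would go here ...
            System.out.println("holding " + locks.size() + " locks");
        }
    }
}
```

For a task running for days, a real implementation would probably also 
need lock timeouts or leases so that a crashed task does not block the 
objects forever; the sketch leaves that out.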


> I agree that size and other info (mime type, etc.) are important for
> implementations to have access to at storage time. Interestingly, the
> way the API currently works, the associated FedoraObject *must* be
> provided to the impl prior to the call to setContent().
Oh ok. I did not know that.

> Are there other hints, not
> necessarily present in the FedoraObject, that we can envision being
> important to making content storage decisions?
Hmm, not at the moment. I'd just like to have information about the size 
in setContent().

> Note that I'm not convinced that having stream-oriented (for managed
> content) and object-oriented (for FedoraObjects) methods at the same
> level in the API is the right move necessarily -- it just seemed more
> practical to implement in the short term because embedding
> stream-getting/setting functionality directly inside the FedoraObject
> interface would tie instances to a particular FedoraStore impl...which
> makes them harder to move around, if that makes sense.
Since these would be completely Fedora-internal APIs that are not 
exposed to the user but only employed by developers, I don't think 
having stream-based API methods alongside "normal" methods is a bad 
thing, since developers will understand the need to handle big data in a 
stream-based fashion rather than filling up memory.
  This also gives you the possibility to access datastreams 
independently of the objects. If getting/setting a stream involved 
calling a method on the FedoraObject, the system would have to fetch the 
object beforehand for every request. With a stream-based API like the one 
you proposed, datastreams can be fetched/updated with their ID only.
  In short: I'm all for a stream-based API as proposed :)
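
As a rough illustration of fetching/updating a datastream by ID only, 
with the size passed along at storage time, here is a minimal sketch. It 
is not the actual fcrepo-store API: ContentStore, InMemoryContentStore 
and the sizeHint parameter are names I made up for this example, and the 
in-memory map is only a stand-in for real blob storage (which would of 
course not buffer large content in memory).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface; not the actual fcrepo-store API.
interface ContentStore {
    // sizeHint is the expected length in bytes, or -1 if unknown.
    void setContent(String pid, String datastreamId, InputStream content, long sizeHint)
            throws IOException;

    // Looks the content up by IDs alone; no FedoraObject has to be loaded first.
    InputStream getContent(String pid, String datastreamId) throws IOException;
}

// In-memory stand-in for a real blob store.
class InMemoryContentStore implements ContentStore {
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

    private static String key(String pid, String datastreamId) {
        return pid + "/" + datastreamId;
    }

    @Override
    public void setContent(String pid, String datastreamId, InputStream content, long sizeHint)
            throws IOException {
        // A real implementation could use sizeHint to choose a storage location;
        // here it only pre-sizes the buffer (capped at 1 MiB).
        int initial = sizeHint > 0 ? (int) Math.min(sizeHint, 1L << 20) : 8192;
        ByteArrayOutputStream out = new ByteArrayOutputStream(initial);
        byte[] buf = new byte[8192];
        int n;
        while ((n = content.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        blobs.put(key(pid, datastreamId), out.toByteArray());
    }

    @Override
    public InputStream getContent(String pid, String datastreamId) throws IOException {
        byte[] data = blobs.get(key(pid, datastreamId));
        if (data == null) {
            throw new IOException("no such datastream: " + key(pid, datastreamId));
        }
        return new ByteArrayInputStream(data);
    }
}

class ContentStoreDemo {
    public static void main(String[] args) throws IOException {
        ContentStore store = new InMemoryContentStore();
        byte[] xml = "<doc/>".getBytes("UTF-8");
        store.setContent("demo:1", "DS1", new ByteArrayInputStream(xml), xml.length);
        try (InputStream in = store.getContent("demo:1", "DS1")) {
            System.out.println("first byte: " + (char) in.read());
        }
    }
}
```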

> In your experience, were you working with already-transaction
> resources (via JTA?) As mentioned on the call, I think if we attempt
> to implement transactions ourselves, there's all kinds of opportunity
> for failure. But if we can "wrap" already-transactional resources
> while still keeping the ability to integrate non-transactional blob
> storage, that seems more palatable to me.
In this test I had a service write to a datasource connected via 
Hibernate and to a filesystem in a single transaction. The way I did this 
was quite straightforward. I defined an Action class which held all the 
necessary information about one atomic operation (e.g. one Action can 
be: write object to datasource, or: write XML to file system) and kept the 
Actions in a LinkedList. I kept an index in the Transaction telling the 
system which Action is the current one, and if an error occurs, the system 
iterates back through the LinkedList, undoing every Action it encounters.
So the system used Hibernate's org.hibernate.Session and Transaction for 
handling transactions on the datasource level, but simple handwritten 
logic for handling transactions on the filesystem.
  This logic has all been wrapped into a PlatformTransactionManager from 
Spring, which I wove into the service using @Transactional annotations 
and a Spring bean configuration for the datasource, filesystem and the 
transaction manager.
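
The Action/undo part can be sketched roughly as follows. This is not the 
original code from that test, just a minimal reconstruction of the idea 
with names I picked for illustration (Action, CompensatingTransaction, 
RecordingAction); the Hibernate and Spring wiring is left out entirely. 
It assumes that an Action which fails in execute() leaves nothing behind, 
so only the Actions that completed are undone.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical names throughout; a sketch of the approach, not the original code.
interface Action {
    void execute() throws Exception;

    void undo();
}

class CompensatingTransaction {
    private final LinkedList<Action> actions = new LinkedList<>();
    private int current = 0; // index of the next Action to execute

    void add(Action action) {
        actions.add(action);
    }

    void run() throws Exception {
        try {
            while (current < actions.size()) {
                actions.get(current).execute();
                current++;
            }
        } catch (Exception e) {
            rollback();
            throw e;
        }
    }

    private void rollback() {
        // Walk back from the last completed Action, undoing each one in reverse order.
        // The Action that failed is assumed to have left nothing behind.
        for (int i = current - 1; i >= 0; i--) {
            actions.get(i).undo();
        }
        current = 0;
    }
}

// Test double that records what happened instead of touching a datasource or filesystem.
class RecordingAction implements Action {
    private final String name;
    private final boolean fail;
    private final List<String> log;

    RecordingAction(String name, boolean fail, List<String> log) {
        this.name = name;
        this.fail = fail;
        this.log = log;
    }

    @Override
    public void execute() {
        if (fail) {
            throw new IllegalStateException(name + " failed");
        }
        log.add("do " + name);
    }

    @Override
    public void undo() {
        log.add("undo " + name);
    }
}

class CompensatingTransactionDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        CompensatingTransaction tx = new CompensatingTransaction();
        tx.add(new RecordingAction("write object to datasource", false, log));
        tx.add(new RecordingAction("write xml to file system", true, log));
        try {
            tx.run();
        } catch (Exception e) {
            System.out.println("rolled back: " + log);
        }
    }
}
```

Note that undo() itself can fail (e.g. a file cannot be deleted), which is 
exactly the kind of failure opportunity mentioned above; the sketch does 
not try to handle that.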

> The original point made in the paper was that there was a way to not
> *force* locking to occur (via optimistic concurrency control) if the
> storage interface provided a way to declare the previously-seen state
> with each request.
But this would mean fetching the existing objects from the storage layer 
before applying any update in order to be able to compare the versions, 
which also introduces quite an overhead, depending on the object's 
complexity. And when dealing with large datastreams it seems quite 
inefficient to compare the currently stored version with the previously 
seen version supplied with a put request, only to overwrite it with yet 
another version given in the same put request.

Have Fun!


Frank

-- 
*frank asseg*
softwareentwicklung
feichtmayrstr. 37
76646 bruchsal
tel.: ++49-7251-322-6073
fax.: ++49-7251-322-6078
mail: frank.as...@congrace.de
web: http://www.congrace.de/


