Hola Guys!

On 03/22/12 16:42, Chris Wilper wrote:
> Forgetting transactions for a second, I was kind of wondering if some
> sort of LockProvider service would be useful (one implementation of
> which would be a hazelcast/cluster-capable one). Higher level code
> that works with fcrepo-store would do something like:
>
> Lock lock = lockProvider.getLock(pid);
> lock.lock();
> try {
>     store.addObject(fedoraObject);
> } finally {
>     lock.unlock();
> }

I think this would be very nice to have in a possibly distributed environment, since a service manipulating a set of objects for preservation (as planned in the SCAPE project) could lock the objects it might manipulate. Preservation tasks might run for a long time (e.g. a couple of days), so the chance of concurrent manipulation of those objects gets larger. If the task could lock those objects beforehand, such conflicts could be avoided.
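To make the idea a bit more concrete, here is a minimal single-JVM sketch of such a LockProvider. Only getLock(pid) and the lock/try/finally pattern come from the quoted snippet; the interface shape, the InMemoryLockProvider class, its ConcurrentHashMap registry and the demo class are assumptions for illustration, not existing fcrepo-store code. A cluster-capable implementation would hand out distributed locks instead.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo: mirrors the lock/try/finally pattern from the quoted mail.
class LockProviderDemo {
    public static void main(String[] args) {
        LockProvider lockProvider = new InMemoryLockProvider();
        Lock lock = lockProvider.getLock("demo:1");
        lock.lock();
        try {
            // store.addObject(fedoraObject) would go here
            System.out.println("holding lock for demo:1");
        } finally {
            lock.unlock();
        }
    }
}

// Assumed interface; only getLock(pid) appears in the quoted snippet.
interface LockProvider {
    Lock getLock(String pid);
}

// Assumed single-JVM implementation: one ReentrantLock per PID.
class InMemoryLockProvider implements LockProvider {
    private final ConcurrentMap<String, Lock> locks = new ConcurrentHashMap<>();

    @Override
    public Lock getLock(String pid) {
        // computeIfAbsent ensures all callers get the same Lock for a given PID
        return locks.computeIfAbsent(pid, k -> new ReentrantLock());
    }
}
```

Note that a ReentrantLock is owned by a thread and this sketch never removes entries from the map, so it would not be enough on its own for preservation tasks that hold locks for days or across nodes.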

> I agree that size and other info (mime type, etc.) are important for
> implementations to have access to at storage time. Interestingly, the
> way the API currently works, the associated FedoraObject *must* be
> provided to the impl prior to the call to setContent().

Oh ok, I did not know that.

> Are there other hints, not
> necessarily present in the FedoraObject, that we can envision being
> important to making content storage decisions?

Hmm, not at the moment. I'd just like to have information about the size in setContent().

> Note that I'm not convinced that having stream-oriented (for managed
> content) and object-oriented (for FedoraObjects) methods at the same
> level in the API is the right move necessarily -- it just seemed more
> practical to implement in the short term because embedding
> stream-getting/setting functionality directly inside the FedoraObject
> interface would tie instances to a particular FedoraStore impl...which
> makes them harder to move around, if that makes sense.

Since these would be completely Fedora-internal APIs, not exposed to the user but only used by developers, I don't think having stream-based API methods alongside "normal" methods is a bad thing: developers will understand the need to handle big data in a stream-based fashion rather than filling up memory. This also gives you the possibility to access datastreams independently of the objects. If getting/setting a stream involved calling a method on the FedoraObject, the system would have to fetch the object beforehand for every request. With a stream-based API like the one you proposed, datastreams can be fetched/updated with their ID only. In short: I'm all for a stream-based API as proposed :)

> In your experience, were you working with already-transaction
> resources (via JTA?) As mentioned on the call, I think if we attempt
> to implement transactions ourselves, there's all kinds of opportunity
> for failure.
> But if we can "wrap" already-transactional resources
> while still keeping the ability to integrate non-transactional blob
> storage, that seems more palatable to me.

In this test I had a service write to a datasource connected via Hibernate and to a filesystem within one transaction. The way I did this was quite straightforward. I defined an Action class which held all the necessary information about the atomic operations (e.g. one Action can be: write object to datasource, or: write XML to file system), with the Actions kept in a LinkedList. I kept an index in the Transaction telling the system which Action is the current one, and if an error occurs, the system iterates back up the LinkedList, undoing any Actions it encounters.

So the system used Hibernate's org.hibernate.Session and Transaction for handling transactions on the datasource level, but simple handwritten logic for handling transactions on the filesystem. This logic was all wrapped into a PlatformTransactionManager from Spring, which I wove into the service using @Transactional annotations and a Spring bean configuration for the datasource, the filesystem and the transaction manager.

> The original point made in the paper was that there was a way to not
> *force* locking to occur (via optimistic concurrency control) if the
> storage interface provided a way to declare the previously-seen state
> with each request.

But this would mean fetching the existing objects from the storage layer before applying any update in order to compare the versions, which also introduces quite an overhead, depending on the object's complexity. And when dealing with large datastreams it seems quite inefficient to compare the currently stored version with the previously seen version supplied in a put request, only to overwrite it with yet another version given in the same request.

Have Fun!

Frank

--
*frank asseg*
softwareentwicklung
feichtmayrstr.
37 76646 bruchsal
tel.: ++49-7251-322-6073
fax.: ++49-7251-322-6078
mail: frank.as...@congrace.de
web: http://www.congrace.de/

_______________________________________________
Fedora-commons-developers mailing list
Fedora-commons-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers