It seems that the problem is quite serious. Is anyone running Jackrabbit in a production environment who has found a workaround for this problem?
I am working on a content management system that requires a lot of content I/O, and a lot of versioning will take place.

-----Original Message-----
From: Miro Walker [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 11, 2006 1:31 AM
To: [email protected]
Subject: Re: About Issue JCR-546

> My best advice for now has been to explicitly synchronize on the
> repository instance whenever you are doing versioning operations. Note
> that you can still do normal read and write operations concurrently
> with versioning, so this isn't as bad as it could be. Perhaps we
> should put that synchronization inside the versioning methods until
> the concurrency issues are solved...

The problem here is that "versioning operations" covers quite a lot. For us the real nasty is cloning nodes between workspaces, as we've used a content model that maps releases to workspaces. Publishing a release therefore involves cloning an entire workspace (which takes a few tens of minutes). During this period no other write operations can take place.

Putting synchronisation code inside the versioning methods would mean that the entire application locks up during this period, while having it outside in our own app means that we can be a bit more flexible with how we handle locking (e.g. use locks that time out with an error rather than allowing the application to be completely locked for 30-60 mins at a time).

There are a few areas of the code that cause this sort of problem. The other big one is indexing. In order to support a home-brewed failover mechanism for active-passive clustering we need to delete search indexes on failover (as they are likely to be corrupt in the event of failover). On subsequent startup the application needs to reindex each workspace independently when it is first accessed. This takes a few minutes to do, again locking users out while it runs.

I don't think there is a "quick fix" other than to go in and spend some time fixing the existing scenarios where deadlock can occur and doing some hardcore testing of concurrency issues.

Miro
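
For anyone wanting to try the application-level locking described above, here is a minimal sketch of a lock that times out with an error instead of blocking indefinitely. It is not Jackrabbit code: the class name VersioningGuard and its run method are hypothetical, and it assumes a single guard instance is shared by every caller that performs versioning or cloning against the same repository.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical helper (not part of Jackrabbit): serializes long-running
 * versioning-style operations behind one lock, and fails fast with a
 * TimeoutException when the lock cannot be acquired in time.
 */
class VersioningGuard {
    // Assumption: one guard (and so one lock) per repository instance.
    private final ReentrantLock lock = new ReentrantLock();
    private final long timeoutMillis;

    VersioningGuard(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Runs the operation while holding the lock, or throws if the wait times out. */
    <T> T run(Callable<T> operation) throws Exception {
        if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new TimeoutException(
                "versioning lock not acquired within " + timeoutMillis + " ms");
        }
        try {
            return operation.call();
        } finally {
            // Always release, even if the operation throws.
            lock.unlock();
        }
    }
}
```

A caller would wrap each versioning or clone call in guard.run(...) and report the TimeoutException to the user, so a 30-60 minute publish shows up as an error for other writers rather than a hung application. Normal reads and writes that do not go through the guard are unaffected.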
