Thanks, Claus, for responding. Unfortunately, upgrading to 2.4.x is not an immediate option. But as I have backported several fixes to the 1.6.x baseline, a solution seems to be in place already.
I'm mostly interested in fully understanding how this deadlock came into existence in the first place. Looking at the code, there is a fixed order of events, namely:

1. Acquire a write lock on the Shared ISM during prepare.
2. Downgrade the write lock to a read lock during commit.
3. Broadcast "update ended" events, e.g. to update the Lucene indexes.
3.1. Lucene acquires a read lock on the Shared ISM.

Specifically, the order of steps 2. and 3. is fixed within the SISM.Update#end() method. However, from my analysis it seems that sometimes (and this is really rare) step 3. is executed while step 2. is not yet effective. And this bends my mind.

Current revisions of SISM have the lines reordered a little, but to me there is no clear indication that a similar situation cannot occur with trunk as well. I guess it all boils down to the question: does all this go back to a bug within the JVM, or is there still some "happens before" indicator missing from the code?

During merge verification we manually switched steps 2. and 3. and could verify that the fixes (specifically JCR-2820) are effective.

Thanks and kind regards
Robert
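
P.S. To make the ordering above concrete, here is a minimal sketch of steps 1. to 3.1. It is not the Jackrabbit code: the class name UpdateOrderSketch and the method runUpdate are made up for illustration, and it uses java.util.concurrent.locks.ReentrantReadWriteLock as a stand-in for whatever locking the Shared ISM actually uses in 1.6.x. The listener thread plays the role of Lucene.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; names do not correspond to Jackrabbit classes.
public class UpdateOrderSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> trace = new ArrayList<>();

    public List<String> runUpdate() {
        // Step 1: acquire the write lock (prepare).
        lock.writeLock().lock();
        trace.add("1: write lock acquired");

        // Step 2: downgrade (commit). Take the read lock while still
        // holding the write lock, then release the write lock.
        lock.readLock().lock();
        lock.writeLock().unlock();
        trace.add("2: downgraded to read lock");

        try {
            // Step 3: broadcast the "update ended" event to a listener
            // running on another thread.
            Thread listener = new Thread(() -> {
                // Step 3.1: the listener needs its own read lock. This only
                // succeeds because the write lock was released in step 2.
                lock.readLock().lock();
                try {
                    trace.add("3.1: listener read lock acquired");
                } finally {
                    lock.readLock().unlock();
                }
            });
            trace.add("3: dispatching update-ended event");
            listener.start();
            // Waiting for the listener is where a deadlock would show up
            // if the write lock were still held at this point.
            listener.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for listener", e);
        } finally {
            lock.readLock().unlock();
        }
        return trace;
    }
}
```

In this sketch, if the dispatch in step 3. ran while the write lock from step 1. was still held, the listener would block on its read lock and join() would never return. That is the shape of the deadlock I am describing.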
