[
https://issues.apache.org/jira/browse/DERBY-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15368858#comment-15368858
]
ASF subversion and git services commented on DERBY-6879:
--------------------------------------------------------
Commit 1751978 from [~bryanpendleton] in branch 'code/trunk'
[ https://svn.apache.org/r1751978 ]
DERBY-6879: Engine deadlock between XA timeout handling and cleanupOnError
This patch was contributed by Brett Bergquist (brett at thebergquistfamily dot
com)
The original problem to be solved is that a connection that is performing a
XA transaction that discovers an error that must be cleaned up is going to
have a lock on a EmbedConnection and will then require a lock on the
XATransactionState.
And if the same XA transaction times out (because of the XA transaction
timer value) while the original XA transaction is being worked, then the
timeout handling is going to have a lock on the XATransactionState and then
require a lock on the EmbedConnection, triggering the deadlock.
The change in the patch fixes this problem by altering the locking on
the XATransactionState when invoked via the timeout by removing the
synchronization on the "cancel" method and instead using inline
synchronization on the "XATransactionState" while its internals are being
altered.
Then, it releases the XATransactionState lock to allow the
EmbedConnection.rollback method to be invoked without the lock.
Finally, it acquires the lock again to finish cleaning up the
XATransactionState.
> Engine deadlock between XA timeout handling and cleanupOnError
> --------------------------------------------------------------
>
> Key: DERBY-6879
> URL: https://issues.apache.org/jira/browse/DERBY-6879
> Project: Derby
> Issue Type: Bug
> Components: Services
> Affects Versions: 10.10.2.0
> Environment: Solaris 10.5 on Oracle M5000
> Reporter: Brett Bergquist
> Attachments: derby-6879-2016-07-05.diff, derby-6879-2016-07-08.diff,
> derby-6879-test.diff, svnstatus.txt
>
>
> Deadlock between XA timer cleanup task and the ContextManager.cleanupOnError
> Found one Java-level deadlock:
> =============================
> "DRDAConnThread_34":
> waiting to lock monitor 0x0000000104b14d18 (object 0xfffffffd9090f058, a
> org.apache.derby.jdbc.XATransactionState),
> which is held by "Timer-0"
> "Timer-0":
> waiting to lock monitor 0x00000001038b96e8 (object 0xfffffffd9090d8b0, a
> org.apache.derby.impl.jdbc.EmbedConnection40),
> which is held by "DRDAConnThread_34"
>
> Java stack information for the threads listed above:
> ===================================================
> "DRDAConnThread_34":
> at org.apache.derby.jdbc.XATransactionState.cleanupOnError(Unknown
> Source)
> - waiting to lock <0xfffffffd9090f058> (a
> org.apache.derby.jdbc.XATransactionState)
> at
> org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(Unknown
> Source)
> at
> org.apache.derby.impl.jdbc.TransactionResourceImpl.cleanupOnError(Unknown
> Source)
> at
> org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown
> Source)
> at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown
> Source)
> at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown
> Source)
> at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown
> Source)
> - locked <0xfffffffd9090d8b0> (a
> org.apache.derby.impl.jdbc.EmbedConnection40)
> at
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown
> Source)
> at org.apache.derby.impl.jdbc.EmbedPreparedStatement.execute(Unknown
> Source)
> at org.apache.derby.iapi.jdbc.BrokeredPreparedStatement.execute(Unknown
> Source)
> at org.apache.derby.impl.drda.DRDAStatement.execute(Unknown Source)
> at
> org.apache.derby.impl.drda.DRDAConnThread.parseEXCSQLSTTobjects(Unknown
> Source)
> at org.apache.derby.impl.drda.DRDAConnThread.parseEXCSQLSTT(Unknown
> Source)
> at org.apache.derby.impl.drda.DRDAConnThread.processCommands(Unknown
> Source)
> at org.apache.derby.impl.drda.DRDAConnThread.run(Unknown Source)
> "Timer-0":
> at org.apache.derby.impl.jdbc.EmbedConnection.xa_rollback(Unknown Source)
> - waiting to lock <0xfffffffd9090d8b0> (a
> org.apache.derby.impl.jdbc.EmbedConnection40)
> at org.apache.derby.jdbc.XATransactionState.cancel(Unknown Source)
> - locked <0xfffffffd9090f058> (a
> org.apache.derby.jdbc.XATransactionState)
> at
> org.apache.derby.jdbc.XATransactionState$CancelXATransactionTask.run(Unknown
> Source)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
>
> Found 1 deadlock.
> This deadlock caused Derby to create 18000 transaction recovery logs because
> of the XA transaction that did not cleanup in the timeout. Rebooting the
> system would cause a 50 hour boot up time to process the transaction logs so
> recovery had to be done by going to a backup database before the issue
> occurred.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)