[ 
https://issues.apache.org/jira/browse/DERBY-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brett Bergquist updated DERBY-6879:
-----------------------------------
    Attachment: derby-6879-test-2.diff

I have attached a revised patch with an updated test case and the resolution of 
the issue.

The final solution I came up with was to modify XATransactionState.cancel.
The change removes the method-level synchronization. Instead, cancel
synchronizes on the instance to handle the cancelling and rollback work
against the XATransactionState, releases that lock, performs the
connection-level rollback (which synchronizes on the connection), and then
synchronizes on the XATransactionState again to return the connection.

The change ensures that XATransactionState.cancel does not hold a lock on the 
XATransactionState instance while waiting for the lock on the EmbedConnection.
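
As an illustration only (this is not the code in the attached patch), the
locking pattern described above might look roughly like the sketch below.
SketchConnection and SketchTransactionState are hypothetical stand-ins for
EmbedConnection and XATransactionState, and the fields and the
stateLockHeldDuringRollback flag are assumptions added so the behaviour can
be observed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for EmbedConnection; not Derby's real class. */
class SketchConnection {
    final List<String> events = new ArrayList<>();

    // Like the connection-level rollback, this takes the connection's monitor.
    synchronized void rollback() {
        events.add("rollback");
    }
}

/** Hypothetical stand-in for XATransactionState; not Derby's real class. */
class SketchTransactionState {
    private final SketchConnection conn;
    private boolean cancelled;
    private boolean connectionReturned;

    // Records whether this object's monitor was held when rollback was called.
    volatile boolean stateLockHeldDuringRollback;

    SketchTransactionState(SketchConnection conn) {
        this.conn = conn;
    }

    // Deliberately not a synchronized method.
    void cancel() {
        // Step 1: update transaction state under the state lock only.
        synchronized (this) {
            if (cancelled) {
                return; // already cancelled, nothing more to do
            }
            cancelled = true;
        }

        // Step 2: the state lock has been released, so waiting for the
        // connection's monitor cannot block a thread that holds the
        // connection and wants the state lock.
        stateLockHeldDuringRollback = Thread.holdsLock(this);
        conn.rollback();

        // Step 3: re-acquire the state lock to hand the connection back.
        synchronized (this) {
            connectionReturned = true;
        }
    }

    synchronized boolean isCancelled() {
        return cancelled;
    }

    synchronized boolean isConnectionReturned() {
        return connectionReturned;
    }
}
```

The point of the sketch is the lock ordering: at no time does cancel hold the
state monitor while it waits for the connection monitor, so the cycle shown
in the thread dump below cannot form along this path.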


> Engine deadlock between XA timeout handling and cleanupOnError
> --------------------------------------------------------------
>
>                 Key: DERBY-6879
>                 URL: https://issues.apache.org/jira/browse/DERBY-6879
>             Project: Derby
>          Issue Type: Bug
>          Components: Services
>    Affects Versions: 10.10.2.0
>         Environment: Solaris 10.5 on Oracle M5000 
>            Reporter: Brett Bergquist
>         Attachments: derby-6879-test-2.diff, derby-6879-test.diff
>
>
> Deadlock between XA timer cleanup task and the ContextManager.cleanupOnError
> Found one Java-level deadlock:
> =============================
> "DRDAConnThread_34":
>   waiting to lock monitor 0x0000000104b14d18 (object 0xfffffffd9090f058, a 
> org.apache.derby.jdbc.XATransactionState),
>   which is held by "Timer-0"
> "Timer-0":
>   waiting to lock monitor 0x00000001038b96e8 (object 0xfffffffd9090d8b0, a 
> org.apache.derby.impl.jdbc.EmbedConnection40),
>   which is held by "DRDAConnThread_34"
>  
> Java stack information for the threads listed above:
> ===================================================
> "DRDAConnThread_34":
>      at org.apache.derby.jdbc.XATransactionState.cleanupOnError(Unknown 
> Source)
>      - waiting to lock <0xfffffffd9090f058> (a 
> org.apache.derby.jdbc.XATransactionState)
>      at 
> org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(Unknown 
> Source)
>      at 
> org.apache.derby.impl.jdbc.TransactionResourceImpl.cleanupOnError(Unknown 
> Source)
>      at 
> org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown 
> Source)
>      at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown 
> Source)
>      at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown 
> Source)
>      at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown 
> Source)
>      - locked <0xfffffffd9090d8b0> (a 
> org.apache.derby.impl.jdbc.EmbedConnection40)
>      at 
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown 
> Source)
>      at org.apache.derby.impl.jdbc.EmbedPreparedStatement.execute(Unknown 
> Source)
>      at org.apache.derby.iapi.jdbc.BrokeredPreparedStatement.execute(Unknown 
> Source)
>      at org.apache.derby.impl.drda.DRDAStatement.execute(Unknown Source)
>      at 
> org.apache.derby.impl.drda.DRDAConnThread.parseEXCSQLSTTobjects(Unknown 
> Source)
>      at org.apache.derby.impl.drda.DRDAConnThread.parseEXCSQLSTT(Unknown 
> Source)
>      at org.apache.derby.impl.drda.DRDAConnThread.processCommands(Unknown 
> Source)
>      at org.apache.derby.impl.drda.DRDAConnThread.run(Unknown Source)
> "Timer-0":
>      at org.apache.derby.impl.jdbc.EmbedConnection.xa_rollback(Unknown Source)
>      - waiting to lock <0xfffffffd9090d8b0> (a 
> org.apache.derby.impl.jdbc.EmbedConnection40)
>      at org.apache.derby.jdbc.XATransactionState.cancel(Unknown Source)
>      - locked <0xfffffffd9090f058> (a 
> org.apache.derby.jdbc.XATransactionState)
>      at 
> org.apache.derby.jdbc.XATransactionState$CancelXATransactionTask.run(Unknown 
> Source)
>      at java.util.TimerThread.mainLoop(Timer.java:555)
>      at java.util.TimerThread.run(Timer.java:505)
>  
> Found 1 deadlock.
> This deadlock caused Derby to create 18000 transaction recovery logs because 
> the XA transaction did not clean up on timeout.  Rebooting the system would 
> have required a 50-hour boot time to process the transaction logs, so 
> recovery had to be done by restoring a backup of the database taken before 
> the issue occurred.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
