[ 
https://issues.apache.org/jira/browse/DERBY-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175192#comment-13175192
 ] 

Brett Bergquist commented on DERBY-5552:
----------------------------------------

I guess I am confused as well, Kathey, as I had the debugger attached and do see 
it going through the XA code in Derby on the client side.  The application 
server is set up with the ClientXADataSource and I do see it calling xa.commit 
and xa.end, for example.  The ClientXADataSource is required; otherwise the 
error:

        Local transaction already has 1 non-XA Resource: cannot add more resources.

occurs.  So although there is only one database (Derby), it is using XA.  The 
database is being accessed through EJBs, through EclipseLink, and also 
through a custom JCA interface driving Message Driven Beans.

For the test case, I had to limit things to keep my sanity.  So I stopped as 
much access to the database as I could while still triggering the problem.  
Eventually I got down to one thread of control being processed by EJBs which 
do start new transactions.  Even with only this one access going on, I hit the 
lockup issue that I posted.  That is when I found the issue that I mentioned.  
So whether or not this is the real issue, I don't know, but when I tried to get 
as simple a condition as possible, I ran into this.

Thinking now, I don't understand why this would not be hit in the normal case 
of a lock timeout being thrown.  The only thing that I can think of is that 
Activation.checkStatementValidity() is seeing the statement as valid and not 
trying to recompile it.  Why it occurred in my case, where I see the "isValid" 
member set to false, I don't know.  I will try to hook up the debugger and 
determine the difference so that I can understand it better.

I do believe that the code should not swallow an exception such as a lock 
timeout, even if the statement is no longer reported as valid.  This is 
definitely a condition that will cause an infinite loop of processing.
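
To make the concern concrete, here is a minimal sketch of the pattern I am 
worried about.  It is NOT Derby's actual code: the class, the exception type 
and the method names are all invented for illustration, and the retry cap 
exists only so that the sketch terminates.  It assumes a statement that is 
marked invalid and whose recompile always fails with a lock timeout.

```java
public class RecompileLoopSketch {

    /** Hypothetical stand-in for a lock timeout raised while recompiling. */
    static class LockTimeoutException extends RuntimeException {
        LockTimeoutException(String message) {
            super(message);
        }
    }

    /** Hypothetical statement that is invalid and can never be recompiled. */
    static class FakeStatement {
        boolean valid = false;
        int attempts = 0;

        void recompile() {
            attempts++;
            throw new LockTimeoutException("lock timeout during recompile");
        }
    }

    /**
     * Swallows the lock timeout and loops while the statement is invalid.
     * Without the maxAttempts cap (added only for this sketch) the loop
     * would never end, because the statement never becomes valid.
     */
    static int executeSwallowing(FakeStatement s, int maxAttempts) {
        while (!s.valid && s.attempts < maxAttempts) {
            try {
                s.recompile();
                s.valid = true;
            } catch (LockTimeoutException e) {
                // Swallowed: the statement is still invalid, so we retry.
            }
        }
        return s.attempts;
    }

    /** Lets the lock timeout escape, so the caller sees it on the first try. */
    static int executePropagating(FakeStatement s) {
        while (!s.valid) {
            s.recompile();
            s.valid = true;
        }
        return s.attempts;
    }

    public static void main(String[] args) {
        FakeStatement swallowed = new FakeStatement();
        System.out.println("swallowing attempts: "
                + executeSwallowing(swallowed, 1000));

        FakeStatement propagated = new FakeStatement();
        try {
            executePropagating(propagated);
        } catch (LockTimeoutException e) {
            System.out.println("propagated after attempts: "
                    + propagated.attempts);
        }
    }
}
```

In the swallowing variant the caller never learns about the lock timeout and 
the loop only stops because of the artificial cap; in the propagating variant 
the error surfaces after the first attempt.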

Again, I appreciate the help and your time.  If I gain an understanding of how 
the condition is triggered, I will look to write a test case for it.  I am 
reading the Derby testing docs that relate to using JUnit, which I assume is 
the correct path for newer test cases, correct?





                
> Derby threads hanging when using ClientXADataSource and a deadlock or lock 
> timeout occurs
> -----------------------------------------------------------------------------------------
>
>                 Key: DERBY-5552
>                 URL: https://issues.apache.org/jira/browse/DERBY-5552
>             Project: Derby
>          Issue Type: Bug
>          Components: Network Server
>    Affects Versions: 10.8.1.2
>         Environment: Solaris 10, Glassfish V2.1.1,
>            Reporter: Brett Bergquist
>            Priority: Blocker
>         Attachments: appserverstack.txt, client.tar.Z, derby.log, 
> derbystackatshutdown.txt, execute.patch, transactionsleft.txt
>
>
> The issue arises when multiple XA transactions are done in parallel and 
> there is either a lock timeout or a lock deadlock detected.  When this 
> happens the connection is leaked in the Glassfish connection pool and the 
> client thread hangs in 
> "org.apache.derby.client.net.Reply.fill(Reply.java:172)".  
> Shutting down the app server fails because the thread has a lock in 
> "org.apache.derby.client.net.NetConnection40" and another task is calling 
> "org.apache.derby.client.ClientPooledConnection.close(ClientPooledConnection.java:214)"
>  which is waiting for the lock.
> Killing the app server using "kill" and then attempting to shut down the 
> Derby Network Server causes the Network Server to hang.  One of the threads 
> hangs waiting for a lock at 
> "org.apache.derby.impl.drda.NetworkServerControlImpl.removeFromSessionTable(NetworkServerControlImpl.java:1525)"
>  and the "main" thread has this locked at 
> "org.apache.derby.impl.drda.NetworkServerControlImpl.executeWork(NetworkServerControlImpl.java:2242)"
>  and it itself is waiting for a lock which belongs to a thread that is stuck 
> at 
> "org.apache.derby.impl.services.locks.ActiveLock.waitForGrant(ActiveLock.java:118)"
>  which is in the TIMED_WAITING state.
> At this point the Network Server can only be stopped by killing it using "kill".
> There are transactions left even though all clients have been removed.  


        
