Not normal as it stayed this way for 2.5 days until the customer called me. I
took stack traces of both the Derby Network Server and the Glassfish
application server every 6 seconds for one minute. All show the exact same
state.
I do believe the Derby Network Server is okay at the point of the stack traces
as customer was able to shutdown the Derby Network Server and restart it to
clear the problem, but until this thread in the Glassfish Server which is a
Derby client call, returns the system is locked:
"PMCollectionActivationManager.newPmTestResult" daemon prio=3
tid=0x000000010253e800 nid=0x243 runnable [0xfffffffd264fc000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at org.apache.derby.client.net.Reply.fill(Unknown Source)
at org.apache.derby.client.net.Reply.ensureALayerDataInBuffer(Unknown
Source)
at org.apache.derby.client.net.Reply.readDssHeader(Unknown Source)
at org.apache.derby.client.net.Reply.startSameIdChainParse(Unknown
Source)
at
org.apache.derby.client.net.NetXAConnectionReply.readLocalXACommit(Unknown
Source)
at
org.apache.derby.client.net.NetXAConnection.readLocalXACommit_(Unknown Source)
at org.apache.derby.client.net.NetXAConnection.readCommit(Unknown
Source)
at org.apache.derby.client.net.NetConnection.readXACommit_(Unknown
Source)
at org.apache.derby.client.am.Connection.readCommit(Unknown Source)
at org.apache.derby.client.am.Connection.readAutoCommit(Unknown Source)
at org.apache.derby.client.am.Connection.flowClose(Unknown Source)
at org.apache.derby.client.am.Connection.closeResourcesX(Unknown Source)
at org.apache.derby.client.am.Connection.closeResources(Unknown Source)
- locked <0xfffffffdad887980> (a
org.apache.derby.client.net.NetConnection40)
at org.apache.derby.client.net.NetConnection.closeResources(Unknown
Source)
- locked <0xfffffffdad887980> (a
org.apache.derby.client.net.NetConnection40)
at org.apache.derby.client.ClientPooledConnection.close(Unknown Source)
- locked <0xfffffffdad887860> (a
org.apache.derby.client.ClientXAConnection40)
at org.apache.derby.client.ClientXAConnection.close(Unknown Source)
- locked <0xfffffffdad887860> (a
org.apache.derby.client.ClientXAConnection40)
at com.sun.gjc.spi.ManagedConnection.destroy(ManagedConnection.java:276)
>From looking at the Derby Network Server stack trace, I think there should be
>some indication that Derby Network Server is doing something related to this
>but I see nothing in the threads of the Derby Network Server. It is as if
>the Derby Network Server has lost this XA transaction and the response is
>never going to be returned to this client call.
So I guess the question is, with this state information, how can a client be
waiting on the server forever but the server has no indication that it is
serving this client and will never respond. It seems to me if the socket
connection were closed, this "read" would return with an error, but this has
not happened and after 2.5 days I don't think the server side is going to
respond.
Note the errors that are in the derby.log at this time:
Thu May 10 08:10:52 EDT 2012 Thread[DRDAConnThread_23,5,main] (DATABASE
= csemdb), (DRDAID = ????????.??-736900152376645348{85251}), Execution
failed because of a Distributed Protocol Error: DRDA_Proto_SYNTAXRM; CODPNT
arg = 0; Error Code Value = 3. Plaintext connection attempt from an SSL
enabled client?
Thu May 10 08:10:52 EDT 2012 : Execution failed because of a
Distributed Protocol Error: DRDA_Proto_SYNTAXRM; CODPNT arg = 0; Error Code
Value = 3. Plaintext connection attempt from an SSL enabled client?
Thu May 10 08:10:52 EDT 2012 : Execution failed because of a
Distributed Protocol Error: DRDA_Proto_SYNTAXRM; CODPNT arg = 0; Error Code
Value = 3. Plaintext connection attempt from an SSL enabled client?
org.apache.derby.impl.drda.DRDAProtocolException: Execution failed
because of a Distributed Protocol Error: DRDA_Proto_SYNTAXRM; CODPNT
arg = 0; Error Code Value = 3. Plaintext connection attempt from an SSL
enabled client?
There are no SSL enabled clients in use. So it appears like there something in
the protocol broke and the client thread is waiting for something that is never
going to be received.
Also suspicious is that 5 seconds before in the derby.log is:
There was an XA transaction associated with the connection being closed. The
transaction is going to be rolled back. The transaction Xid is
(4871251,f19f0201d1f69e27636776776e6a73767230312c7365727665722c5033373030,636776776e6a73767230312c7365727665722c50333730302c00).
Exception in thread "DRDAConnThread_61" java.lang.NullPointerException
at
org.apache.derby.impl.jdbc.EmbedConnection.cancelRunningStatement(Unknown
Source)
at org.apache.derby.jdbc.XATransactionState.cancel(Unknown Source)
at
org.apache.derby.jdbc.ResourceAdapterImpl.cancelXATransaction(Unknown Source)
at
org.apache.derby.impl.drda.DRDAXAProtocol.rollbackCurrentTransaction(Unknown
Source)
at org.apache.derby.impl.drda.DRDAConnThread.closeSession(Unknown
Source)
at org.apache.derby.impl.drda.DRDAConnThread.run(Unknown Source)
Which indicates a NPE during the cleanup of an XA transaction. So I suppose I
should look at how the NPE can happen and what effect that might have as far as
responding to a client. Also I suppose I should look at what happens with the
protocol exception and what effect is might have on responding to a client.
Neither of the two threads referenced "DRDAConnThread_23" or "
DRDAConnThread_61" are present in the Derby Network Server stack trace either.
I do appreciate the you taking a look at this Dag.
-----Original Message-----
From: Dag H. Wanvik [mailto:[email protected]]
Sent: Monday, May 14, 2012 3:12 PM
To: [email protected]
Subject: Re: Deadlock problem with Derby 10.8.2.2 in production
"Bergquist, Brett" <[email protected]> writes:
> In our production environment, a deadlock occurred. The problem was
> triggered by some problem in derby, but unfortunately, the error message
> returned to the client is:
>
> [#|2012-05-10T08:10:47.761-0400|WARNING|sun-appserver2.1|org.eclipse.p
> ersistence.session.file:/opt/canoga/canogaview/glassfish/domains/domai
> n1/applications/j2ee-apps/csem/csem-core-api.jar-csem-PU|_ThreadID=66;
> _ThreadName=PMCollectionActivationManager.samResetDetected;_RequestID=
> 9e8852a5-9136-4096-94b1-b3ccbf95d810;|
> Local Exception Stack:
> Exception [EclipseLink-4002] (Eclipse Persistence Services -
> 1.1.3.v20091002-r5404):
> org.eclipse.persistence.exceptions.DatabaseException
> Internal Exception: java.sql.SQLException: DERBY SQL error: SQLCODE:
> -1, SQLSTATE: XJ001, SQLERRMC: java.lang.NullPointerException XJ001.U
Do you have access to derby.log at the time this happens?
Of the server sessions threads DRDAConnThreads (597): 555 are cached on waiting
on the server side waiting to start servering new sessions, 42 seem to be
involved in current connections (blocked on socket read). Looks normal..
Dag