That is exactly what I just saw looking at the code. Thanks for pointing this to me.
-----Original Message----- From: Kristian Waagan [mailto:[email protected]] Sent: Thursday, March 01, 2012 10:12 AM To: [email protected] Subject: Re: How to present a heap dump to the list - Was: Had Derby 10.8.2.2 fail today and need some thoughts On 01.03.2012 15:55, Bergquist, Brett wrote: > I think I can explain the > "org.apache.derby.jdbc.XATransactionState$CancelXATransactionTask" instances > that are piling up in memory. Haven't read what you say below, I really have to run, but see also DERBY-4137 [1]. Maybe you're a good reason why we should implement the more complex solution :) Cheers, -- Kristian [1] https://issues.apache.org/jira/browse/DERBY-4137 > > I have derby.properties configured to timeout XA transactions at 15 minutes. > The system is processing about 6M or so XA transactions a day or about > 70/second and 99.9% of these are going to work and not be canceled by the > Derby XA transaction timer. So each of these invokes "cancel" when the XA > transaction is finalized or canceled or committed. The > CancelXATransactionTask instance is marked as canceled when this done. > > The java.util.Timer does not remove these canceled tasks until their original > schedule time occurs however. When that happens, the task is seen as > canceled and removed from the queue that is maintained by java.util.Timer. > > So from what I see, CancelXATransactionTask are being added to the > java.util.Timer queue at a rate of about 70/second and are being canceled but > each of these is sticking around in the queue for 15 minutes. So these build > up in the queue for at least 15 minutes of time and then start being purged > out. I guess this should reach a steady state after a while where adding to > the queue and the removing from the queue reaches an equilibrium. > > The java.util.Timer class does have "purge" method. Since most XA > transactions are not going to timeout and the CancelXATransactionTask is > going to be canceled, I am wondering if it might not be wise to schedule a > call to "purge" periodically to remove these canceled timers from the Timer's > queue early? > > > > -----Original Message----- > From: Bergquist, Brett [mailto:[email protected]] > Sent: Thursday, March 01, 2012 9:01 AM > To: [email protected] > Subject: How to present a heap dump to the list - Was: Had Derby > 10.8.2.2 fail today and need some thoughts > > I have a couple of heap dumps that I can bring up in jvisualvm. One is an > 8Gb dump of where Derby had an OOM when doing a large query and one is just a > snapshot of the heap of a running system that I see about 121K instances of > "org.apache.derby.jdbc.XATransactionState$CancelXATransactionTask". > Jvisualvm is able to load these heap dumps but does not seem to have a good > way of outputting a report. > > I could do a screen capture and attach a JPEG, but I want to make sure this > is proper etiquette for the list before doing so. > > So any advice on this will be appreciated. > > Brett > > -----Original Message----- > From: Bergquist, Brett [mailto:[email protected]] > Sent: Tuesday, February 28, 2012 7:41 PM > To: [email protected] > Subject: RE: Had Derby 10.8.2.2 fail today and need some thoughts > > Hardware is Oracle M5000 running Solaris 10. Database engine is controlled > through SMF of Solaris. derby.properties is setup to always append to the > log, never truncate. There definitely is not more than one process starting > the Network Server. > > Here is another tidbit. Even though the log says the Derby engine was > shutting down, the process was still alive (prstat showed this) and the SMF > log also showed that the Network Server process never terminated. > > From a previous capture of the "derby.log", it also appeared this happened > on the Feb 24'th as well, but seemed to recover and continue running. > Strange? > > On an separate thought in case it has to do with an out of memory condition, > I was using jvisualvm the other day and monitoring the heap/garbage > collection on a test system that tries to mimic this setup (Oracle M3000). > I did a heap dump an noticed many objects relating to XA transaction timeout > timers. About 600K of these. They seemed to be owned by > java.util.TimerTask. I remember seeing some Derby issue updated yesterday > or the day before about java.util.TimerTask holding on to some things. I > bring this up because recently we added the property to derby.properties to > cancel XA transactions if they take to long. We probably do about 6M XA > transactions a day and I am wondering if maybe there is some sort of leak > here. I will try my setup and look again tomorrow using jvisualvm. > > I appreciate your thoughts. > > Brett > ________________________________________ > From: Mike Matrigali [[email protected]] > Sent: Tuesday, February 28, 2012 6:40 PM > To: [email protected] > Subject: Re: Had Derby 10.8.2.2 fail today and need some thoughts > > Don't know by more info might help. A catch 22 is that usually the top of > the derby.log usually has all the environment info that could be useful. > > Can you list OS, JVM. I/O handling to the log is likely an OS thing. What > properties do you have set, any special for error logging? > > I would likely look first for multiple starts and stops of derby and a > setting default for one of them to truncate the derby.log. > Maybe leading to multiple processes trying to write to same derby.log. > > /mikem > > Bergquist, Brett wrote: >> Our customer called and said the server was not working. Before >> restarting, I retrieved the derby.log and it seems strange because >> this was right at the top of the log: >> >> >> >> ---------------------------------------------------------------- >> >> Tue Feb 28 15:33:58 EST 2012: Shutting down Derby engine >> >> ---------------------------------------------------------------- >> >> Tue Feb 28 15:37:28 EST 2012 Thread[DRDAConnThread_245,5,main] (XID = >> 1482003981), (SESSIONID = 133257672), (DATABASE = csemdb), (DRDAID = >> ????????.??-4471791624540787385{335696}), Cleanup action starting >> >> Tue Feb 28 15:37:28 EST 2012 Thread[DRDAConnThread_245,5,main] (XID = >> 1482003981), (SESSIONID = 133257672), (DATABASE = csemdb), (DRDAID = >> ????????.??-4471791624540787385{335696}), Failed Statement is: null >> >> java.lang.NullPointerException >> >> Cleanup action completed >> >> >> >> The strange part is "Shutting down Derby engine" is the first thing in >> the log. The server was up and running since 2/16/2012 with no >> problems with about 60M transactions processed, a backup performed each >> night, etc. And then this. >> >> >> >> So any thoughts on how the derby.log could be recreated and have this as >> the first thing in the log? I am thinking maybe an OutOfMemory >> condition but all traces of whatever when wrong are gone, so I am >> trying to work backwards and see how the derby.log could be created >> with this as the first thing in the log. >> >> >> >> Thanks for any help! >> >> >> >> Brett >> >> >> >> >> > > > > > > >
