JBoss [Zion] 4.0.3SP1 (build: CVSTag=JBoss_4_0_3_SP1 date=200510231054)
ClockDaemon is used by org.jboss.mq.Connection to periodically run
Connection$PingTask. ClockDaemon holds a Heap that contains references to
PingTask instances. Normally, the PingTask instances are removed from the
ClockDaemon's Heap by ClockDaemon$RunLoop, which contains an infinite loop that
runs each PingTask.
Occasionally, after running my web application for 20-30 hours, I start to see
a memory leak that is caused by references to objects in the ClockDaemon's
Heap.
My application has a scheduler that issues a JMS event every 30 seconds, and
each one instantiates a new Connection. The constructor for Connection calls
startPingThread(), which runs
| clockDaemon.executePeriodically(pingPeriod, new PingTask(), true);
This method call inserts a new ClockDaemon$TaskNode on the ClockDaemon's Heap.
When I put my debugger on the leaking application, I see that
ClockDaemon$RunLoop is hung on the following line:
| task.command.run();
The task is a TaskNode instance with a command that is a PingTask instance. As
a result, the loop is stopped, so none of the objects can be removed from the
ClockDaemon's Heap.
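The stall described above can be reproduced in miniature. The following is only a sketch, not ClockDaemon's actual code: it uses java.util.concurrent classes as stand-ins, and the names StalledRunLoopDemo, runDemo, zeroPermits and blockerStarted are made up for illustration. A single thread drains a queue of tasks; once one task blocks on a semaphore with zero permits, every task queued after it stays referenced by the queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for a single-threaded task loop such as ClockDaemon$RunLoop.
class StalledRunLoopDemo {

    /** Queues one blocking task plus extraTasks no-op tasks; returns how many are still queued. */
    static int runDemo(int extraTasks) {
        BlockingQueue<Runnable> pending = new LinkedBlockingQueue<>();
        Semaphore zeroPermits = new Semaphore(0);    // plays the role of pingTaskSemaphore
        Semaphore blockerStarted = new Semaphore(0); // signals that the blocking task is running

        // The run loop: takes one task at a time and runs it on this thread.
        Thread runLoop = new Thread(() -> {
            try {
                while (true) {
                    pending.take().run();
                }
            } catch (InterruptedException e) {
                // interrupted while waiting for work: leave the loop
            }
        });
        runLoop.setDaemon(true);
        runLoop.start();

        // First task: blocks forever because nobody releases a permit.
        pending.add(() -> {
            blockerStarted.release();
            try {
                zeroPermits.acquire();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the flag so the loop exits too
            }
        });
        blockerStarted.acquireUninterruptibly(); // wait until the loop is inside the blocker

        // Later tasks pile up behind the blocked one and are never run.
        for (int i = 0; i < extraTasks; i++) {
            pending.add(() -> { });
        }
        int stillQueued = pending.size();

        runLoop.interrupt(); // clean up the demo thread
        return stillQueued;
    }

    public static void main(String[] args) {
        System.out.println("tasks stuck behind the blocked one: " + runDemo(5));
    }
}
```

In this sketch all five later tasks remain queued, which mirrors the TaskNode instances that accumulate on the Heap while the run loop is stuck.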
Drilling down into the PingTask.run() method, I see that it is hung at the
following line:
| pingTaskSemaphore.acquire();
My debugger shows that pingTaskSemaphore has zero permits, so the acquire()
method will block until another thread calls pingTaskSemaphore.release(). But
apparently release() is never called.
It seems to me that this should never be allowed to happen, because a low
priority ping task is effectively hijacking the ClockDaemon's loop that
dereferences objects. Once the application reaches this state, the Heap keeps
growing and the JVM ultimately fails with an OutOfMemoryError.
I don't immediately see why pingTaskSemaphore.release() is never called, except
that pingTaskSemaphore.acquire() is not in the same try-finally block. The
PingTask.run() method is below:
| /**
| * The ping task
| */
| class PingTask implements Runnable
| {
| /**
| * Main processing method for the PingTask object
| */
| public void run()
| {
| try
| {
| pingTaskSemaphore.acquire();
| }
| catch (InterruptedException e)
| {
| log.debug("Interrupted requesting ping semaphore");
| return;
| }
| try
| {
| if (ponged == false)
| {
| // Server did not pong use with in the timeout
| // period.. Assuming the connection is dead.
| throw new SpyJMSException("No pong received", new IOException("ping timeout."));
| }
|
| ponged = false;
| pingServer(System.currentTimeMillis());
| }
| catch (Throwable t)
| {
| asynchFailure("Unexpected ping failure", t);
| }
| finally
| {
| pingTaskSemaphore.release();
| }
| }
| }
|
Notice that pingTaskSemaphore.release() is in a finally block, but that block
belongs to the second try, not to the try that wraps
pingTaskSemaphore.acquire(). That means it is possible for
pingTaskSemaphore.acquire() to decrement the Semaphore's permits and then throw
an exception that is not InterruptedException. In that case, the finally block
would never be executed because the exception would propagate out of the first
try block, and the permit would be lost. However, this last paragraph is
speculation.
I don't really know what is happening. I also don't know why it takes 20-30
hours of operation for this problem to appear. I don't have a reproducible
test case because I can't consistently reproduce this problem.
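One way the ping task could avoid holding up the loop indefinitely is to wait for the permit with a timeout, and to keep the acquire and the release inside a single try/finally guarded by a flag. The sketch below is not a patch for org.jboss.mq.Connection: it assumes java.util.concurrent.Semaphore instead of the semaphore class JBoss 4.0.3 actually uses, and BoundedPingTask, timeoutMillis, skipped() and pinged() are hypothetical names.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical ping task that gives up instead of blocking the scheduler thread forever.
class BoundedPingTask implements Runnable {
    private final Semaphore pingSemaphore;
    private final long timeoutMillis;
    private volatile boolean skipped = false;
    private volatile boolean pinged = false;

    BoundedPingTask(Semaphore pingSemaphore, long timeoutMillis) {
        this.pingSemaphore = pingSemaphore;
        this.timeoutMillis = timeoutMillis;
    }

    public void run() {
        boolean acquired = false;
        try {
            // Wait a bounded time for the permit rather than calling acquire().
            acquired = pingSemaphore.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
            if (!acquired) {
                skipped = true; // no permit in time: skip this ping and return to the loop
                return;
            }
            pinged = true; // placeholder for the real ping logic
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        } finally {
            // Release only if this invocation really obtained the permit.
            if (acquired) {
                pingSemaphore.release();
            }
        }
    }

    boolean skipped() { return skipped; }

    boolean pinged() { return pinged; }

    public static void main(String[] args) {
        BoundedPingTask task = new BoundedPingTask(new Semaphore(0), 50);
        task.run(); // returns after roughly 50 ms instead of hanging
        System.out.println("skipped=" + task.skipped() + " pinged=" + task.pinged());
    }
}
```

With a zero-permit semaphore, run() returns after the timeout and the run loop can move on to the next task. Whether skipping a ping is acceptable for the connection's failure detection is a separate question that this sketch does not answer.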
View the original post :
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3992141#3992141