Re: Broker Leak

Jerry Cwiklik Thu, 13 Dec 2012 09:18:30 -0800

Christian, after much pain and suffering I finally figured out what is going
on. Our system is quite complicated and involves many producers that send
large messages (600K-1.5M) to relatively few multi-threaded consumers
(services) which run "forever". The producers are transient and can be
killed by our custom job scheduler at any time via kill -9 to make room for
other producers. We run the broker with a 10G heap.


The consumer is coded to group and cache Sessions with a Connection which
has an inactivity timer associated with it. Every time a message is sent,
the timer is restarted. If the timer pops (default 30 minutes), the Sessions,
MessageProducers and the Connection are closed due to inactivity.

This worked perfectly fine until about 4 weeks ago when we started
experiencing broker OOM problems. While the broker was running we could see a
steady (fast) rise in the heap usage in JConsole. After a couple of days
the broker's JVM would OOM.

The problem started happening when we introduced pingers for the Consumers.
Every minute a pinger sends a message to a Consumer to make sure it's alive.
The Consumer replies to the pinger request and restarts the inactivity timer.
It took me a while to see the bug in our application, but eventually I
determined that our timer behaves incorrectly because it is associated with a
Connection, not with individual Sessions. Since the pings keep restarting the
Connection's timer, it never pops and the Sessions are never closed. The
Sessions go stale when a producer gets killed, and any messages in the broker
referenced by the ProducerExchange object are retained indefinitely, causing
a leak in the broker. As you explained it to me, the broker uses a lazy
approach to cleanup, meaning it cleans up only when a new message arrives
from the Producer. In our case, the Producer never sends anything and thus no
cleanup is ever done.

The fix for this is to record a timestamp for every Session marking when it
was last used to send a message to the broker. At fixed intervals a Session
Reaper thread wakes up and checks the timestamp of every Session to determine
whether it has been inactive for longer than the maximum allowed time and, if
so, closes it.
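
A rough sketch of what such a reaper could look like is below. This is not
our actual code: the class name SessionReaper and the methods touch(), reap()
and start() are made up for illustration, and AutoCloseable stands in for
javax.jms.Session so the example compiles without a JMS jar. In a real
service touch() would be called right after each send on that Session.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical per-Session idle reaper; AutoCloseable stands in for a JMS Session. */
public class SessionReaper {
    // Last-used timestamp (millis) for every cached session.
    private final Map<AutoCloseable, Long> lastUsed = new ConcurrentHashMap<>();
    private final long maxIdleMillis;
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "session-reaper");
            t.setDaemon(true); // do not keep the JVM alive just for the reaper
            return t;
        });

    public SessionReaper(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Record that the session was just used to send a message. */
    public void touch(AutoCloseable session) {
        lastUsed.put(session, System.currentTimeMillis());
    }

    /** Wake up at a fixed interval and close sessions that have gone idle. */
    public void start(long intervalMillis) {
        timer.scheduleAtFixedRate(() -> reap(System.currentTimeMillis()),
            intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Close every session idle for at least maxIdleMillis; returns how many were closed. */
    public int reap(long now) {
        int closed = 0;
        for (Map.Entry<AutoCloseable, Long> e : lastUsed.entrySet()) {
            boolean idle = now - e.getValue() >= maxIdleMillis;
            // remove(key, value) fails if touch() updated the timestamp in the
            // meantime, so a session that was just used is not closed.
            if (idle && lastUsed.remove(e.getKey(), e.getValue())) {
                try {
                    e.getKey().close();
                } catch (Exception ex) {
                    // A failed close should not stop the sweep; log it in real code.
                }
                closed++;
            }
        }
        return closed;
    }

    public void stop() {
        timer.shutdownNow();
    }
}
```

The point of the design is that the timestamp lives with the Session rather
than the Connection, so a pinger keeping one Session busy cannot keep the
stale ones alive.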

So the problem was caused by an application bug and the fact that the broker
takes a lazy approach to cleanup. As a side note, under the described
scenario, I've noticed that the broker memory usage (shown in JConsole)
indicated 0 even though there were a ton of messages in the heap with valid
references (held by ProducerExchange).

Thanks Christian for your help

-Jerry C
