What JVM are you using, and what GC strategy with which options? And for that matter, what broker version?
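If it helps to gather those details, here is a minimal sketch that prints the JVM version, the active collectors, and the startup options using only the standard `java.lang.management` API. The class name `JvmGcInfo` is made up for illustration. Note that it reports on the JVM it runs in, so it would need to run inside the broker's JVM (or be adapted to a remote JMX connection) to describe the broker itself.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class (not part of ActiveMQ): summarizes the current JVM.
public class JvmGcInfo {

    /** Returns the JVM version, collector names, and startup options as printable lines. */
    public static List<String> describe() {
        List<String> lines = new ArrayList<>();
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();

        // JVM vendor build plus the Java version string (for example 1.7.0_21).
        lines.add("JVM: " + runtime.getVmName() + " " + runtime.getVmVersion()
                + " (java.version=" + System.getProperty("java.version") + ")");

        // One MXBean per collector; the names reveal which GC strategy is active.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add("Collector: " + gc.getName()
                    + ", collections=" + gc.getCollectionCount()
                    + ", total time ms=" + gc.getCollectionTime());
        }

        // Options passed on the command line, such as -XX:+UseG1GC or -Xmx.
        for (String arg : runtime.getInputArguments()) {
            lines.add("JVM option: " + arg);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe()) {
            System.out.println(line);
        }
    }
}
```

The collector names and the `-XX` options in the output are usually enough to tell which GC strategy is in effect.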
With Hotspot 7u21 and G1GC, during a long-running performance stress test I observed that Old Gen use increases over time (despite the fact that G1GC is supposed to collect Old Gen during its normal collection operations), and that GCs against Old Gen happen semi-continually once Old Gen hits a certain memory threshold. However, unlike what you're observing, 1) the GCs I saw were Old Gen GCs but not full GCs (G1 allows GCing Old Gen during incremental GCs), 2) the broker remained responsive, with reasonable pause times close to my target, and 3) once Old Gen hit the 90% threshold that forces a full GC, that full GC was able to collect nearly all of the Old Gen memory.

My conclusion was that although objects were being promoted to Old Gen (and I tried unsuccessfully to prevent that from occurring, see http://activemq.2283324.n4.nabble.com/Potential-Bug-in-Master-Slave-with-Replicated-LevelDB-Store-td4686450.html), nearly all of them were unreachable by the time a full GC actually occurred.

So if you're seeing continual full GCs (not just Old Gen GCs, if you're using G1) that don't actually free any Old Gen memory, then you're seeing different behavior than I saw, and it means that the objects in Old Gen are still reachable. One possible reason for that would be messages still being held in destinations waiting to be consumed; look for queues without consumers (especially DLQs), as well as durable subscribers that are offline.

If you're certain that's not the case, maybe you can post some of the results of analyzing the heap snapshot so that people who know the codebase better could see if anything jumps out?

On Sat, Dec 20, 2014 at 1:51 PM, Kevin Burton <bur...@spinn3r.com> wrote:
> I’m trying to diagnose a long term memory leak with ActiveMQ.
>
> Basically, my app runs fine for about a week or so, then goes to 100% CPU
> doing continual full GCs back to back.
>
> No work is done during that period.
>
> I have a large number of sessions to the AMQ box, but things are fine on
> startup.
>
> It’s entirely possible that my app isn’t releasing resources, but I’m trying
> to figure out the best way to track that down.
>
> I’m using org.apache.activemq.UseDedicatedTaskRunner=false so that thread
> pools are used. Which apparently can cause a bit of wasted memory.
>
> I have a heap snapshot. I loaded that into the Eclipse Memory Analyzer and
> didn’t see any obvious candidates but of course I’m not an expert on the
> ActiveMQ code base.
>
> Are there any solid JMX counters I can track during this process? Number
> of sessions? etc.
>
> --
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> <https://plus.google.com/102718274791889610666/posts>
> <http://spinn3r.com>
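
On the question of JMX counters, and the suggestion above to look for queues without consumers: a rough sketch of that check over JMX follows. It rests on assumptions that should be verified against your broker version: the `ObjectName` pattern is the naming scheme used by ActiveMQ 5.8 and later (older brokers use different key names), and `QueueSize` and `ConsumerCount` are the destination attributes being read. The class and method names are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Hypothetical helper class (not part of ActiveMQ): lists queues that hold
// messages but currently have no consumers attached.
public class StuckQueueCheck {

    // ASSUMPTION: ActiveMQ 5.8+ MBean naming; adjust for older brokers.
    static final String QUEUE_PATTERN =
            "org.apache.activemq:type=Broker,brokerName=*,destinationType=Queue,destinationName=*";

    /** Returns "queueName pending=N" for every queue with messages and zero consumers. */
    public static List<String> findQueuesWithoutConsumers(MBeanServerConnection conn) {
        List<String> stuck = new ArrayList<>();
        try {
            for (ObjectName name : conn.queryNames(new ObjectName(QUEUE_PATTERN), null)) {
                // ASSUMPTION: both attributes are numeric on the queue MBean.
                long pending = ((Number) conn.getAttribute(name, "QueueSize")).longValue();
                long consumers = ((Number) conn.getAttribute(name, "ConsumerCount")).longValue();
                if (pending > 0 && consumers == 0) {
                    stuck.add(name.getKeyProperty("destinationName") + " pending=" + pending);
                }
            }
        } catch (Exception e) {
            // JMX calls throw several checked exceptions; wrap them for brevity.
            throw new IllegalStateException("JMX query failed", e);
        }
        return stuck;
    }

    public static void main(String[] args) {
        // The platform MBean server only sees a broker embedded in this JVM;
        // for a standalone broker, pass a remote MBeanServerConnection instead.
        for (String line : findQueuesWithoutConsumers(ManagementFactory.getPlatformMBeanServer())) {
            System.out.println(line);
        }
    }
}
```

Running this periodically and watching whether the pending counts only ever grow would show whether undelivered messages are what is keeping Old Gen full. It does not cover offline durable subscribers, which would need a similar query against the subscription MBeans.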