Yeah, Parallel is still the default collector even in Java 8, as far as I know, so the CMS concern sounds like a non-issue.
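Still, since those JMX monitors are going in anyway, it's worth confirming what the broker JVM is actually running rather than assuming the default, and folding GC state into the same dashboards. The platform GC MXBeans report it directly, and the same data is visible remotely under the java.lang:type=GarbageCollector,name=* MBeans. A minimal sketch (plain JDK, nothing ActiveMQ-specific; the collector names in the comment are the usual HotSpot ones):

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class WhichGc {
    public static void main(String[] args) {
        // HotSpot reports Parallel as "PS Scavenge" / "PS MarkSweep",
        // CMS as "ParNew" / "ConcurrentMarkSweep",
        // and G1 as "G1 Young Generation" / "G1 Old Generation".
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: collections=%d, time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}

Run it inside the broker JVM (or read the same CollectionCount/CollectionTime attributes remotely) and graph the deltas; the back-to-back full GC pattern should be obvious when it starts. There's a second sketch below the quoted thread for spotting queues that are holding messages with no consumers, since that falls out of the same monitoring.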
On Tue, Dec 23, 2014 at 4:15 PM, Kevin Burton <bur...@spinn3r.com> wrote:

> The default GC, which I think is still Parallel. But I should explicitly set it. I’m working on getting JMX monitors up so that I can track ActiveMQ counters but also GC state.
>
> I can’t get a reliable, short-term failure to occur, so right now I’ve focused on mitigating the issue and easily rebuilding my queue when it crashes.
>
> On Tue, Dec 23, 2014 at 8:55 AM, Tim Bain <tb...@alumni.duke.edu> wrote:
>
> > BTW, you never did answer the question about which GC strategy you're using, and it occurred to me that if you're using CMS, lots of full GCs that don't actually reclaim much memory after a long time being up is the classic failure scenario for CMS. It happens when Old Gen gets fragmented, which in turn happens because CMS is a non-compacting GC strategy in Old Gen. If you're using CMS and seeing continual full GCs, you should look at whether G1 GC would be better for your needs.
> >
> > On Mon, Dec 22, 2014 at 11:03 PM, Kevin Burton <bur...@spinn3r.com> wrote:
> >
> > > Great feedback. Thanks, btw. I’m working on getting better JMX monitors up so I can track memory here more aggressively. Bumping up memory by 1.5G temporarily fixed the problem. However, it seems correlated to the number of connections, so I suspect I’ll just hit this again in the next few weeks.
> > >
> > > By that time I plan to have better JMX monitors in place to resolve this.
> > >
> > > On Sat, Dec 20, 2014 at 10:28 PM, Tim Bain <tb...@alumni.duke.edu> wrote:
> > >
> > > > What JVM are you using, and what GC strategy with which options? And for that matter, what broker version?
> > > >
> > > > With Hotspot 7u21 and G1GC while running a long-running performance stress test, I've observed that Old Gen use increases over time (despite the fact that G1GC is supposed to collect Old Gen during its normal collection operations), and GCs against Old Gen happen semi-continually after Old Gen hits a certain memory threshold. However, unlike what you're observing, 1) the GCs I saw were Old Gen GCs but not full GCs (G1 allows GCing Old Gen during incremental GCs), 2) the broker remains responsive with reasonable pause times close to my target, and 3) once Old Gen hits the 90% threshold that forces a full GC, that full GC is able to successfully collect nearly all of the Old Gen memory. My conclusion from that was that although objects were being promoted to Old Gen (and I tried unsuccessfully to prevent that from occurring, see http://activemq.2283324.n4.nabble.com/Potential-Bug-in-Master-Slave-with-Replicated-LevelDB-Store-td4686450.html), nearly all of them were unreachable by the time a full GC actually occurred.
> > > >
> > > > So if you're seeing continual full GCs (not just Old Gen GCs, if you're using G1) that don't actually free any Old Gen memory, then you're seeing different behavior than I saw, and it means that the objects in Old Gen are still reachable. One possible reason for that would be messages still being held in destinations waiting to be consumed; look for queues without consumers (especially DLQs), as well as durable subscribers that are offline.
> > > > If you're certain that's not the case, maybe you can post some of the results of analyzing the heap snapshot so that people who know the codebase better could see if anything jumps out?
> > > >
> > > > On Sat, Dec 20, 2014 at 1:51 PM, Kevin Burton <bur...@spinn3r.com> wrote:
> > > >
> > > > > I’m trying to diagnose a long-term memory leak with ActiveMQ.
> > > > >
> > > > > Basically, my app runs fine for about a week or so, then goes to 100% CPU doing continual full GCs back to back.
> > > > >
> > > > > No work is done during that period.
> > > > >
> > > > > I have a large number of sessions to the AMQ box, but things are fine on startup.
> > > > >
> > > > > It’s entirely possible that my app isn’t releasing resources, but I’m trying to figure out the best way to track that down.
> > > > >
> > > > > I’m using org.apache.activemq.UseDedicatedTaskRunner=false so that thread pools are used, which apparently can cause a bit of wasted memory.
> > > > >
> > > > > I have a heap snapshot. I loaded that into the Eclipse Memory Analyzer and didn’t see any obvious candidates, but of course I’m not an expert on the ActiveMQ code base.
> > > > >
> > > > > Are there any solid JMX counters I can track during this process? Number of sessions? etc.
>
> --
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> <https://plus.google.com/102718274791889610666/posts>
> <http://spinn3r.com>
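P.S. Here's the second sketch mentioned above: a remote JMX poll that flags queues holding messages with no consumers attached (the default DLQ shows up here too), which is the scenario described earlier in the thread where broker-held messages keep Old Gen reachable. The object-name pattern assumes the 5.8+ org.apache.activemq:type=Broker,... naming, a broker named "localhost", and the default JMX connector URL; all three are assumptions, so adjust for your broker version and config (pre-5.8 brokers use the BrokerName=...,Type=Queue,Destination=... style names instead).

import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class StuckDestinations {
    public static void main(String[] args) throws Exception {
        // Default ActiveMQ JMX connector URL -- an assumption; adjust host/port.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection conn = jmxc.getMBeanServerConnection();
            // All queue MBeans on the broker, DLQs included (5.8+ naming assumed).
            ObjectName pattern = new ObjectName(
                    "org.apache.activemq:type=Broker,brokerName=localhost,"
                    + "destinationType=Queue,destinationName=*");
            Set<ObjectName> queues = conn.queryNames(pattern, null);
            for (ObjectName q : queues) {
                long size = (Long) conn.getAttribute(q, "QueueSize");
                long consumers = (Long) conn.getAttribute(q, "ConsumerCount");
                // Messages sitting in a destination with nobody consuming them
                // stay reachable, and so does everything they reference.
                if (size > 0 && consumers == 0) {
                    System.out.printf("%s: QueueSize=%d, ConsumerCount=%d%n",
                            q.getKeyProperty("destinationName"), size, consumers);
                }
            }
        } finally {
            jmxc.close();
        }
    }
}

As for counters worth tracking over time: the broker MBean itself (org.apache.activemq:type=Broker,brokerName=...) exposes TotalMessageCount, TotalConsumerCount, and MemoryPercentUsage, and if I remember right an InactiveDurableTopicSubscribers attribute that covers the offline-durable-subscriber case. Graphing those alongside the GC counters should show whether broker-held messages track the heap growth.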