Thanks a lot, Todd. Your conclusion confirms the feeling I had about large heaps with HBase.

On Thu, Jun 13, 2013 at 7:30 AM, Todd Lipcon <[email protected]> wrote:

> Hey Nicolas,
>
> I've corresponded with that guy a few times in the past -- back when I
> was attempting to hack some patches into G1 for better performance on
> HBase. The end result of that investigation was the MSLAB feature
> which made it into 0.90.x.
>
> The main thing I learned about GC is that big heaps aren't in
> themselves problematic -- they don't tend to make young gen pauses
> take longer. The only problem is if you eventually hit a
> stop-the-world CMS pause, the size of the heap linearly affects the
> length of the pause. So, the trick is avoiding stop-the-world CMS.
>
> In order to avoid that, you need to do a few things:
> - make sure you don't have any short-lived super-large objects: when
> large objects are promoted from the young generation, they need to
> find contiguous space in the old gen. If you allocate, say, a 400MB
> array, even if it's short lived, it's unlikely you'll find 400MB of
> contiguous space in the old gen without defragmenting. This will cause
> a STW pause.
>
> If you have some super-large objects allocated at startup, that's OK,
> they'll just park themselves in the old gen and not cause trouble.
>
> - make sure that most of your objects are "around the same size". This
> prevents fragmentation build-up in the old gen.
>
> - move big memory consumers off-heap if possible
>
> We've done a pretty good job of the above so far, and with a bit more
> careful analysis I think it's possible to fully avoid old-gen STW
> pauses.
>
> -Todd
>
>
> On Wed, Jun 12, 2013 at 8:35 PM, Nicolas Liochon <[email protected]>
> wrote:
>
> > Hi there,
> >
> > During the hackathon I had some discussions around GC on large heaps.
> >
> > This guy, who seems to know what he is talking about, and had a patch
> > accepted in hotspot jdk, said in 2011 that he's got a configuration
> > working reasonably well with large heaps at that time:
> >
> > "I was able to keep GC pause on 32Gb Oracle Coherence storage node below
> > 150ms on 8 core server."
> >
> > (in http://java.dzone.com/articles/how-tame-java-gc-pauses)
> >
> > There is a lot of stuff in his blog, some of it in Russian only, but at
> > least one of us will understand it.
> >
> > http://blog.ragozin.info/2011/07/openjdk-patch-cutting-down-gc-pause.html
> > http://fr.slideshare.net/aragozin/garbage-collection-in-jvm-mailru-fin
> >
> > Cheers,
> >
> > Nicolas
> >
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
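
For illustration only, here is a rough Java sketch of how the three points in Todd's list could be combined: instead of one very large array, the data is split into equal-sized chunks, and those chunks can optionally live off-heap. The class name ChunkedBuffer, its methods, and the chunk size are all made up for this example; this is not the MSLAB code and not anything taken from HBase.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of HBase): stores a large byte range as many
// equal-sized chunks rather than one huge contiguous array.
final class ChunkedBuffer {
    private final int chunkSize;
    private final long capacity;
    private final List<ByteBuffer> chunks = new ArrayList<>();

    ChunkedBuffer(long capacity, int chunkSize, boolean offHeap) {
        if (capacity < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("capacity must be >= 0 and chunkSize > 0");
        }
        this.capacity = capacity;
        this.chunkSize = chunkSize;
        // Round up so the last partial chunk is still a full-size allocation:
        // every chunk has the same size, whatever the requested capacity.
        long count = (capacity + chunkSize - 1) / chunkSize;
        for (long i = 0; i < count; i++) {
            // allocateDirect places the bytes outside the Java heap;
            // allocate wraps an ordinary on-heap byte[].
            chunks.add(offHeap ? ByteBuffer.allocateDirect(chunkSize)
                               : ByteBuffer.allocate(chunkSize));
        }
    }

    void put(long index, byte value) {
        checkIndex(index);
        chunks.get((int) (index / chunkSize)).put((int) (index % chunkSize), value);
    }

    byte get(long index) {
        checkIndex(index);
        return chunks.get((int) (index / chunkSize)).get((int) (index % chunkSize));
    }

    int chunkCount() {
        return chunks.size();
    }

    long capacity() {
        return capacity;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= capacity) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }
}
```

With something like new ChunkedBuffer(400L * 1024 * 1024, 2 * 1024 * 1024, false), the 400MB case from the list becomes 200 allocations of 2MB each, so no single 400MB contiguous region is ever requested. Whether that actually avoids a stop-the-world pause on a given heap is something to measure, not assume.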
