On Mon, Jan 3, 2011 at 9:40 AM, Stack <[email protected]> wrote:

> zookeeper.session.timeout is the config. to toggle. It's set to
> 180 seconds in 0.90.0RC. Is it not so in your deploy?
>
> On Mon, Jan 3, 2011 at 5:13 AM, Wayne <[email protected]> wrote:
> >
> > Any help or suggestions would be appreciated. ParNew was getting
> > large and taking too long (> 100ms) so I will try to limit the size
> > with the suggestion from the performance tuning page
> > (-XX:NewSize=6m -XX:MaxNewSize=6m).
> >
>
> The CMS concurrent mode failure will be about trying to promote from
> new space up into the tenured heap, but there's not the space in
> tenured heap to take the promotion because of fragmentation. You
> could try putting an upper bound on the new size (What size had your
> eden space grown to?). That would put off the CMF some, but in a
> long-running app, CMF seems unavoidable, yeah.
>
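On the eden-size question, here is a minimal sketch of one way to see how large each heap pool has grown, using the standard java.lang.management beans. The class name HeapPoolReport and its helper methods are made up for illustration and are not part of HBase. Pool names depend on the collector in use; with ParNew + CMS I'd expect something like "Par Eden Space" and "CMS Old Gen", but treat that as an assumption.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from HBase): reports the size of each heap pool
// in the JVM it runs in.
public class HeapPoolReport {

    // Format a byte count as whole megabytes; the JVM reports -1 when a
    // limit is undefined.
    static String mb(long bytes) {
        return bytes < 0 ? "undefined" : (bytes / (1024 * 1024)) + "m";
    }

    // One line per heap pool (eden, survivor, tenured): used, committed, max.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip perm gen, code cache, etc.
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            lines.add(String.format("%s: used=%s committed=%s max=%s",
                    pool.getName(), mb(usage.getUsed()),
                    mb(usage.getCommitted()), mb(usage.getMax())));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

This only reports on the JVM it runs in; for a live region server you would read the same memory pool beans over JMX instead. The "committed" figure for the eden pool is the one to compare against whatever -XX:NewSize / -XX:MaxNewSize you settle on.
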
Still working on this one on a background thread over here, bugging
the HotSpot guys :) I think our best bet is going to be basically doing
a slow rolling full GC in the cluster: if we can detect when the heap
is fragmented, we can shed regions gracefully, do a GC, then pick them
back up. Detecting the fragmentation is possible from within the JVM
source code, but I can't quite figure out how to expose it.

> A newsize of 6M is way too small given the heap sizes you've been
> bandy'ing about (You were thinking 64M? Even then, that seems too
> small).
>

+1. I'd recommend at least a 64m new size. If reasonably frequent
200-300ms pauses are acceptable, go to 128m or larger. You can also
tune SurvivorRatio down and use a larger new size for some workloads,
but it's a little messy to figure this out.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
