On Sat, Aug 14, 2010 at 12:21 PM, Sean Bigdatafun <[email protected]> wrote:
> On Sat, Aug 14, 2010 at 11:36 AM, Stack <[email protected]> wrote:
>> You don't answer my questions above.
>>
> I ran 6GB now, but I really wonder why I should not run 12GB. (People said
> if I have too much heapsize, GC will kill me)
>
The bigger the heap, the longer it takes to sweep. So, what have you
experienced? A long GC pause that caused your RegionServers to shut down?
Are you using default GC settings? Have you played with them at all? Have
you had a look at http://wiki.apache.org/hadoop/PerformanceTuning, or seen
what others on this list have reported lately about the GC configs that
work for their hardware/loading? Are you swapping at all? What's your
loading like? All writes? A mix? Realtime? Analytics? What's your hardware
like?

GC is a pain. It's the bane of all java apps, and HBase seems to be
particularly taxing of GC algorithms.

> It would be really helpful if we can get a detail suggestion on the heapsize
> configuration. Theoretically, a) the more heap size I configure, the more
> data I can hold in memory; b) but does GC pause insanely long on a large
> heap size JVM?

Sorry, we (or should I say java) don't have one answer that fits all
deploys/hardware/loadings. Yes, a) above is true -- it means you can cache
more, so you'll get better read performance. And b) can happen, for sure;
that's the JVM for you. See the cited, if a little stale, link above and
what others have posted on mitigating b). Ongoing, there is HBASE-2902,
wherein we hope to improve our default shipping GC config, and a few of
the lads are deep-diving on this stuff. We'll report back if they find
any pearls.

Yours,
St.Ack
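P.S. For illustration only -- not a recipe, since no one config fits all
deploys -- below is the general shape of a conf/hbase-env.sh GC setup
along the lines of what folks on this list have reported. The occupancy
fraction and log path are placeholder values; tune for your own
hardware/loading:

  # conf/hbase-env.sh -- illustrative numbers only; tune for your deploy
  export HBASE_HEAPSIZE=6000   # heap in MB; remember, bigger heap == longer sweeps

  # CMS collects the old gen concurrently so stop-the-world full GCs are
  # rarer; ParNew keeps young-gen pauses short. The occupancy fraction
  # starts CMS at 70% old-gen fill rather than waiting until it is nearly
  # full. The -verbose:gc/-Xloggc flags write pause times to a log.
  export HBASE_OPTS="-XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
    -XX:CMSInitiatingOccupancyFraction=70 \
    -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
    -Xloggc:/tmp/hbase-gc.log"

Turn on the GC logging first; that tells you whether you actually have a
pause problem before you start twiddling collector flags.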
