Swappiness is set to zero, but swap is not disabled altogether, so when RAM gets over-utilized the boxes will still start swapping.
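For completeness, this is the sort of thing I mean; the commands below are only an illustration, we have not actually turned swap off on these boxes:

  sysctl vm.swappiness    # shows the current value (0 in our case)
  swapoff -a              # would disable swap entirely until the next reboot
                          # (removing the swap entries from /etc/fstab makes that permanent)
  vmstat 5                # the si/so columns show whether swapping is actually going on

Watching si/so in vmstat during the next run should tell whether this is still biting us.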
I am running G1 with defaults (no max pause given) on JVM 1.6.0_21. When not swapping, it shows pauses of around 10s for full GCs on a 16GB heap, which do not happen very often, mostly during massive inserts. I have run a write-only job that was hitting our 8-node cluster from 15 reducers for six days continuously without anything dying. Non-full GCs never show multi-second pauses. I have yet to experience a hard crash / segfault. I wanted to look into the density of short pauses after enabling the WAL on the write loads and then possibly do some tuning to make it nicer. I did try setting the max pause before, but I found that it has a hard time actually honoring that guarantee. (There is a short note on the JVM flags involved at the bottom of this mail.)

Right now, however, I am in the unpleasant situation that my NN won't come up anymore after a restart (it throws an NPE), so I need to get that fixed first, without formatting, because I am not very keen on running the six-day job again. I did a restart of everything to make sure that anything that was swapped out before got back into memory, but I guess restarting the NN would have been better left for another time...

On 19 dec 2010, at 07:37, Stack wrote:

> Setting swap to zero is going to an extreme but changing it so it's not
> the default -- 60% IIRC -- could help.
>
> I made HBASE-3376 for the NPE. I'll take a look into it.
>
> Congrats on the 113-second G1 pause. I thought G1 was to be the end of
> such pauses? (Smile). It's good to hear that it's staying up for you.
> What do your JVM options look like for G1? Are you running with
> -XX:MaxGCPauseMillis?
>
> Good on you Friso,
> St.Ack
>
> On Sat, Dec 18, 2010 at 12:12 AM, Friso van Vollenhoven
> <[email protected]> wrote:
>> Hi J-D,
>>
>> I redid the job as before to once more verify. The real problem appears to
>> be something I had not noticed, because I never expected it. The machines
>> started swapping during the job. I did not expect that, because there is a
>> total of about 45GB heap allocated on a 64GB machine and nothing much else
>> running, so I had not thought of that immediately (although I should have
>> checked nonetheless). Network and IO graphs look normal. We run on 10 disks
>> per datanode, so there are some spindles available. I will try to get the
>> swapping out of the way and then try again and see if it fixes it.
>>
>> Now the GC log shows some obvious evildoers:
>> 2010-12-18T00:17:42.274+0000: 14275.374: [Full GC (System.gc()) 6913M->4722M(15742M), 113.1350880 secs]
>> [Times: user=6.99 sys=1.80, real=113.12 secs]
>>
>> A (partial) log of one of the RS is here:
>> GC log: http://pastebin.com/GPRVj8u5
>> RS log: http://pastebin.com/Vj5K26ss
>>
>> The GC is G1, so it may look different from what you expect from a GC log. I
>> know it is considered experimental, but I like the concept of it and think
>> it's nice to gain some experience with it.
>>
>> Also, in the RS log I see some NPEs:
>> 2010-12-18 04:31:24,043 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_RS_OPEN_REGION
>> java.lang.NullPointerException
>>   at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:75)
>>   at org.apache.hadoop.hbase.executor.RegionTransitionData.fromBytes(RegionTransitionData.java:198)
>>   at org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:672)
>>   at org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNodeOpening(ZKAssign.java:552)
>>   at org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNodeOpening(ZKAssign.java:545)
>>   at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.transitionZookeeperOfflineToOpening(OpenRegionHandler.java:208)
>>   at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:89)
>>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:151)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>   at java.lang.Thread.run(Thread.java:619)
>>
>> These I had not seen before. Any idea?
>>
>>
>> Friso
>>
>>
>> On 17 dec 2010, at 20:11, Jean-Daniel Cryans wrote:
>>
>> If it wasn't GC it would be core dumps and whatnot, nothing's free :)
>>
>> I will reiterate what I said in my first answer, I'd like to see your
>> GC log since at this point I haven't seen direct evidence of GC
>> activity.
>>
>> J-D
>>
>> On Fri, Dec 17, 2010 at 1:27 AM, Friso van Vollenhoven
>> <[email protected]> wrote:
>> Hi J-D,
>>
>> Thanks for your comments and clarification. I guess GC does blow (especially
>> when writing things like databases and filesystems).
>>
>> Right now I will dive into GC tuning once more and probably lower the number
>> of reducers on the insert jobs.
>>
>>
>> Thanks,
>> Friso
>>
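
PS (the note referred to above): Stack asked what the JVM options look like. We run G1 with defaults, so the GC-related flags come down to roughly the following. The heap size matches our setup; the logging flags and log path are just what I typically use, so treat this as an illustration rather than a recipe:

  -Xmx16g
  -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC   (IIRC 1.6 still wants the unlock flag for G1)
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<path to gc log>

The 113s pause in the GC log above is a full GC triggered by an explicit System.gc() call. One thing I may try is -XX:+DisableExplicitGC, which turns such calls into no-ops, although I understand it can get in the way of NIO direct buffer cleanup, so I see that as an experiment rather than a fix.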
