Strange machine behavior

Robert Dyer Sat, 08 Dec 2012 16:09:32 -0800

Has anyone experienced a TaskTracker/DataNode behaving like the attached
image?


This was during a MapReduce job (one that runs often).  Note the extremely
high System CPU time.  Upon investigating I saw that, out of 64GB of RAM,
the system had allocated almost 45GB to cache!

I did a sudo sh -c "sync ; echo 3 > /proc/sys/vm/drop_caches ; sync" which
is roughly where the graph goes back to normal (much lower System, much
higher User).
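
To put a number on how much RAM the cache holds before dropping it, one
option is to read /proc/meminfo.  The following is only a minimal sketch:
the helper names parse_meminfo and cache_fraction are made up for this
example, and the SAMPLE values are illustrative figures rounded from the
numbers above (64GB total, about 45GB cached), not captured output.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def cache_fraction(info):
    """Fraction of MemTotal held by Cached + Buffers."""
    cached = info.get("Cached", 0) + info.get("Buffers", 0)
    return cached / info["MemTotal"]


# Illustrative values only (assumption): 64GB total, 45GB cached, in kB.
SAMPLE = """MemTotal:       67108864 kB
Cached:         47185920 kB
"""

info = parse_meminfo(SAMPLE)
print("cache share of RAM: %.1f%%" % (100 * cache_fraction(info)))
```

On a live node the same functions can be fed the contents of
/proc/meminfo instead of SAMPLE.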

This has happened a few times.

I have tried lowering the sysctl vm.swappiness value (default of 60), first
to 30 (which it was at when the graph was collected) and now to 10.  I am
not sure that helps.
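
To confirm which swappiness value is actually in effect on a node, the
setting can be read back from /proc/sys, where the sysctl name
vm.swappiness corresponds to the file vm/swappiness.  A small sketch,
assuming a Linux host; read_sysctl_int is a hypothetical helper name, and
the root parameter exists only so the function can be pointed at a test
directory.

```python
import os


def read_sysctl_int(name, root="/proc/sys"):
    """Read an integer sysctl such as 'vm.swappiness' from its /proc/sys file."""
    # Dots in the sysctl name map to directory separators under /proc/sys.
    path = os.path.join(root, *name.split("."))
    with open(path) as f:
        return int(f.read().strip())
```

Calling read_sysctl_int("vm.swappiness") on the machine returns the live
value.  Note that a value set with sysctl -w lasts only until reboot unless
it is also added to /etc/sysctl.conf.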

Any ideas?  Anyone else run into this before?

24 cores
64GB ram
4x2TB sata3 hdd

Running Hadoop 1.0.4, with a DataNode (2gb heap), TaskTracker (2gb heap) on
this machine.

24 map slots (1gb heap each), no reducers.

Also running HBase 0.94.2 with a RS (8gb ram) on this machine.
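
For reference, the heap sizes listed above can be added up as a rough
budget.  This is only a back-of-the-envelope sketch: it assumes all 24 map
slots are occupied at once, treats the RegionServer's 8gb as heap, and
counts JVM heap only, so real process memory will be somewhat higher.

```python
RAM_GB = 64

# Configured heap per component, in GB, taken from the figures above.
heaps_gb = {
    "DataNode": 2,
    "TaskTracker": 2,
    "map slots (24 x 1GB)": 24 * 1,
    "HBase RegionServer": 8,
}

committed_gb = sum(heaps_gb.values())
remaining_gb = RAM_GB - committed_gb

print("heap committed: %dGB" % committed_gb)
print("left for OS and cache: %dGB" % remaining_gb)
```

Under those assumptions the heaps sum to 36GB, leaving about 28GB for the
kernel and cache when every heap is fully used.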

<<attachment: cpu-use.png>>
