I have a small cluster of 15 machines running Hadoop 1.0.2. Each machine runs kernel 2.6.35 and has its root disk mounted over NFS (all machines share the same root file system), plus a local disk mounted under /mnt/localdisk. I installed Hadoop under /mnt/localdisk/hadoop, with the conf directory shared across all machines (so I can change the configuration for the whole cluster in one place). I am using JDK 1.6.0_23, installed locally under /mnt/localdisk/jdk. Each machine runs a DataNode and a TaskTracker; each TaskTracker has 2 map slots and 2 reduce slots, configured as shown below.
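For reference, the slot limits are set with the standard Hadoop 1.x properties in mapred-site.xml, roughly like this:

    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>2</value>
    </property>
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>2</value>
    </property>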
The problem is that, after running various MapReduce jobs, the JVM crashes pretty frequently on many machines. There is no pattern I can find: sometimes the DataNode crashes, sometimes the TaskTracker, sometimes both. They generate an hs_err file reporting SIGBUS (0x7); I can post its contents if needed, but I could not find anything interesting in it. Has anyone had this problem? My current suspicion is the shared root file system: since /tmp lives on the NFS root, every machine writes into the same /tmp. HotSpot, for instance, memory-maps a performance file under /tmp/hsperfdata_<user>/<pid>, so JVMs on two machines that happen to get the same PID would map the same file, and truncating a file another process has mmap'd is exactly the kind of thing that raises SIGBUS. Could something like that, or Hadoop itself writing temporary files into the shared /tmp, explain the crashes?
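In case it helps, this is the kind of change I am considering: pointing everything Hadoop writes at the local disk instead of the NFS root. hadoop.tmp.dir, mapred.local.dir and dfs.data.dir are the stock Hadoop 1.x property names; the paths are just my layout, so treat this as a sketch rather than a tested fix.

In core-site.xml:

    <!-- move Hadoop's scratch space off the shared NFS root -->
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/mnt/localdisk/hadoop/tmp</value>
    </property>

In mapred-site.xml:

    <!-- intermediate map output and task working dirs on the local disk -->
    <property>
      <name>mapred.local.dir</name>
      <value>/mnt/localdisk/hadoop/mapred/local</value>
    </property>

In hdfs-site.xml:

    <!-- DataNode block storage on the local disk -->
    <property>
      <name>dfs.data.dir</name>
      <value>/mnt/localdisk/hadoop/dfs/data</value>
    </property>

On top of that I would add -XX:-UsePerfData to HADOOP_OPTS in hadoop-env.sh, which disables the mmap'd hsperfdata file for the daemons (I am not sure setting java.io.tmpdir relocates it, hence the flag).

Any help would be greatly appreciated. Thanks, Mihail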
