Hello,

I'm new to Hadoop and I'm learning on a small cluster made of VMs.
I'm seeing some strange behavior and I don't know what's going on.

When I run some MapReduce tasks, a NodeManager sometimes gets killed. According to its logs, it receives signal 15 (SIGTERM) and dies. I think every process on that node running as the NodeManager user (named yarn here) receives signal 15, because my ssh connection to the node as yarn was killed at the same time.

I don't see anything on the system side. The NodeManager's logs show that signal 15 was received, but not much else. I don't know who sends it or why.

I don't really know what relevant information to provide, so don't hesitate to ask for more :-)

yarn@hadoop5:/home/hadoop/hadoop-2.7.1/logs$ java -version
java version "1.8.0_51"
Java(TM) SE Runtime Environment (build 1.8.0_51-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.51-b03, mixed mode)

yarn@hadoop5:/home/hadoop/hadoop-2.7.1$ ./bin/hadoop version
Hadoop 2.7.1
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 15ecc87ccf4a0228f35af08fc56de536e6ce657a
Compiled by jenkins on 2015-06-29T06:04Z
Compiled with protoc 2.5.0
From source with checksum fc0a1a23fc1868e4d5ee7fa2b28a58a
This command was run using /home/hadoop/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar


Log of a killed NodeManager:
http://pastebin.com/FEyQVanp

Configuration of the NodeManagers:
http://pastebin.com/k2VS7v8y

Thanks,

--
Nicolas
