Heartbeat mechanism
A Hadoop cluster runs in master/slave mode: the master hosts the NameNode and
ResourceManager, and each slave hosts a DataNode and a NodeManager.
When the master starts, it opens an IPC server and waits for slave heartbeats.
When a slave starts, it connects to the master and sends it a heartbeat every
3 seconds by default; this interval is set through the "dfs.heartbeat.interval"
property.
Each heartbeat reports the slave's own state to the master, and the master uses
the heartbeat's return value to pass instructions back to the slave node.
You can tune how quickly a silent node is detected through the
heartbeat.recheck.interval configuration.
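To make the exchange concrete, here is a minimal sketch of the master-side
bookkeeping, assuming the master simply remembers the time of each slave's last
heartbeat. The class HeartbeatMonitor, its methods, and the "NOOP" reply are
hypothetical names for illustration only, not Hadoop's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Toy master-side tracker (hypothetical, not Hadoop's implementation)."""

    expire_interval_s: float
    last_seen: dict = field(default_factory=dict)

    def record_heartbeat(self, node: str, now_s: float) -> str:
        """Remember when the node last reported and reply with an instruction."""
        self.last_seen[node] = now_s
        return "NOOP"  # hypothetical instruction carried in the heartbeat reply

    def dead_nodes(self, now_s: float) -> list:
        """Return nodes whose last heartbeat is older than the expiry interval."""
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now_s - seen > self.expire_interval_s
        )


# 630 s is an assumed expiry interval (10.5 minutes).
monitor = HeartbeatMonitor(expire_interval_s=630.0)
monitor.record_heartbeat("slave1", now_s=0.0)
monitor.record_heartbeat("slave2", now_s=0.0)

# slave2 stops here; after 5 minutes of silence it still counts as live.
print(monitor.dead_nodes(now_s=300.0))

monitor.record_heartbeat("slave1", now_s=600.0)

# Only once the silence exceeds the expiry interval is slave2 reported dead.
print(monitor.dead_nodes(now_s=700.0))
```

The point of the sketch is that a stopped slave does not announce anything: the
master only notices the missing heartbeats once the expiry interval has passed.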
By default, the NameNode marks a DataNode as dead only after it has received no
heartbeat from it for 10.5 minutes. The lists of dead and live DataNodes are
updated at that point.
The NameNode computes this timeout, heartbeatExpireInterval, with the following
formula:
heartbeatExpireInterval = 2 * heartbeatRecheckInterval + 10 * heartbeatInterval
where heartbeatRecheckInterval is defined by the configuration
heartbeat.recheck.interval, which is 5 minutes by default, and heartbeatInterval
by dfs.heartbeat.interval, which is 3 seconds by default.
Hence, with the defaults:
heartbeatExpireInterval = 2 * 5 minutes + 10 * 3 seconds = 10.5 minutes
The same applies to YARN: the ResourceManager also marks a NodeManager as lost
only after a heartbeat timeout expires.
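As a quick check of the arithmetic, the formula can be evaluated in a few lines
of Python. The function name and the choice of seconds as the unit are
assumptions made for this sketch, and the 15-second recheck value in the second
call is only an illustrative setting, not a recommendation.

```python
def heartbeat_expire_interval_s(recheck_interval_s: float, heartbeat_interval_s: float) -> float:
    """heartbeatExpireInterval = 2 * heartbeatRecheckInterval + 10 * heartbeatInterval."""
    return 2 * recheck_interval_s + 10 * heartbeat_interval_s


# Defaults quoted in this mail: 5 minutes recheck, 3 seconds heartbeat.
default_expire_s = heartbeat_expire_interval_s(5 * 60, 3)
print(default_expire_s / 60)  # 10.5 (minutes)

# Illustrative tuning for a development cluster: 15 seconds recheck.
tuned_expire_s = heartbeat_expire_interval_s(15, 3)
print(tuned_expire_s)  # 60 (seconds)
```

Lowering the recheck interval shortens the wait before a stopped node is shown
as dead, at the cost of declaring nodes dead sooner after a transient network
problem.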
Thanks & Regards
Brahma Reddy Battula
________________________________
From: MrAsanjar . [[email protected]]
Sent: Wednesday, July 02, 2014 11:12 PM
To: [email protected]
Subject: namenode doesn't receive datanode deactivate event
Hi all,
I have a small Hadoop 2.2.0 development cluster consisting of a master node
(namenode+resourcemanager) and 4 slave nodes (datanodes+nodemanager).
My configuration lets me add slave nodes dynamically by executing the
commands:
.../sbin/hadoop-daemons.sh start datanode
.../sbin/yarn-daemons.sh start nodemanager
I can verify the activation of the new slave node by executing the "jps"
command on the newly created node (datanode and nodemanager are active) and by
monitoring namenode health at http://{masternode_ip}:50070
However, when I deactivate any of the Hadoop slave nodes by executing the
commands:
../sbin/hadoop-daemons.sh stop datanode
../sbin/yarn-daemons.sh stop nodemanager
the namenode health page at http://{masternode_ip}:50070 does not show the
deactivation of the slave node, even though I can verify the shutdown of the
datanode and nodemanager JVM processes by executing "jps" on the slave node.
The namenode eventually marks the slave node dead after 20-30 minutes.
What am I missing here? Why are the namenode and resourcemanager not notified
of the datanode and nodemanager deactivation?
Please help, thanks