Steve Loughran wrote:
Raghu Angadi wrote:
A heartbeat is also an RPC. When you pause the Namenode for 30 sec, the
datanode's heartbeat thread just waits 30 sec for its heartbeat
RPC to return. Note that when you pause the Namenode, the RPCs to it don't
fail immediately. During this wait, DNs can perform other transactions
like serving data to clients.
If the heartbeat were just telling the NN that the DN is alive,
That's not the case in Hadoop. Central servers don't actively contact
their slaves. It's been a long-standing problem. For example, anything the
NN wants to tell a DN has to be in the form of a response to a
heartbeat or another RPC.
Raghu.
you could do it with a UDP packet that didn't block the DN. If, however, the DN
needs to know/care that the NN is up, then you do need to care about the
state of the namenode. But you don't have to do it blocking; looking for
a UDP reply some time later is all you need to do.
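
To make the non-blocking idea concrete, here is a rough sketch in Java. It is not Hadoop code: the class UdpHeartbeatSketch, its method names, and the "HB" payload are all made up for illustration, and it assumes the NN side would answer each datagram with a short UDP reply. The DN fires a datagram and returns at once, then polls for replies whenever convenient.

```java
import java.io.IOException;
import java.net.SocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not part of Hadoop: a datanode-side heartbeat that
// never blocks on the namenode.
public class UdpHeartbeatSketch {
    private final DatagramChannel channel;
    private final SocketAddress nameNode;
    private long lastAckMillis;

    public UdpHeartbeatSketch(SocketAddress nameNode) throws IOException {
        this.nameNode = nameNode;
        this.channel = DatagramChannel.open();
        this.channel.bind(null);                // let the OS pick a local port
        this.channel.configureBlocking(false);  // send/receive return immediately
        this.lastAckMillis = System.currentTimeMillis();
    }

    // Fire-and-forget: returns at once even if the namenode is paused.
    public void sendHeartbeat() throws IOException {
        byte[] payload = "HB".getBytes(StandardCharsets.US_ASCII); // assumed payload
        channel.send(ByteBuffer.wrap(payload), nameNode);
    }

    // Drain any replies that have arrived; never waits for one.
    // Returns true if at least one reply was seen during this call.
    public boolean pollAck() throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(16);
        boolean seen = false;
        while (channel.receive(buf) != null) {  // null means nothing pending
            seen = true;
            lastAckMillis = System.currentTimeMillis();
            buf.clear();
        }
        return seen;
    }

    // The DN decides for itself how long a silence means "NN is down".
    public boolean nameNodeLooksDown(long timeoutMillis) {
        return System.currentTimeMillis() - lastAckMillis > timeoutMillis;
    }

    public void close() throws IOException {
        channel.close();
    }
}
```

The DN would call sendHeartbeat() on its usual interval and pollAck() some time later; since UDP can drop packets, nameNodeLooksDown() should only trip after several intervals have gone by without a reply.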