[
https://issues.apache.org/jira/browse/HDFS-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15002293#comment-15002293
]
Kihwal Lee commented on HDFS-9239:
----------------------------------
bq. NN heartbeat processing with a lockless + tryLock implementation would make
it ideally suited for the existing client and/or service servers.
NN should still enforce a max number of skips and guarantee commands are sent
in bounded time. Replication or block recovery is done through an asynchronous
protocol, but oftentimes clients expect them to be done "soon".
> DataNode Lifeline Protocol: an alternative protocol for reporting DataNode
> liveness
> -----------------------------------------------------------------------------------
>
> Key: HDFS-9239
> URL: https://issues.apache.org/jira/browse/HDFS-9239
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Reporter: Chris Nauroth
> Assignee: Chris Nauroth
> Attachments: DataNode-Lifeline-Protocol.pdf, HDFS-9239.001.patch
>
>
> This issue proposes introduction of a new feature: the DataNode Lifeline
> Protocol. This is an RPC protocol that is responsible for reporting liveness
> and basic health information about a DataNode to a NameNode. Compared to the
> existing heartbeat messages, it is lightweight and not prone to resource
> contention problems that can harm accurate tracking of DataNode liveness
> currently. The attached design document contains more details.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)