[
https://issues.apache.org/jira/browse/HDFS-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Nauroth updated HDFS-9239:
--------------------------------
Release Note: This release adds a new feature called the DataNode Lifeline
Protocol. When configured, DataNodes can report to the NameNode that they are
still alive through a fallback protocol that is separate from the existing
heartbeat messages. This can prevent the NameNode from incorrectly marking
DataNodes as stale or dead in heavily overloaded clusters where heartbeat
processing is delayed. For more information, refer to the hdfs-default.xml
documentation for the new configuration properties:
dfs.namenode.lifeline.rpc-address, dfs.namenode.lifeline.rpc-bind-host,
dfs.datanode.lifeline.interval.seconds, dfs.namenode.lifeline.handler.ratio, and
dfs.namenode.lifeline.handler.count.
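
To give a feel for what enabling the feature might look like, here is a minimal
sketch that renders a few of the properties named above as hdfs-site.xml style
<property> entries. The host name, port, and numeric values are placeholders
chosen for illustration, not recommended settings or defaults, and the helper
name render_properties is hypothetical. hdfs-default.xml remains the
authoritative description of each property.

```python
import xml.etree.ElementTree as ET

# Placeholder values for illustration only; consult hdfs-default.xml
# for the actual semantics and defaults of each property.
LIFELINE_SETTINGS = {
    "dfs.namenode.lifeline.rpc-address": "nn1.example.com:8050",  # hypothetical host:port
    "dfs.datanode.lifeline.interval.seconds": "9",                # hypothetical interval
    "dfs.namenode.lifeline.handler.ratio": "0.10",                # hypothetical ratio
}


def render_properties(settings):
    """Return one <property> element per setting, as XML text."""
    chunks = []
    for name, value in settings.items():
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
        chunks.append(ET.tostring(prop, encoding="unicode"))
    return "\n".join(chunks)


if __name__ == "__main__":
    # Paste the printed entries inside <configuration> in hdfs-site.xml.
    print(render_properties(LIFELINE_SETTINGS))
```

Only the property names come from the release note. Which of them a given
cluster needs depends on its deployment.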
> DataNode Lifeline Protocol: an alternative protocol for reporting DataNode
> liveness
> -----------------------------------------------------------------------------------
>
> Key: HDFS-9239
> URL: https://issues.apache.org/jira/browse/HDFS-9239
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Reporter: Chris Nauroth
> Assignee: Chris Nauroth
> Fix For: 2.8.0
>
> Attachments: DataNode-Lifeline-Protocol.pdf, HDFS-9239.001.patch,
> HDFS-9239.002.patch, HDFS-9239.003.patch
>
>
> This issue proposes the introduction of a new feature: the DataNode Lifeline
> Protocol. This is an RPC protocol that is responsible for reporting liveness
> and basic health information about a DataNode to a NameNode. Compared to the
> existing heartbeat messages, it is lightweight and not prone to the resource
> contention problems that can currently harm accurate tracking of DataNode
> liveness. The attached design document contains more details.
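
The idea described above can be illustrated with a toy model. This is a sketch
under stated assumptions, not the implementation in the attached patches: the
class LivenessTracker, its method names, and the 30-second threshold are all
hypothetical. It only shows why a separate, cheap liveness message keeps a node
from being declared dead while full heartbeats are stalled.

```python
class LivenessTracker:
    """Toy NameNode-side record of when each DataNode was last known alive."""

    def __init__(self, dead_after_seconds):
        self.dead_after_seconds = dead_after_seconds
        self.last_seen = {}  # DataNode id -> time of last liveness update

    def on_heartbeat(self, node_id, now):
        # A full heartbeat proves liveness; its other processing is omitted here.
        self.last_seen[node_id] = now

    def on_lifeline(self, node_id, now):
        # A lifeline message only refreshes liveness, so it stays cheap.
        self.last_seen[node_id] = now

    def is_dead(self, node_id, now):
        last = self.last_seen.get(node_id)
        return last is None or now - last > self.dead_after_seconds


# Hypothetical timeline: heartbeats stall after t=0, lifelines keep arriving.
tracker = LivenessTracker(dead_after_seconds=30)
tracker.on_heartbeat("dn-1", now=0)
tracker.on_lifeline("dn-1", now=20)
tracker.on_lifeline("dn-1", now=40)
```

With the lifeline updates, is_dead("dn-1", now=60) is false in this model.
Without them, the same node would be considered dead at that time, because its
last heartbeat was 60 seconds earlier.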