[ https://issues.apache.org/jira/browse/HDFS-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated HDFS-9239:
--------------------------------
    Attachment: HDFS-9239.002.patch

I'd like to proceed with this feature, since comments on other JIRAs have 
mentioned it as potentially relevant.  I'm attaching patch v002 with just a few 
small changes:
# Rebase on current trunk.
# Address comments from Anu.
# Fix a few Checkstyle warnings.  I think the remaining Checkstyle warnings 
flagged in the last pre-commit run are not worth addressing, but I'll review 
the next pre-commit run for new warnings.

There was a suggestion to change the existing heartbeat handling to use 
tryLock.  I explored this a bit, but I'm reluctant to alter mainline heartbeat 
processing at all.  Overall, I think this feature is less intrusive as 
currently implemented, even though another RPC server adds some operational 
complexity.  Perhaps a tryLock-based implementation of heartbeat handling could 
be done in a separate JIRA, again gated by a configuration flag, to enable 
further experimentation in large clusters.
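
For illustration only, a tryLock-based handler gated by a flag might look 
roughly like the sketch below.  This is not code from the patch or from HDFS: 
the class, method, and flag names are hypothetical, and a single ReentrantLock 
merely stands in for the lock that heartbeat processing contends on.  The idea 
is that liveness is always recorded, while the heavier processing is skipped 
when the lock is busy.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; none of these names come from HDFS or the patch.
class TryLockHeartbeatSketch {
  // Stand-in for the contended lock (an assumption for this example).
  private final ReentrantLock namesystemLock;
  // Stand-in for a configuration flag that gates the tryLock behavior.
  private final boolean tryLockEnabled;
  // Last time each DataNode was heard from, updated without the big lock.
  private final ConcurrentHashMap<String, Long> lastSeenMillis =
      new ConcurrentHashMap<>();

  TryLockHeartbeatSketch(ReentrantLock namesystemLock, boolean tryLockEnabled) {
    this.namesystemLock = namesystemLock;
    this.tryLockEnabled = tryLockEnabled;
  }

  /**
   * Records liveness, then attempts the full heartbeat processing.
   * Returns true if full processing ran, false if it was skipped because
   * the lock was busy and the tryLock behavior is enabled.
   */
  boolean handleHeartbeat(String datanodeId, long nowMillis) {
    // Liveness is recorded first, so a busy lock cannot delay it.
    lastSeenMillis.put(datanodeId, nowMillis);

    if (tryLockEnabled) {
      if (!namesystemLock.tryLock()) {
        return false; // Lock is busy: skip the heavyweight part this time.
      }
    } else {
      namesystemLock.lock(); // Flag off: block until the lock is available.
    }
    try {
      // Placeholder for full processing (storage reports, commands, etc.).
      return true;
    } finally {
      namesystemLock.unlock();
    }
  }

  /** Returns the last recorded heartbeat time, or null if never seen. */
  Long lastSeen(String datanodeId) {
    return lastSeenMillis.get(datanodeId);
  }

  public static void main(String[] args) {
    TryLockHeartbeatSketch sketch =
        new TryLockHeartbeatSketch(new ReentrantLock(), true);
    boolean full = sketch.handleHeartbeat("dn-1", System.currentTimeMillis());
    System.out.println("full processing ran: " + full);
  }
}
```

Even in this toy form, the change sits directly in the mainline heartbeat 
path, which is the part I'm reluctant to touch.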


> DataNode Lifeline Protocol: an alternative protocol for reporting DataNode 
> liveness
> -----------------------------------------------------------------------------------
>
>                 Key: HDFS-9239
>                 URL: https://issues.apache.org/jira/browse/HDFS-9239
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: DataNode-Lifeline-Protocol.pdf, HDFS-9239.001.patch, 
> HDFS-9239.002.patch
>
>
> This issue proposes the introduction of a new feature: the DataNode Lifeline 
> Protocol.  This is an RPC protocol that is responsible for reporting liveness 
> and basic health information about a DataNode to a NameNode.  Compared to the 
> existing heartbeat messages, it is lightweight and not prone to the resource 
> contention problems that can currently harm accurate tracking of DataNode 
> liveness.  The attached design document contains more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
