[ https://issues.apache.org/jira/browse/HDFS-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15176961#comment-15176961 ]
Tsz Wo Nicholas Sze commented on HDFS-9239:
-------------------------------------------
Should the try-catch be restructured like below?
{code}
@Override
public void run() {
  try {
    initialRegistrationComplete.await();
    while (shouldRun()) {
      try {
        if (lifelineNamenode == null) {
          lifelineNamenode = dn.connectToLifelineNN(lifelineNnAddr);
        }
        sendLifelineIfDue();
      } catch (IOException e) {
        LOG.warn("IOException in LifelineSender for " +
            BPServiceActor.this, e);
      }
      Thread.sleep(scheduler.getLifelineWaitTime());
    }
  } catch (InterruptedException e) {
    LOG.warn("LifelineSender interrupted", e);
  }
  LOG.info("LifelineSender for " + BPServiceActor.this + " exiting.");
}
{code}
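To illustrate the control flow this structure gives, here is a minimal standalone sketch. It is not part of the patch: LifelineLoopSketch, Task, and runLoop are hypothetical names, and the bounded for loop stands in for shouldRun(). The inner catch turns an IOException into a per-iteration event so the loop keeps going, while a single outer catch lets an InterruptedException end the loop and fall through to the exit message.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not code from the HDFS-9239 patch.
final class LifelineLoopSketch {

  /** Stand-in for one unit of work (e.g. sending a lifeline) that may fail. */
  interface Task {
    void run() throws IOException;
  }

  /**
   * Runs the task up to maxIterations times and records what happened.
   * IOException is handled inside the loop, so the next iteration still runs.
   * InterruptedException is handled once, outside the loop, and ends it.
   */
  static List<String> runLoop(Task task, int maxIterations, long sleepMillis) {
    List<String> events = new ArrayList<>();
    try {
      for (int i = 0; i < maxIterations; i++) {
        try {
          task.run();
          events.add("ok");
        } catch (IOException e) {
          // Recoverable: note the failure and retry on the next iteration.
          events.add("io-error");
        }
        // May throw InterruptedException, which leaves the loop entirely.
        Thread.sleep(sleepMillis);
      }
    } catch (InterruptedException e) {
      events.add("interrupted");
    }
    events.add("exiting");
    return events;
  }

  public static void main(String[] args) {
    int[] calls = {0};
    // Simulated task that fails on its second call only.
    Task flaky = () -> {
      if (++calls[0] == 2) {
        throw new IOException("simulated failure");
      }
    };
    System.out.println(runLoop(flaky, 3, 1L));
  }
}
```

In this sketch a failed iteration does not stop later iterations, and an interrupt during the sleep is the only thing that ends the loop early. Whether the real LifelineSender should also restore the thread's interrupt flag in the outer catch is a separate question that the sketch does not address.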
> DataNode Lifeline Protocol: an alternative protocol for reporting DataNode
> liveness
> -----------------------------------------------------------------------------------
>
> Key: HDFS-9239
> URL: https://issues.apache.org/jira/browse/HDFS-9239
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Reporter: Chris Nauroth
> Assignee: Chris Nauroth
> Attachments: DataNode-Lifeline-Protocol.pdf, HDFS-9239.001.patch,
> HDFS-9239.002.patch
>
>
> This issue proposes a new feature: the DataNode Lifeline Protocol. This is
> an RPC protocol responsible for reporting liveness and basic health
> information about a DataNode to a NameNode. Compared to the existing
> heartbeat messages, it is lightweight and not prone to the resource
> contention problems that can currently harm accurate tracking of DataNode
> liveness. The attached design document contains more details.