[
https://issues.apache.org/jira/browse/HDFS-11830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014386#comment-16014386
]
Mukul Kumar Singh commented on HDFS-11830:
------------------------------------------
Thanks for the patch, [~cheersyang]. My comments follow.
1) HeartbeatEndpointTask.java:134
I think the RPC endpoint should only ever be in the HEARTBEAT state at this
point, so we should transition to re-register only if the current state is
HEARTBEAT. If the endpoint is in any other state, we should raise an
exception.
2) Also, can you please rename this command to reregister? This would help
differentiate it from the registered command.
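
To illustrate comment 1, here is a minimal sketch of the guarded transition.
It is not the patch's code: the class EndpointSketch, the method
onReregisterCommand, and the SHUTDOWN enum value are hypothetical names chosen
for illustration, and only the HEARTBEAT and REGISTER states come from this
issue. The transition is assumed to be a single atomic compare-and-set.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for the datanode's RPC endpoint state holder. */
final class EndpointSketch {
    /** HEARTBEAT and REGISTER are named in the issue; SHUTDOWN is illustrative. */
    enum State { REGISTER, HEARTBEAT, SHUTDOWN }

    private final AtomicReference<State> state;

    EndpointSketch(State initial) {
        this.state = new AtomicReference<>(initial);
    }

    State getState() {
        return state.get();
    }

    /**
     * Handles a reregister command received in a heartbeat response.
     * Moves HEARTBEAT -> REGISTER atomically and rejects every other
     * starting state instead of silently overwriting it.
     */
    void onReregisterCommand() {
        if (!state.compareAndSet(State.HEARTBEAT, State.REGISTER)) {
            throw new IllegalStateException(
                "Reregister command received in unexpected state: " + state.get());
        }
    }
}
```

With this shape, an endpoint that is in HEARTBEAT moves to REGISTER, while a
command arriving in any other state fails loudly and leaves the state
unchanged.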
> Ozone: Datanode needs to re-register to SCM if SCM is restarted
> ---------------------------------------------------------------
>
> Key: HDFS-11830
> URL: https://issues.apache.org/jira/browse/HDFS-11830
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Critical
> Attachments: HDFS-11830-HDFS-7240.001.patch
>
>
> Problem description:
> # Start NN, DN, SCM
> # Restart SCM; the following warning appears in the SCM log
> 17/05/02 00:47:08 WARN node.SCMNodeManager: SCM receive heartbeat from
> unregistered datanode
> The datanode cannot re-establish communication with SCM afterwards. Propose to
> fix this by adding a new command in HB handling that tells the datanode to
> re-register with SCM. Once the datanode receives this command, it transitions
> back to the REGISTER state to proceed.