[
https://issues.apache.org/jira/browse/HDFS-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16017316#comment-16017316
]
Vinayakumar B commented on HDFS-11856:
--------------------------------------
Solution:
1. Consider the non-local restarting nodes as BAD only for the current pipeline
update, and remove them from the bad nodes once the pipeline update succeeds
(see the client-side sketch below).
2. Allow the datanode to accept replica transfers if the same replica with an
older genstamp already exists on the node, irrespective of the state of the
existing replica (see the datanode-side sketch below).
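A rough client-side sketch of point 1, kept deliberately separate from the actual DataStreamer code: the class, field and method names below (restartingNodes, onPipelineUpdateSuccess, ...) are illustrative assumptions, not real HDFS identifiers.

{code:java}
// Sketch of point 1: track non-local restarting nodes separately from the
// permanently excluded nodes, so they count as bad only for the ongoing
// pipeline update. All names are illustrative, not actual HDFS code.
import java.util.HashSet;
import java.util.Set;

class RestartAwareExclusionSketch {
  // Nodes excluded for the lifetime of the stream (current behaviour).
  private final Set<String> excludedNodes = new HashSet<>();
  // Nodes that sent OOB_RESTART; bad only until the next successful update.
  private final Set<String> restartingNodes = new HashSet<>();

  /** A non-local datanode reported OOB_RESTART. */
  void onRemoteNodeRestart(String node) {
    restartingNodes.add(node);
  }

  /** Nodes to avoid while rebuilding the current pipeline. */
  Set<String> badNodesForCurrentUpdate() {
    Set<String> bad = new HashSet<>(excludedNodes);
    bad.addAll(restartingNodes);
    return bad;
  }

  /** Pipeline update succeeded: restarting nodes become usable again. */
  void onPipelineUpdateSuccess() {
    restartingNodes.clear();
  }
}
{code}

And a corresponding sketch of point 2 on the datanode side: accept an incoming replica transfer when the existing replica of the same block carries an older generation stamp, whatever its state. Again, the types and the acceptTransfer method are assumptions for illustration, not the actual DataNode API.

{code:java}
// Sketch of point 2: a hypothetical acceptance check for replica transfers.
class ReplicaTransferCheckSketch {
  /** Minimal stand-in for an on-disk replica: block id + generation stamp. */
  static final class Replica {
    final long blockId;
    final long genStamp;
    Replica(long blockId, long genStamp) {
      this.blockId = blockId;
      this.genStamp = genStamp;
    }
  }

  /**
   * Accept the transfer if there is no conflicting replica on the node, or if
   * the existing replica has an older genstamp, irrespective of its state
   * (RBW, FINALIZED, ...).
   */
  static boolean acceptTransfer(Replica existing, long incomingBlockId,
                                long incomingGenStamp) {
    if (existing == null || existing.blockId != incomingBlockId) {
      return true; // no conflicting replica on this node
    }
    return existing.genStamp < incomingGenStamp;
  }
}
{code}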
> Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline
> updates
> ----------------------------------------------------------------------------------
>
> Key: HDFS-11856
> URL: https://issues.apache.org/jira/browse/HDFS-11856
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client, rolling upgrades
> Affects Versions: 2.7.3
> Reporter: Vinayakumar B
> Assignee: Vinayakumar B
>
> During a rolling upgrade, if the DN gets restarted, it will send a special
> OOB_RESTART status to all streams opened for write.
> 1. Local clients will wait 30 seconds for the datanode to come back.
> 2. Remote clients will consider these nodes as bad nodes and continue with
> pipeline recovery and the write. These restarted nodes will be treated as
> bad and excluded for the lifetime of the stream.
> In a small cluster, where the total number of nodes is just 3, each time a
> remote node restarts for the upgrade it will be excluded.
> So a stream initially writing to 3 nodes will end up writing to only one
> node at the end, since there are no other nodes to replace the excluded ones.
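For illustration only, a toy walk-through of the 3-node case described above, with made-up datanode names; it mimics only the lifetime-exclusion behaviour, not the real pipeline recovery code.

{code:java}
// Toy illustration: with lifetime exclusion, every remote restart in a
// 3-node cluster permanently shrinks the pipeline of an open stream.
import java.util.ArrayList;
import java.util.List;

class SmallClusterDegradationSketch {
  public static void main(String[] args) {
    List<String> pipeline = new ArrayList<>(List.of("dn1", "dn2", "dn3"));
    // dn2 and dn3 restart remotely during the rolling upgrade; the client
    // excludes each one for the lifetime of the stream and has no replacement.
    for (String restarted : List.of("dn2", "dn3")) {
      pipeline.remove(restarted);
      System.out.println("after " + restarted + " restarts: " + pipeline);
    }
    // A stream that started with 3 replicas ends up writing to dn1 only.
  }
}
{code}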