[
https://issues.apache.org/jira/browse/HDFS-11284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15793992#comment-15793992
]
Rakesh R commented on HDFS-11284:
---------------------------------
bq. The #3 still exists. Say we have these datanodes in our cluster:
Thanks [~yuanbo] for the detailed analysis. In your example, the replication
factor is being reduced from 4 to 3. IIUC, the {{ReplicaNotFoundException}}
occurs for the extra replica block, and that is expected because of the block
deletion. It would be great if you could explore the impact of this exception
and of the retries. Also, I'd appreciate it if you could add a unit test case
to show the behavior. Thanks!
If it is a case of under-replicated blocks, the coordinator datanode will hit
an exception during the block movement and send the error result to SPS. SPS
will then schedule retries, right?
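
A rough sketch of the report-and-retry flow asked about above, for discussion
only. Every name here ({{SpsRetrySketch}}, {{MoveResult}}, {{maxAttempts}}) is
hypothetical and not taken from the SPS code, and the retry cap is an assumed
policy rather than the actual behavior:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical sketch only; these are not the real SPS classes. */
public class SpsRetrySketch {
    /** Result a coordinator datanode might report for one block movement. */
    enum MoveResult { SUCCESS, FAILURE }

    private final int maxAttempts;                        // assumed retry cap
    private final Map<Long, Integer> attempts = new HashMap<>();
    private final Queue<Long> pending = new ArrayDeque<>();

    SpsRetrySketch(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    /** Queue a block for its first movement attempt. */
    void schedule(long blockId) {
        attempts.put(blockId, 0);
        pending.add(blockId);
    }

    /** Take the next block to hand to a coordinator, or null if none is pending. */
    Long nextToMove() {
        Long blockId = pending.poll();
        if (blockId != null) {
            attempts.merge(blockId, 1, Integer::sum);     // count this attempt
        }
        return blockId;
    }

    /**
     * Handle the result reported back by the coordinator.
     * Returns true if the block was re-queued for another attempt.
     */
    boolean onResult(long blockId, MoveResult result) {
        if (result == MoveResult.SUCCESS) {
            attempts.remove(blockId);
            return false;
        }
        int used = attempts.getOrDefault(blockId, 0);
        if (used < maxAttempts) {
            pending.add(blockId);                         // schedule a retry
            return true;
        }
        attempts.remove(blockId);                         // give up after maxAttempts
        return false;
    }

    public static void main(String[] args) {
        SpsRetrySketch sps = new SpsRetrySketch(2);
        sps.schedule(1001L);
        long first = sps.nextToMove();
        System.out.println(sps.onResult(first, MoveResult.FAILURE));  // re-queued
        long second = sps.nextToMove();
        System.out.println(sps.onResult(second, MoveResult.FAILURE)); // cap reached
    }
}
```

In this sketch a failed movement (for example, one that hit an exception on the
coordinator) is simply put back on the pending queue until the assumed cap is
reached. Whether SPS should bound retries at all is part of the open question.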
> [SPS]: Avoid running SPS under safemode and fix issues in target node
> choosing.
> -------------------------------------------------------------------------------
>
> Key: HDFS-11284
> URL: https://issues.apache.org/jira/browse/HDFS-11284
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, namenode
> Reporter: Yuanbo Liu
> Assignee: Yuanbo Liu
>
> Recently I've found that SPS is not stable under some conditions:
> * SPS runs under safe mode.
> * There are some overlapping nodes among the chosen target nodes.
> * The real replication number of a block doesn't match the replication factor.
> For example, the real replication is 2 while the replication factor is 3.