Rakesh R created HDFS-11032:
-------------------------------
Summary: [SPS]: Handling of block movement failure at the
coordinator datanode
Key: HDFS-11032
URL: https://issues.apache.org/jira/browse/HDFS-11032
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
The idea of this jira is to discuss and implement efficient handling of block
movement failures at the coordinator datanode. [Code
reference|https://github.com/apache/hadoop/blob/HDFS-10285/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/StoragePolicySatisfyWorker.java#L243].
Following are the possible errors during block movement:
# Network errors (IOException) - retry (maybe a hard-coded 2 retries) if the
block storage movement fails due to network errors. If it still ends up with
errors after 2 retries, then mark it as failure/retry to NN.
# No disk space (IOException) - no retries; marked as failure/retry to NN.
# Block pinned - no retries; marked as success/no-retry to NN. It is not
possible to relocate this block to another datanode.
# Gen_Stamp mismatches - no retries; marked as failure/retry to NN. It could be
that the file has been re-opened.
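
To make the discussion concrete, the four cases above could be sketched roughly as follows. This is only an illustration and not the code in StoragePolicySatisfyWorker: the class, enums, exception type and the BlockMove interface are all hypothetical names made up for this sketch, and the retry count of 2 is just the hard-coded value suggested above.

```java
import java.io.IOException;

public class BlockMoveFailureHandling {
    /** Hypothetical failure categories, one per case in the list above. */
    public enum FailureKind { NETWORK_ERROR, NO_DISK_SPACE, BLOCK_PINNED, GEN_STAMP_MISMATCH }

    /** Hypothetical status the coordinator would report back to the NN. */
    public enum ReportStatus { SUCCESS, SUCCESS_NO_RETRY, FAILURE_RETRY }

    /** Hypothetical exception that tags an IOException with its failure category. */
    public static class BlockMoveException extends IOException {
        public final FailureKind kind;
        public BlockMoveException(FailureKind kind, String message) {
            super(message);
            this.kind = kind;
        }
    }

    /** Hypothetical stand-in for one block movement attempt. */
    public interface BlockMove {
        void run() throws BlockMoveException;
    }

    /** Assumed hard-coded retry count for network errors. */
    public static final int MAX_NETWORK_RETRIES = 2;

    /** Runs the move and maps the outcome to the status reported to the NN. */
    public static ReportStatus moveWithPolicy(BlockMove move) {
        int retriesLeft = MAX_NETWORK_RETRIES;
        while (true) {
            try {
                move.run();
                return ReportStatus.SUCCESS;
            } catch (BlockMoveException e) {
                switch (e.kind) {
                    case NETWORK_ERROR:
                        if (retriesLeft > 0) {
                            retriesLeft--;
                            continue; // retry locally at the coordinator
                        }
                        return ReportStatus.FAILURE_RETRY; // retries exhausted, let NN retry
                    case BLOCK_PINNED:
                        // The block cannot be relocated, so NN should not retry.
                        return ReportStatus.SUCCESS_NO_RETRY;
                    case NO_DISK_SPACE:
                    case GEN_STAMP_MISMATCH:
                    default:
                        // No local retries; NN decides whether to reschedule.
                        return ReportStatus.FAILURE_RETRY;
                }
            }
        }
    }

    public static void main(String[] args) {
        ReportStatus pinned = moveWithPolicy(() -> {
            throw new BlockMoveException(FailureKind.BLOCK_PINNED, "block is pinned");
        });
        System.out.println("pinned block reported as: " + pinned);
    }
}
```

With this shape a network error is attempted at most 3 times in total (1 initial attempt plus 2 retries) before it is reported as failure/retry, while the other three cases are reported after the first attempt.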