Manoj Govindassamy created HDFS-11749:
-----------------------------------------
Summary: Ongoing file write fails when its pipeline DataNode is
pulled out for maintenance
Key: HDFS-11749
URL: https://issues.apache.org/jira/browse/HDFS-11749
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs
Affects Versions: 3.0.0-alpha1
Reporter: Manoj Govindassamy
Assignee: Manoj Govindassamy
The HDFS Maintenance State feature (HDFS-7877) is supposed to put DataNodes
into the ENTERING_MAINTENANCE state first; once all of their blocks are
sufficiently replicated, the DNs transition to the IN_MAINTENANCE state.
UNDER_CONSTRUCTION files, and any ongoing writes to them, should not fail
because of the maintenance state transition. However, in a few runs I have seen
ongoing writes to open files fail when their pipeline DNs are pulled out via
the Maintenance State feature. A test case is attached.
{code}
java.io.IOException: Failed to replace a bad datanode on the existing pipeline
due to no more good datanodes being available to try. (Nodes:
current=[DatanodeInfoWithStorage[127.0.0.1:49306,DS-eeca7153-fba2-4f2e-a044-0a292fc6dc6d,DISK],
DatanodeInfoWithStorage[127.0.0.1:49302,DS-a5adf33c-81d0-413b-879c-9c4d9acbb72a,DISK]],
original=[DatanodeInfoWithStorage[127.0.0.1:49306,DS-eeca7153-fba2-4f2e-a044-0a292fc6dc6d,DISK],
DatanodeInfoWithStorage[127.0.0.1:49302,DS-a5adf33c-81d0-413b-879c-9c4d9acbb72a,DISK]]).
The current failed datanode replacement policy is DEFAULT, and a client may
configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy'
in its configuration.
at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1299)
at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1365)
at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1545)
at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1460)
at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1443)
at org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1251)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:668)
{code}
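
For context on the DEFAULT replacement policy named in the exception: the
sketch below models, as I understand the rule documented for
'dfs.client.block.write.replace-datanode-on-failure.policy', when a client
tries to add a replacement DataNode to a shrunken pipeline. The class and
method names are hypothetical and this is not the Hadoop client code itself;
it only illustrates why a write with some of its pipeline DNs removed may go
looking for a new DN and fail if none is available.

```java
// Hypothetical, self-contained sketch of the documented DEFAULT rule.
// It is NOT the Hadoop implementation; names here are made up for illustration.
final class ReplaceDatanodePolicySketch {

    /**
     * @param replication r, the file's replication factor
     * @param existing    n, the number of DataNodes still in the pipeline
     * @param isAppend    whether the block is being appended to
     * @param isHflushed  whether the block has been hflushed
     * @return true if the client would try to add a replacement DataNode
     */
    static boolean shouldReplace(int replication, int existing,
                                 boolean isAppend, boolean isHflushed) {
        // No replacement when the pipeline is empty or already full.
        if (existing == 0 || existing >= replication) {
            return false;
        }
        // DEFAULT only replaces when r >= 3.
        if (replication < 3) {
            return false;
        }
        // Replace when at most floor(r/2) nodes remain ...
        if (existing <= replication / 2) {
            return true;
        }
        // ... otherwise only for appended or hflushed blocks.
        return isAppend || isHflushed;
    }

    public static void main(String[] args) {
        // r = 3, two DNs left, block hflushed: a replacement is attempted.
        System.out.println(shouldReplace(3, 2, false, true));
        // r = 3, two DNs left, plain write: the pipeline continues as is.
        System.out.println(shouldReplace(3, 2, false, false));
        // r = 3, one DN left: a replacement is attempted.
        System.out.println(shouldReplace(3, 1, false, false));
    }
}
```

When the rule says "replace" but no healthy DN can be found, the client raises
the IOException shown in the trace above.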