ZanderXu created HDFS-16601:
-------------------------------
Summary: Failed to replace a bad datanode on the existing pipeline
due to no more good datanodes being available to try
Key: HDFS-16601
URL: https://issues.apache.org/jira/browse/HDFS-16601
Project: Hadoop HDFS
Issue Type: Bug
Reporter: ZanderXu
Assignee: ZanderXu
In our production environment, we hit a bug with the following stack trace:
{code:java}
java.io.IOException: Failed to replace a bad datanode on the existing pipeline
due to no more good datanodes being available to try. (Nodes:
current=[DatanodeInfoWithStorage[127.0.0.1:59687,DS-b803febc-7b22-4144-9b39-7bf521cdaa8d,DISK],
DatanodeInfoWithStorage[127.0.0.1:59670,DS-0d652bc2-1784-430d-961f-750f80a290f1,DISK]],
original=[DatanodeInfoWithStorage[127.0.0.1:59670,DS-0d652bc2-1784-430d-961f-750f80a290f1,DISK],
DatanodeInfoWithStorage[127.0.0.1:59687,DS-b803febc-7b22-4144-9b39-7bf521cdaa8d,DISK]]).
The current failed datanode replacement policy is DEFAULT, and a client may
configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy'
in its configuration.
at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1418)
at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1478)
at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1704)
at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1605)
at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1587)
at org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1371)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:674)
{code}
The root cause is that DFSClient cannot detect an exception thrown by
TransferBlock during PipelineRecovery. If TransferBlock fails, the
DFSClient retries with every datanode in the cluster and then fails.
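
The failure mode described above can be illustrated with a toy model. This is only a sketch under assumptions, not the actual DataStreamer code: the class PipelineRecoverySketch, the Transfer interface, the findReplacement method and the dn1..dn5 names are all hypothetical. It assumes the client blames each failed transfer on the replacement target and excludes it, so a failure that does not depend on the target ends up exhausting the whole cluster.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these names are real HDFS classes or methods.
class PipelineRecoverySketch {

    /** Hypothetical stand-in for transferring the block to a replacement datanode. */
    interface Transfer {
        void run(String source, String target) throws IOException;
    }

    /**
     * Tries every datanode outside the pipeline as a replacement. Each failed
     * transfer is blamed on the target, which is then excluded; the source is
     * never suspected. Returns the chosen replacement, or throws once no
     * candidates remain. Every attempted target is appended to 'tried'.
     */
    static String findReplacement(List<String> cluster, List<String> pipeline,
                                  Transfer transfer, List<String> tried) throws IOException {
        Set<String> excluded = new LinkedHashSet<>(pipeline);
        String source = pipeline.get(0);
        for (String candidate : cluster) {
            if (excluded.contains(candidate)) {
                continue; // already in the pipeline or already blamed
            }
            tried.add(candidate);
            try {
                transfer.run(source, candidate);
                return candidate; // transfer succeeded, pipeline can be rebuilt
            } catch (IOException e) {
                excluded.add(candidate); // assumption: the target gets the blame
            }
        }
        throw new IOException("no more good datanodes being available to try");
    }

    public static void main(String[] args) {
        List<String> cluster = List.of("dn1", "dn2", "dn3", "dn4", "dn5");
        List<String> pipeline = List.of("dn1", "dn2");
        List<String> tried = new ArrayList<>();
        // The transfer fails regardless of which target is chosen.
        Transfer alwaysFails = (src, dst) -> {
            throw new IOException("transfer from " + src + " failed");
        };
        try {
            findReplacement(cluster, pipeline, alwaysFails, tried);
        } catch (IOException e) {
            System.out.println("tried " + tried + ", then failed: " + e.getMessage());
        }
    }
}
```

In this model the loop only stops early when a transfer succeeds, so when the cause of the failure is unrelated to the target, every remaining datanode is tried once before the IOException is raised.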
--
This message was sent by Atlassian Jira
(v8.20.7#820007)