Write pipeline does not recover from first node failure.
--------------------------------------------------------
Key: HADOOP-3234
URL: https://issues.apache.org/jira/browse/HADOOP-3234
Project: Hadoop Core
Issue Type: Bug
Affects Versions: 0.16.0
Reporter: Raghu Angadi
Priority: Blocker
While investigating HADOOP-3132, we had a misconfiguration that resulted in the
client writing to the first datanode in the pipeline with a 15 second write
timeout. As a result, the client breaks the pipeline, marking the first datanode
(DN1) as the bad node. It then restarts the pipeline with the rest of the
datanodes. But the next (second) datanode was stuck waiting for the earlier
block-write to complete. So the client repeats this procedure until it runs out
of datanodes and fails the write.
I think this should be a blocker for either 0.16 or 0.17.
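
The failure sequence described above can be sketched as a toy model. This is
not Hadoop's client code: the function write_block, the flag
downstream_aborts_old_write, and the exception text are hypothetical names made
up for illustration. The only assumptions taken from the report are that the
head of the pipeline times out on the first attempt, and that each remaining
datanode stays blocked on the earlier block-write unless it abandons it.

```python
def write_block(nodes, downstream_aborts_old_write):
    """Toy model of client-side pipeline recovery (hypothetical, not Hadoop code).

    Returns (surviving_pipeline, excluded_nodes) on success and raises
    OSError once every datanode has been marked bad.
    """
    pipeline = list(nodes)
    excluded = []
    # First attempt: the head times out because of the misconfigured
    # write timeout described in the report.
    head_timed_out = True
    while pipeline:
        if not head_timed_out:
            return pipeline, excluded
        # The client marks the current head as the bad node and retries
        # with the rest of the datanodes.
        excluded.append(pipeline.pop(0))
        # Assumption: the new head answers only if it abandoned the
        # earlier block-write; otherwise it is still stuck and times out too.
        head_timed_out = not downstream_aborts_old_write
    raise OSError("write failed, all datanodes marked bad: " + ", ".join(excluded))


if __name__ == "__main__":
    # Behaviour one would expect: only DN1 is dropped.
    print(write_block(["DN1", "DN2", "DN3"], downstream_aborts_old_write=True))
    # Behaviour in the report: every node is dropped in turn, then the write fails.
    try:
        write_block(["DN1", "DN2", "DN3"], downstream_aborts_old_write=False)
    except OSError as err:
        print(err)
```

With the flag set to False the model drops DN1, DN2 and DN3 one after another
and then fails, which mirrors the reported behaviour. With the flag set to True
only DN1 is excluded and the write continues on the remaining two datanodes.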