Write pipeline does not recover from first node failure.
--------------------------------------------------------

                 Key: HADOOP-3234
                 URL: https://issues.apache.org/jira/browse/HADOOP-3234
             Project: Hadoop Core
          Issue Type: Bug
    Affects Versions: 0.16.0
            Reporter: Raghu Angadi
            Priority: Blocker



While investigating HADOOP-3132, we had a misconfiguration that resulted in the
client writing to the first datanode in the pipeline with a 15-second write
timeout. As a result, the client breaks the pipeline, marking the first datanode
(DN1) as the bad node. It then restarts the pipeline with the rest of the
datanodes. But the next (second) datanode is stuck waiting for the earlier
block-write to complete. So the client repeats this procedure until it runs out
of datanodes and fails the write.
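
The sequence above can be illustrated with a small toy model. This is only a
sketch under stated assumptions, not Hadoop code: the class
PipelineRecoverySketch, the method survivingNodes, and the
downstreamAbortsOldWrite switch are all hypothetical names made up for
illustration. The switch models one possible contrast case, in which the
remaining datanodes abandon the earlier block-write when the pipeline breaks;
the report itself does not say how the fix should work.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the failure described above; none of these names are Hadoop APIs.
class PipelineRecoverySketch {

    /**
     * Simulates a client retrying a block write. Returns how many datanodes
     * remain in the pipeline when the write succeeds, or 0 if the client runs
     * out of datanodes and the write fails.
     *
     * @param nodeCount number of datanodes in the initial pipeline
     * @param downstreamAbortsOldWrite hypothetical switch: if true, surviving
     *        datanodes abandon the earlier block-write when the pipeline breaks
     */
    static int survivingNodes(int nodeCount, boolean downstreamAbortsOldWrite) {
        List<String> pipeline = new ArrayList<>();
        for (int i = 1; i <= nodeCount; i++) {
            pipeline.add("DN" + i);
        }

        // Assumption: the first attempt always times out at the head of the
        // pipeline (the misconfigured write timeout from the report).
        boolean headUnresponsive = true;

        while (!pipeline.isEmpty()) {
            if (!headUnresponsive) {
                return pipeline.size(); // the head accepted the write
            }
            String bad = pipeline.remove(0);
            System.out.println("marking " + bad + " as bad; "
                    + pipeline.size() + " datanode(s) left");
            // The new head was downstream in the broken pipeline. Unless it
            // abandons the earlier block-write, it stays stuck and looks bad
            // to the client on the next attempt as well.
            headUnresponsive = !downstreamAbortsOldWrite;
        }
        return 0; // ran out of datanodes: the write fails
    }

    public static void main(String[] args) {
        System.out.println("reported behaviour: " + survivingNodes(3, false));
        System.out.println("with abort on break: " + survivingNodes(3, true));
    }
}
```

In this model the reported behaviour discards every datanode in turn, while the
contrast case loses only DN1 and continues with the remaining nodes.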

I think this should be a blocker either for 0.16 or 0.17.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
