Client logic for 1st phase and 2nd phase failover is different
--------------------------------------------------------------

                 Key: HDFS-1237
                 URL: https://issues.apache.org/jira/browse/HDFS-1237
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs client
    Affects Versions: 0.20.1
            Reporter: Thanh Do


- Setup:
number of datanodes = 4
replication factor = 2 (2 datanodes in the pipeline)
number of failures injected = 2
failure type: crash
Where/when failures happen: there are two scenarios. In the first, two
datanodes crash at the same time during the first phase of the pipeline; in
the second, two datanodes crash during the second phase of the pipeline.
(A rough configuration sketch follows this list.)
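
Below is a minimal, self-contained sketch of this setup against the
MiniDFSCluster test harness. It is our illustration rather than the exact
harness we used: dfs.replication, dfs.heartbeat.interval, and
heartbeat.recheck.interval are, we believe, the 0.20-era keys, and the low
recheck value is an assumption we add so that the NN declares crashed
datanodes dead within a few seconds (the 1-second heartbeat is explained
under Details below).

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class PipelineFailoverSetup {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setInt("dfs.replication", 2);           // 2 datanodes in the pipeline
    conf.setLong("dfs.heartbeat.interval", 1);   // DN heartbeats every 1 second
    // Assumption: the NN recheck interval is lowered as well, so that crashed
    // DNs are declared dead within a few seconds rather than ~10 minutes.
    conf.setInt("heartbeat.recheck.interval", 1000);

    MiniDFSCluster cluster = new MiniDFSCluster(conf, 4, true, null); // 4 DNs
    try {
      FileSystem fs = cluster.getFileSystem();
      FSDataOutputStream out = fs.create(new Path("/pipeline-failover-test"));
      out.write(new byte[64 * 1024]); // 1st phase: pipeline set up, data streamed
      // crash two pipeline DNs here for scenario 1, or just before close()
      // (the 2nd phase, where the block is finalized) for scenario 2
      out.close();
    } finally {
      cluster.shutdown();
    }
  }
}
{code}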
 
- Details:
 
In this setting, we set the datanode's heartbeat interval to the namenode to
1 second. This is just to ensure that once the NN has declared a datanode
dead, the DFSClient will not be handed that dead datanode again. Here are our
observations:
 
1. If the two crashes happen during the first phase,
the client waits for 6 seconds (which, in this setting, is enough time for
the NN to detect the dead datanodes). After waiting for 6 seconds, the client
asks the NN again, the NN is able to hand out two fresh, healthy datanodes,
and the experiment is successful! (A sketch contrasting the two phases
follows observation 2.)
 
2. BUT, if the two crashes happen during the second phase (e.g. during
renameTo, when the datanode finalizes the block), the client *never waits for
6 secs*, which implies that the client's logic for the 1st and 2nd phases is
different. What happens here is that the DFSClient gives up and (we believe)
never falls back to the outer while loop to contact the NN again. So the two
crashes in this second phase are not masked properly, and the write operation
fails.
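
To make the asymmetry concrete, below is a minimal, self-contained sketch of
the two retry shapes as we understand them. This is our reconstruction, not
the actual DFSClient code: NameNodeStub, allocateBlockTargets, setupPipeline,
and flushAndFinalize are hypothetical stand-ins for the real
ClientProtocol/DataStreamer logic, and the 6-second sleep mirrors the wait we
observed in the first phase.

{code:java}
import java.io.IOException;
import java.util.List;

/** Hypothetical stand-in for the subset of the namenode RPC the sketch needs. */
interface NameNodeStub {
  List<String> allocateBlockTargets(String src);   // analogous to addBlock()
  void abandonBlock(String src);                   // analogous to abandonBlock()
}

class PipelinePhases {
  /** Hypothetical: builds the write pipeline; false if a target DN has crashed. */
  static boolean setupPipeline(List<String> targets) { return false; }

  /** Hypothetical: flushes the last packets and finalizes the block on close. */
  static boolean flushAndFinalize(List<String> pipeline) { return false; }

  // First phase: on failure the client abandons the block, backs off long
  // enough for the NN to declare the crashed DNs dead, and asks again.
  static List<String> firstPhase(NameNodeStub nn, String src)
      throws InterruptedException {
    while (true) {
      List<String> targets = nn.allocateBlockTargets(src);
      if (setupPipeline(targets)) {
        return targets;              // pipeline built; start streaming data
      }
      nn.abandonBlock(src);
      Thread.sleep(6000);            // the ~6 second wait we observed
    }
  }

  // Second phase: the behavior we observed -- no back-off and no fallback to
  // the outer retry loop above; the write simply fails.
  static void secondPhase(List<String> pipeline) throws IOException {
    if (!flushAndFinalize(pipeline)) {
      throw new IOException("close failed: all pipeline datanodes crashed");
    }
  }
}
{code}

A symmetric fix would presumably wrap the second phase in the same
abandon/back-off/re-request loop, but we have not worked out what the correct
recovery should look like there.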
 
In summary, scenario (1) is good, but scenario (2) is not successful. This shows
broken retry logic during the second phase.  (We note again that we changed
the setup a bit by setting the DN's heartbeat interval to 1 second.  If we used
the default interval, scenario (1) would fail too, because the NN would give the
client the same dead datanodes.)

This bug was found by our Failure Testing Service framework:
http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
For questions, please email us: Thanh Do (than...@cs.wisc.edu) and
Haryadi Gunawi (hary...@eecs.berkeley.edu)
