[ 
https://issues.apache.org/jira/browse/HDFS-14004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657564#comment-16657564
 ] 

Wei-Chiu Chuang commented on HDFS-14004:
----------------------------------------

Thanks [~ayushtkn] for root-causing the test failure. Good job, because I 
couldn't reproduce it locally.

The whole point of the test was to exercise the exact sequence of events in 
which a client issues recovery and then closes the file, and the DN then 
completes recovery and reports back to the NN. In that case, prior to the fix 
in HDFS-10240, the NN would increment the genstamp when the DN reported back, 
even though the file had already been closed, causing corruption (because of 
the genstamp mismatch). After the fix, the NN rejects closing the file if the 
file is under recovery.
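
To make that sequence concrete, here is a toy model of it. It is only a sketch and not HDFS code: the class ToyNameNode, its fields and its methods are all made up for illustration, and "corrupt" here simply means that the genstamp moved after the file was closed.

```java
// Toy model only: class, field and method names are hypothetical, not HDFS APIs.
final class LeaseRecoveryRaceSketch {

    /** Stand-in for the NN's state for a single file and its last block. */
    static final class ToyNameNode {
        private final boolean rejectCloseDuringRecovery; // true models the post-fix behaviour
        private long genStamp = 1;          // genstamp the NN currently records
        private long genStampAtClose = -1;  // genstamp the file was closed with
        private boolean underRecovery;
        private boolean closed;

        ToyNameNode(boolean rejectCloseDuringRecovery) {
            this.rejectCloseDuringRecovery = rejectCloseDuringRecovery;
        }

        /** Step 1: the client issues recovery. */
        void recoverLease() {
            underRecovery = true;
        }

        /** Step 2: the client closes the file; returns whether the NN accepted. */
        boolean close() {
            if (underRecovery && rejectCloseDuringRecovery) {
                return false;
            }
            closed = true;
            genStampAtClose = genStamp;
            return true;
        }

        /** Step 3: the DN reports the finished recovery and the NN bumps the genstamp. */
        void recoveryReported() {
            genStamp++;
            underRecovery = false;
        }

        /** In this model, a closed file whose genstamp moved after close counts as corrupt. */
        boolean isCorrupt() {
            return closed && genStampAtClose != genStamp;
        }
    }

    /** Replays recover, close, report in that order and returns the resulting NN state. */
    static ToyNameNode replay(boolean rejectCloseDuringRecovery) {
        ToyNameNode nn = new ToyNameNode(rejectCloseDuringRecovery);
        nn.recoverLease();
        nn.close();
        nn.recoveryReported();
        return nn;
    }

    public static void main(String[] args) {
        System.out.println("before fix, corrupt: " + replay(false).isCorrupt());
        System.out.println("after fix, corrupt:  " + replay(true).isCorrupt());
    }
}
```

With the flag off, the close is accepted while recovery is in flight and the later genstamp bump leaves the closed file mismatched. With the flag on, the close is rejected, so the file is still open when the DN reports back.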

 

If you let IBRs continue, the exact sequence described above can't be 
guaranteed, because the NN may receive the block report for the recovered block 
before the client requests closing the file. I feel solution #2 is more 
appropriate given the scenario under test.
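
The ordering concern can be illustrated with plain Java, outside of HDFS. The sketch below holds a "report" event back behind a java.util.concurrent.CountDownLatch until the "close" has been issued, so the order is the same on every run. It is a generic illustration under my own assumptions, not what either proposed solution does, and OrderedEventsSketch and its event names are made up.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Generic ordering sketch: hypothetical names, no HDFS classes involved.
final class OrderedEventsSketch {

    /** Runs one "close" and one "report" event and returns the order they happened in. */
    static List<String> runOnce() {
        List<String> observed = new CopyOnWriteArrayList<>();
        CountDownLatch closeAttempted = new CountDownLatch(1);

        // Stand-in for the DN-side report: it may not proceed until the latch opens.
        Thread reporter = new Thread(() -> {
            try {
                closeAttempted.await();
                observed.add("report");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reporter.start();

        // Stand-in for the client-side close; only afterwards is the report released.
        observed.add("close");
        closeAttempted.countDown();

        try {
            reporter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for reporter", e);
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

Without the latch, the reporter thread could record its event before or after the close, which is the same kind of nondeterminism that makes the test intermittent.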

> TestLeaseRecovery2#testCloseWhileRecoverLease fails intermittently in trunk
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-14004
>                 URL: https://issues.apache.org/jira/browse/HDFS-14004
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ayush Saxena
>            Assignee: Ayush Saxena
>            Priority: Major
>         Attachments: HDFS-14004-01.patch
>
>
> Reference
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/930/testReport/junit/org.apache.hadoop.hdfs/TestLeaseRecovery2/testCloseWhileRecoverLease/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
