Nsupyq opened a new pull request #3705:
URL: https://github.com/apache/hadoop/pull/3705


   
   ### Description of PR
   This PR fixes [HDFS-16322](https://issues.apache.org/jira/browse/HDFS-16322).
   
   The NameNode implementation of ClientProtocol.truncate(...) can cause data 
loss. If the DFS client drops the first response of a truncate RPC call, the 
retried call truncates the file again. Specifically, under concurrency, other 
clients may append new data and change the file length after the first 
execution of truncate(...). When truncate(...) is retried after that, it 
truncates the file a second time and discards the newly appended data.
   
   This patch uses the retry cache to avoid such data loss. When a truncate 
operation is applied for the first time, its status and the return value from 
the server are recorded in the retry cache. If the truncate is retried, the 
server reads the result directly from the retry cache and performs no 
operation that could break idempotence.
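
   The idea can be illustrated with a minimal, self-contained sketch. This is 
not the code in this patch: `RetryCacheSketch`, `CallKey`, and the in-memory 
file map are hypothetical stand-ins, and the sketch only assumes that a retried 
RPC carries the same client ID and call ID as the original call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Toy model of a NameNode-style retry cache; every name here is hypothetical. */
class RetryCacheSketch {
    /** Identifies one logical RPC; a retry is assumed to reuse both IDs. */
    static final class CallKey {
        private final String clientId;
        private final int callId;

        CallKey(String clientId, int callId) {
            this.clientId = clientId;
            this.callId = callId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof CallKey)) {
                return false;
            }
            CallKey other = (CallKey) o;
            return callId == other.callId && clientId.equals(other.clientId);
        }

        @Override
        public int hashCode() {
            return Objects.hash(clientId, callId);
        }
    }

    // Result of each truncate call that has already been applied.
    private final Map<CallKey, Boolean> retryCache = new HashMap<>();
    // Stand-in for the namespace: path -> file length in bytes.
    private final Map<String, Long> fileLengths = new HashMap<>();

    void create(String path, long length) {
        fileLengths.put(path, length);
    }

    void append(String path, long extraBytes) {
        fileLengths.merge(path, extraBytes, Long::sum);
    }

    long length(String path) {
        return fileLengths.get(path);
    }

    boolean truncate(String clientId, int callId, String path, long newLength) {
        CallKey key = new CallKey(clientId, callId);
        Boolean cached = retryCache.get(key);
        if (cached != null) {
            // Retry of an applied call: replay the result, touch nothing.
            return cached;
        }
        Long current = fileLengths.get(path);
        if (current == null) {
            throw new IllegalStateException("File does not exist: " + path);
        }
        if (newLength > current) {
            throw new IllegalArgumentException("Cannot truncate to a larger size");
        }
        fileLengths.put(path, newLength);
        boolean result = true;
        retryCache.put(key, result); // record the outcome for later retries
        return result;
    }
}
```

   In this toy model, truncating a 10-byte file to 5 bytes, appending 5 bytes 
from another client, and then retrying the truncate with the same IDs leaves 
the length at 10 bytes, because the retry is answered from the cache.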
   
   See [HDFS-16322](https://issues.apache.org/jira/browse/HDFS-16322) 
description for more details.
   
   ### How was this patch tested?
   
   We added a new unit test for the idempotency of the truncate operation in 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeRetryCache.java.
 The test issues a truncate operation on an existing file, removes the whole 
file, and then retries the previous operation. If truncate is idempotent, the 
retry returns successfully instead of throwing an exception saying that it 
truncates a non-existing file.
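
   The scenario the test exercises can be sketched outside Hadoop as follows. 
This is a toy model, not the test in TestNamenodeRetryCache.java: the class 
`TruncateRetryScenario`, its methods, and the `useRetryCache` switch are all 
hypothetical and exist only to contrast the behavior with and without a retry 
cache.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the truncate, delete, retry scenario; names are hypothetical. */
class TruncateRetryScenario {
    private final boolean useRetryCache;
    // Key is clientId + "#" + callId; value is the recorded truncate result.
    private final Map<String, Boolean> retryCache = new HashMap<>();
    // Stand-in for the namespace: path -> file length in bytes.
    private final Map<String, Long> fileLengths = new HashMap<>();

    TruncateRetryScenario(boolean useRetryCache) {
        this.useRetryCache = useRetryCache;
    }

    void create(String path, long length) {
        fileLengths.put(path, length);
    }

    void delete(String path) {
        fileLengths.remove(path);
    }

    boolean truncate(String clientId, int callId, String path, long newLength) {
        String key = clientId + "#" + callId;
        if (useRetryCache && retryCache.containsKey(key)) {
            return retryCache.get(key); // replay without touching the namespace
        }
        Long current = fileLengths.get(path);
        if (current == null) {
            throw new IllegalStateException("truncate on non-existing file: " + path);
        }
        fileLengths.put(path, Math.min(current, newLength));
        retryCache.put(key, true);
        return true;
    }

    /** Runs truncate, delete, retry and reports whether the retry succeeded. */
    static boolean retrySucceeds(boolean useRetryCache) {
        TruncateRetryScenario nn = new TruncateRetryScenario(useRetryCache);
        nn.create("/test", 10);
        nn.truncate("client", 1, "/test", 5);
        nn.delete("/test");
        try {
            return nn.truncate("client", 1, "/test", 5);
        } catch (IllegalStateException e) {
            return false; // the retry was re-executed against a missing file
        }
    }
}
```

   With the cache enabled, `retrySucceeds(true)` returns true because the retry 
is answered from the recorded result. With it disabled, the retry is executed 
again, hits the deleted file, and `retrySucceeds(false)` returns false.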
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


