Kazuyuki Tanimura created HADOOP-14239:
------------------------------------------

             Summary: S3A Retry Multiple S3 Key Deletion
                 Key: HADOOP-14239
                 URL: https://issues.apache.org/jira/browse/HADOOP-14239
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs/s3
    Affects Versions: 3.0.0-alpha2, 3.0.0-alpha1, 2.8.0, 2.8.1
         Environment: EC2, AWS
            Reporter: Kazuyuki Tanimura


When fs.s3a.multiobjectdelete.enable is true, S3A tries to delete multiple S3 
keys in a single request.

Although this is a great feature, it becomes problematic when AWS fails to 
delete some of the S3 keys in the deletion list. The aws-java-sdk retries the 
request internally, but that does not help because it simply resends the same 
list of S3 keys, including the ones that were already deleted. All subsequent 
retries then fail on the previously deleted keys, since they no longer exist. 
Eventually it throws an exception and the entire job fails.

Luckily, the AWS API reports which keys it failed to delete. S3A should retry 
only the keys that failed to be deleted.
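
A rough sketch of the proposed behaviour, not an actual patch. The class 
PartialDeleteRetry, the BulkDeleter interface and the maxAttempts parameter are 
hypothetical names made up for illustration; BulkDeleter stands in for one 
multi-object delete request and returns the keys AWS reported as not deleted. 
In aws-java-sdk 1.x that list should, as far as I know, be obtainable from 
MultiObjectDeleteException.getErrors().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: retry a bulk delete with only the keys that failed. */
class PartialDeleteRetry {

    /** Hypothetical stand-in for a single multi-object delete request. */
    interface BulkDeleter {
        /** Attempts to delete all given keys; returns the keys that were NOT deleted. */
        List<String> delete(List<String> keys);
    }

    /**
     * Deletes the given keys, resubmitting only the ones reported as failed.
     *
     * @return the keys still undeleted after maxAttempts (empty on full success)
     */
    static List<String> deleteWithRetry(BulkDeleter deleter, List<String> keys, int maxAttempts) {
        List<String> pending = new ArrayList<>(keys);
        for (int attempt = 1; attempt <= maxAttempts && !pending.isEmpty(); attempt++) {
            // Shrink the request to the failed keys, so keys that were already
            // deleted are never sent again.
            pending = new ArrayList<>(deleter.delete(pending));
        }
        return pending;
    }
}
```

The caller would raise an exception only if the returned list is non-empty 
after the last attempt. A real fix would probably also need a delay between 
attempts and a way to tell retryable errors from permanent ones (for example 
access denied); both are left out here.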



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
