[
https://issues.apache.org/jira/browse/HADOOP-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941466#comment-15941466
]
ASF GitHub Bot commented on HADOOP-14239:
-----------------------------------------
GitHub user kazuyukitanimura opened a pull request:
https://github.com/apache/hadoop/pull/208
HADOOP-14239. S3A Retry Multiple S3 Key Deletion
Hi @steveloughran
Sorry for sending so many requests.
I explained the problem here
https://issues.apache.org/jira/browse/HADOOP-14239
This pull request recursively retries deleting only the S3 keys that
failed to be deleted during a multiple object deletion, because the
aws-java-sdk retry does not help. If the retries still fail, it falls back
to single-key deletion.
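
The retry strategy described above can be sketched roughly as follows. This is not the code from the pull request; it is a minimal illustration that assumes a bulk-delete call which returns the keys it failed to delete. The names RetryingBulkDelete, BulkDeleter, SingleDeleter and deleteWithRetry are hypothetical and exist only for this example. In aws-java-sdk 1.x the per-key failures would presumably come from MultiObjectDeleteException.getErrors(), but the sketch deliberately avoids the SDK so that it stays self-contained.

```java
import java.util.List;

/** Illustrative sketch only; all names here are hypothetical, not S3A code. */
final class RetryingBulkDelete {

    /**
     * Stand-in for a multi-object delete request: receives the keys to
     * delete and returns the keys that were reported as NOT deleted.
     */
    interface BulkDeleter {
        List<String> deleteAll(List<String> keys);
    }

    /** Stand-in for a single-object delete request. */
    interface SingleDeleter {
        void delete(String key);
    }

    /**
     * Issues a bulk delete, then recursively retries only the keys that
     * failed. Once retriesLeft reaches zero, the remaining keys are deleted
     * one at a time instead.
     */
    static void deleteWithRetry(List<String> keys, BulkDeleter bulk,
                                SingleDeleter single, int retriesLeft) {
        if (keys.isEmpty()) {
            return; // nothing to delete
        }
        List<String> failed = bulk.deleteAll(keys);
        if (failed.isEmpty()) {
            return; // every key in this batch was deleted
        }
        if (retriesLeft > 0) {
            // Retry the failed subset only; keys that were already deleted
            // are never sent again.
            deleteWithRetry(failed, bulk, single, retriesLeft - 1);
        } else {
            // Out of bulk retries: fall back to single-key deletion.
            for (String key : failed) {
                single.delete(key);
            }
        }
    }

    private RetryingBulkDelete() {
    }
}
```

The point of passing only the failed subset to the next attempt is that each retry shrinks to the keys that still exist, which is the behaviour the issue says the SDK-level retry lacks.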
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/bloomreach/hadoop HADOOP-14239
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/hadoop/pull/208.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #208
----
commit 707773b6e14b61b31ecd5473eaafa75dd5217707
Author: kazu <[email protected]>
Date: 2017-03-25T01:29:19Z
HADOOP-14239. S3A Retry Multiple S3 Key Deletion
----
> S3A Retry Multiple S3 Key Deletion
> ----------------------------------
>
> Key: HADOOP-14239
> URL: https://issues.apache.org/jira/browse/HADOOP-14239
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 2.8.0, 3.0.0-alpha1, 3.0.0-alpha2, 2.8.1
> Environment: EC2, AWS
> Reporter: Kazuyuki Tanimura
>
> When fs.s3a.multiobjectdelete.enable == true, S3A tries to delete multiple
> S3 keys at once.
> Although this is a great feature, it becomes problematic when AWS fails to
> delete some of the S3 keys in the deletion list. The aws-java-sdk internally
> retries the deletion, but this does not help because it simply resends the
> same list of S3 keys, including the ones already deleted successfully. In
> that case, all subsequent retries fail on the previously deleted keys, since
> they no longer exist. Eventually it throws an Exception, and the entire job
> fails.
> Luckily, the AWS API reports which keys it failed to delete. S3A should
> retry only the keys that failed to be deleted.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)