[ https://issues.apache.org/jira/browse/HADOOP-18420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612686#comment-17612686 ]
Steve Loughran commented on HADOOP-18420:
-----------------------------------------
I think when deleting fake dirs we should also use Invoke.once() rather than a
retrying invoker (sketch below):
* it's only cleanup
* retained markers aren't a problem for recent clients
* maybe permissions problems are the cause of these failures
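For reference, a minimal sketch of what that could look like, assuming
Invoke.once() refers to the Invoker.once(action, path, operation) helper and
that a void-returning overload is available; the surrounding names
(BulkDeleter, deleteObjectKeys) are placeholders, not S3A's actual internals:
{code:java}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.s3a.Invoker;

public class FakeDirCleanup {

  /** Placeholder for whatever issues the actual bulk delete call. */
  interface BulkDeleter {
    void deleteObjectKeys(List<String> keys) throws IOException;
  }

  /**
   * Best-effort removal of parent directory markers: a single attempt and
   * no retry policy, since retained markers are harmless to recent clients.
   */
  static void deleteUnnecessaryFakeDirectories(Path path, List<String> keys,
      BulkDeleter deleter) throws IOException {
    Invoker.once("delete fake directories", path.toString(),
        () -> deleter.deleteObjectKeys(keys));
  }
}
{code}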
> Optimise S3A’s recursive delete to drop successful S3 keys on retry of S3
> DeleteObjects
> ---------------------------------------------------------------------------------------
>
> Key: HADOOP-18420
> URL: https://issues.apache.org/jira/browse/HADOOP-18420
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Daniel Carl Jones
> Priority: Major
>
> S3A users with large filesystems performing renames or deletes can run into
> throttling when S3A performs a bulk delete on keys. These deletes are
> currently issued in batches of 250 keys
> ([https://github.com/apache/hadoop/blob/c1d82cd95e375410cb0dffc2931063d48687386f/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java#L319-L323]).
> When the bulk delete ([S3
> DeleteObjects|https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html])
> fails, it provides a list of keys that failed and why. Today, S3A recovers
> from throttles by sending the DeleteObjects request again with no change.
> This re-sends deletes for keys that were already removed, adding extra
> mutations that count towards the same throttling limits. Instead, S3A should
> retry only the keys that failed, limiting the number of mutations against the
> S3 bucket and hopefully mitigating errors when deleting a large number of
> objects.
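A sketch of the proposed behaviour, outside S3A's actual code: it assumes the
v1 AWS SDK (com.amazonaws), where a partially failed DeleteObjects surfaces as
a MultiObjectDeleteException listing the failed keys; class and method names
here are illustrative only, not the patch.
{code:java}
import java.util.List;
import java.util.stream.Collectors;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;
import com.amazonaws.services.s3.model.DeleteObjectsRequest.KeyVersion;
import com.amazonaws.services.s3.model.MultiObjectDeleteException;

public class PartialBulkDeleteRetry {

  /**
   * Delete one page of keys, retrying only the keys S3 reports as failed
   * rather than resubmitting the whole batch.
   */
  public static void deleteWithPartialRetry(AmazonS3 s3, String bucket,
      List<KeyVersion> keys, int maxAttempts) {
    List<KeyVersion> remaining = keys;
    for (int attempt = 1; ; attempt++) {
      try {
        s3.deleteObjects(new DeleteObjectsRequest(bucket).withKeys(remaining));
        return;                                 // whole batch deleted
      } catch (MultiObjectDeleteException e) {
        if (attempt >= maxAttempts) {
          throw e;
        }
        // Keys absent from getErrors() were deleted; only resubmit failures.
        remaining = e.getErrors().stream()
            .map(MultiObjectDeleteException.DeleteError::getKey)
            .map(KeyVersion::new)
            .collect(Collectors.toList());
        // A real implementation would also back off before retrying.
      }
    }
  }
}
{code}
Note that a throttle of the whole request (a 503 on the call itself) would
still surface as a plain service exception and would need the existing retry
policy; the per-key path above only covers partial failures reported in the
DeleteObjects response.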