[ https://issues.apache.org/jira/browse/HADOOP-16430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Akira Ajisaka resolved HADOOP-16430.
------------------------------------
    Resolution: Fixed

> S3AFilesystem.delete to incrementally update s3guard with deletions
> -------------------------------------------------------------------
>
>                 Key: HADOOP-16430
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16430
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.2.0, 3.3.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>             Fix For: 3.3.0
>
>         Attachments: Screenshot 2019-07-16 at 22.08.31.png
>
>
> Currently S3AFilesystem.delete() only updates S3Guard with the deletions at
> the end of a paged delete operation. This makes it slow when there are many
> thousands of files to delete, and increases the window of vulnerability to
> failures.
> Preferred:
> * after every bulk DELETE call is issued to S3, queue the (async) delete of
> all entries in that POST.
> * at the end of the delete, await the completion of these operations.
> * inside S3AFS, also do the delete across threads, so that different HTTPS
> connections can be used.
> This should maximise DDB throughput against tables which aren't IO limited.
> When executed against small IOP limited tables, the parallel DDB DELETE
> batches will trigger a lot of throttling events; we should make sure these
> aren't going to trigger failures.
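
The first two points of the preferred approach (queue an async metadata delete after each bulk DELETE page, then await all of them at the end) could be sketched roughly as below. This is an illustration only, not the committed S3A code: `IncrementalDeleteSketch`, `ObjectStore`, `MetadataStore` and `deleteTree` are hypothetical names invented for this example, and the page size of 1000 is assumed from the S3 bulk delete limit. Throttling and retry handling against DDB are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch; none of these types are Hadoop APIs. */
final class IncrementalDeleteSketch {

  /** Stand-in for the S3 bulk DELETE call. */
  interface ObjectStore {
    void bulkDelete(List<String> keys);
  }

  /** Stand-in for the S3Guard metadata store (for example a DDB table). */
  interface MetadataStore {
    void deleteEntries(List<String> keys);
  }

  /**
   * Deletes keys page by page. After each bulk delete is issued, the matching
   * metadata delete is queued on the pool; all queued deletes are awaited
   * before returning. Returns the number of pages processed.
   */
  static int deleteTree(List<String> keys, int pageSize, ObjectStore store,
      MetadataStore metadata, ExecutorService pool) {
    List<CompletableFuture<Void>> pending = new ArrayList<>();
    for (int start = 0; start < keys.size(); start += pageSize) {
      int end = Math.min(start + pageSize, keys.size());
      // Copy the page so the async task does not depend on the source list.
      List<String> page = new ArrayList<>(keys.subList(start, end));
      store.bulkDelete(page);
      // Queue the metadata update right away instead of waiting for the
      // whole paged delete to finish.
      pending.add(CompletableFuture.runAsync(
          () -> metadata.deleteEntries(page), pool));
    }
    // Await completion; join() rethrows the first failure, if any.
    CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
    return pending.size();
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      // In-memory set standing in for the metadata table.
      Set<String> remaining = ConcurrentHashMap.newKeySet();
      List<String> keys = new ArrayList<>();
      for (int i = 0; i < 2500; i++) {
        keys.add("dir/file-" + i);
      }
      remaining.addAll(keys);
      int pages = deleteTree(keys, 1000, page -> { }, remaining::removeAll, pool);
      System.out.println(pages + " pages, " + remaining.size()
          + " metadata entries left");
    } finally {
      pool.shutdown();
    }
  }
}
```

Because each page's metadata delete is queued as soon as its bulk DELETE returns, a failure partway through leaves fewer stale entries than updating everything at the end would. Against a small IOP limited table, the pool size would need to be bounded and the metadata call wrapped in whatever throttle-aware retry policy the store provides.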