amogh-jahagirdar commented on code in PR #5379:
URL: https://github.com/apache/iceberg/pull/5379#discussion_r938272527
##########
aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java:
##########
@@ -168,56 +169,42 @@ public Map<String, String> properties() {
@Override
public void deleteFiles(Iterable<String> paths) throws
BulkDeletionFailureException {
if (awsProperties.s3DeleteTags() != null &&
!awsProperties.s3DeleteTags().isEmpty()) {
+ Set<Tag> deleteTags = awsProperties.s3DeleteTags();
Tasks.foreach(paths)
.noRetry()
.executeWith(executorService())
.suppressFailureWhenFinished()
.onFailure(
- (path, exc) ->
- LOG.warn(
- "Failed to add delete tags: {} to {}",
- awsProperties.s3DeleteTags(),
- path,
- exc))
- .run(path -> tagFileToDelete(path, awsProperties.s3DeleteTags()));
+ (path, exc) -> LOG.warn("Failed to add delete tags: {} to {}",
deleteTags, path, exc))
+ .run(path -> tagFileToDelete(path, deleteTags));
}
- if (!awsProperties.isS3DeleteEnabled()) {
- return;
+ if (awsProperties.isS3DeleteEnabled()) {
+ SetMultimap<String, String> bucketToObjects =
computeBucketToObjects(paths);
Review Comment:
@jackye1995 I don't think we need a separate API actually, we are already
given the iterable of paths. This iterable can be the ListObjectsV2Iterable
(which is done as of today in the deletePrefix).
We can still do it lazily by basically using the previous approach and
constructing the bucket/key from the path and keeping track of when the objects
for a bucket hits a certain batch size. It's just that using the Tasks
framework in its current state becomes infeasible since it operates on a list
and also will wait for completion. We want to just submit the batch deletion
and move on, and then at the very end check the statuses of all those tasks. So
using the underlying executor service fits that pattern. Let me know what you
think.
Will update the PR.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]