amogh-jahagirdar commented on code in PR #5379:
URL: https://github.com/apache/iceberg/pull/5379#discussion_r933901614


##########
aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java:
##########
@@ -168,56 +169,42 @@ public Map<String, String> properties() {
   @Override
   public void deleteFiles(Iterable<String> paths) throws 
BulkDeletionFailureException {
     if (awsProperties.s3DeleteTags() != null && 
!awsProperties.s3DeleteTags().isEmpty()) {
+      Set<Tag> deleteTags = awsProperties.s3DeleteTags();
       Tasks.foreach(paths)
           .noRetry()
           .executeWith(executorService())
           .suppressFailureWhenFinished()
           .onFailure(
-              (path, exc) ->
-                  LOG.warn(
-                      "Failed to add delete tags: {} to {}",
-                      awsProperties.s3DeleteTags(),
-                      path,
-                      exc))
-          .run(path -> tagFileToDelete(path, awsProperties.s3DeleteTags()));
+              (path, exc) -> LOG.warn("Failed to add delete tags: {} to {}", 
deleteTags, path, exc))
+          .run(path -> tagFileToDelete(path, deleteTags));
     }
 
-    if (!awsProperties.isS3DeleteEnabled()) {
-      return;
+    if (awsProperties.isS3DeleteEnabled()) {
+      SetMultimap<String, String> bucketToObjects = 
computeBucketToObjects(paths);
+      List<String> failures = 
Collections.synchronizedList(Lists.newArrayList());
+      bucketToObjects
+          .asMap()
+          .entrySet()
+          .forEach(

Review Comment:
   Technically, we could get even more throughput if the data is spread over 
multiple buckets, and parallelize this. But it's important that we do not 
parallelize over the buckets using the deletion threadpool otherwise we could 
easily deadlock. Another pool would need to be defined/used. For now, I'm 
opting for keeping it simple.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to