amogh-jahagirdar commented on code in PR #5379:
URL: https://github.com/apache/iceberg/pull/5379#discussion_r938272527


##########
aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java:
##########
@@ -168,56 +169,42 @@ public Map<String, String> properties() {
   @Override
   public void deleteFiles(Iterable<String> paths) throws 
BulkDeletionFailureException {
     if (awsProperties.s3DeleteTags() != null && 
!awsProperties.s3DeleteTags().isEmpty()) {
+      Set<Tag> deleteTags = awsProperties.s3DeleteTags();
       Tasks.foreach(paths)
           .noRetry()
           .executeWith(executorService())
           .suppressFailureWhenFinished()
           .onFailure(
-              (path, exc) ->
-                  LOG.warn(
-                      "Failed to add delete tags: {} to {}",
-                      awsProperties.s3DeleteTags(),
-                      path,
-                      exc))
-          .run(path -> tagFileToDelete(path, awsProperties.s3DeleteTags()));
+              (path, exc) -> LOG.warn("Failed to add delete tags: {} to {}", 
deleteTags, path, exc))
+          .run(path -> tagFileToDelete(path, deleteTags));
     }
 
-    if (!awsProperties.isS3DeleteEnabled()) {
-      return;
+    if (awsProperties.isS3DeleteEnabled()) {
+      SetMultimap<String, String> bucketToObjects = 
computeBucketToObjects(paths);

Review Comment:
   @jackye1995 I don't think we need a separate API actually, we are already 
given the iterable of paths.
   
    We can still do it lazily by basically using the previous approach and 
constructing the bucket/key from the path and keeping track of when the objects 
for a bucket hits a certain batch size. It's just that using the Tasks 
framework in its current state becomes infeasible since it operates on a list 
and also will wait for completion. We want to just submit the batch deletion 
and move on, and then at the very end check the statuses of all those tasks. So 
using the underlying executor service fits that pattern. Let me know what you 
think. 
    
    Will update the PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to