amogh-jahagirdar commented on code in PR #5379:
URL: https://github.com/apache/iceberg/pull/5379#discussion_r933876576


##########
aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java:
##########
@@ -167,6 +168,29 @@ public Map<String, String> properties() {
    */
   @Override
   public void deleteFiles(Iterable<String> paths) throws BulkDeletionFailureException {
+    deleteFilesWithBatchSize(paths, awsProperties.s3FileIoDeleteBatchSize());
+  }
+
+  /**
+   * Deletes the given paths in a batched manner with the given executor service
+   *
+   * @param paths paths to delete
+   * @param svc executor service to use for the deletion
+   */
+  @Override
+  public void deleteFiles(Iterable<String> paths, ExecutorService svc)
+      throws BulkDeletionFailureException {
+    Iterable<List<String>> pathBatches =
+        Iterables.partition(paths, awsProperties.s3FileIoDeleteBatchSize());
+    Tasks.foreach(pathBatches)
+        .noRetry()

Review Comment:
   
https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/core/retry/RetryMode.html#defaultRetryMode--
   
   So there are a few locations and an environment variable that the SDK reads 
to determine the default retry mode before falling back to the "LEGACY" 
setting, which is the case we hit with the default S3 client provided by the 
default AWS client factory. The legacy setting performs up to 3 retries. The 
docs mention that no token is subtracted from the token bucket when a request 
gets throttled, so with the maximum of 3 retries, throttled requests are 
retried as well.
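   
   To make the retry bound concrete, here is a minimal stdlib-only sketch of 
"up to N retries on throttling". It is not the SDK's implementation: 
`ThrottledException`, `callWithRetries` and `attemptsWhenAlwaysThrottled` are 
hypothetical names used only for illustration, and backoff and the token 
bucket are left out.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class BoundedRetrySketch {
  /** Hypothetical stand-in for a throttling error; not an AWS SDK type. */
  static class ThrottledException extends RuntimeException {}

  /** Runs the request once, then retries throttled calls at most maxRetries times. */
  static <T> T callWithRetries(Supplier<T> request, int maxRetries) {
    int retriesUsed = 0;
    while (true) {
      try {
        return request.get();
      } catch (ThrottledException e) {
        if (retriesUsed >= maxRetries) {
          throw e; // retry budget exhausted, surface the failure to the caller
        }
        retriesUsed++;
      }
    }
  }

  /** Counts how many attempts are made when every call is throttled. */
  static int attemptsWhenAlwaysThrottled(int maxRetries) {
    AtomicInteger attempts = new AtomicInteger();
    try {
      callWithRetries(
          () -> {
            attempts.incrementAndGet();
            throw new ThrottledException();
          },
          maxRetries);
    } catch (ThrottledException expected) {
      // expected: the request never succeeds in this sketch
    }
    return attempts.get();
  }

  public static void main(String[] args) {
    // With 3 retries, a request that is always throttled is attempted 4 times.
    System.out.println(attemptsWhenAlwaysThrottled(3));
  }
}
```

   Under these assumptions, each delete batch request would be attempted at 
most 4 times (1 initial call plus 3 retries) before the failure reaches the 
caller.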
   
   Considering that the S3 client is pluggable at runtime, users can also tune 
it as they wish. Although, for max retries and retry policies, we could also 
add AWS properties so that users do not have to create the S3 client 
themselves and pass it in; we could tackle that separately?
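   
   For reference, a rough sketch of what a user would have to do today to tune 
retries: build the S3 client with an explicit retry policy and hand it to 
`S3FileIO`. This assumes the AWS SDK v2 `ClientOverrideConfiguration` and 
`RetryPolicy` builders and the `S3FileIO` constructor that takes a client 
supplier; the class name, the STANDARD mode and the retry count of 5 are 
illustrative choices, not a recommendation.

```java
import org.apache.iceberg.aws.s3.S3FileIO;
import software.amazon.awssdk.core.client.config.ClientOverrideConfiguration;
import software.amazon.awssdk.core.retry.RetryMode;
import software.amazon.awssdk.core.retry.RetryPolicy;
import software.amazon.awssdk.services.s3.S3Client;

class TunedS3FileIOSketch {
  /** Builds an S3FileIO whose client uses an explicit retry policy (illustrative values). */
  static S3FileIO create() {
    return new S3FileIO(
        () ->
            S3Client.builder()
                // Region and credentials come from the SDK's default provider chains.
                .overrideConfiguration(
                    ClientOverrideConfiguration.builder()
                        .retryPolicy(
                            RetryPolicy.builder(RetryMode.STANDARD)
                                .numRetries(5) // assumed value, tune per workload
                                .build())
                        .build())
                .build());
  }
}
```

   Exposing the retry mode and max retries as AWS properties would make this 
boilerplate unnecessary for most users.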
   
   Let me know your thoughts and if I'm missing something! 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

