dramaticlly commented on a change in pull request #4052:
URL: https://github.com/apache/iceberg/pull/4052#discussion_r800302779



##########
File path: aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java
##########
@@ -309,6 +317,14 @@ public String s3FileIoSseKey() {
     return s3FileIoSseKey;
   }
 
+  public int s3DeletionBatchSize() {
+    return s3DeletionBatchSize;
+  }
+
+  public void setS3DeletionBatchSize(int deletionBatchSize) {
+    this.s3DeletionBatchSize = deletionBatchSize;

Review comment:
       curious to know, shall we allow it also to be set from aws properties  
with string k/v Map?

##########
File path: aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java
##########
@@ -213,13 +213,20 @@
    * Enables eTag checks for S3 PUT and MULTIPART upload requests.
    */
   public static final String S3_CHECKSUM_ENABLED = "s3.checksum-enabled";
+
+  /**
+   * Sets the size for batch deletion.
+   */
+  public static final int S3_DELETION_BATCH_SIZE_DEFAULT = 10000;

Review comment:
       nit, might worth putting a link to s3 documentation to reference where 
is 1000 come from 
https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html

##########
File path: aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java
##########
@@ -94,6 +106,53 @@ public void deleteFile(String path) {
     client().deleteObject(deleteRequest);
   }
 
+  /**
+   * Performs a batch deletion of the given paths. This implementation does 
not require
+   * that the passed in paths refer to objects which are in the same bucket.
+   * @param paths poths to delete

Review comment:
       nit: I would probably worded a bit differently like `Performs a batch 
deletion of given paths grouped by the bucket. Deletion will trigger whenever 
configured batch size reached for given bucket or for each bucket at the end.
   
   also probably typo in param comment? `s/poths/paths`

##########
File path: api/src/main/java/org/apache/iceberg/io/FileIO.java
##########
@@ -68,6 +68,15 @@ default void deleteFile(OutputFile file) {
   default void initialize(Map<String, String> properties) {
   }
 
+  /**
+   * Delete the files at the given path.
+   */
+  default void deleteFiles(Iterable<String> pathsToDelete) {

Review comment:
       I guess it's fine to leave it as Iterable as I feel the java set also 
implements the Iterable interface, so if needed the consumer of this method can 
pass the set for de-duplication purpose. If there's a need to preserve the 
ordering, then they can pass other iterable like list as well. From correctness 
perspective, the concrete file system shall be handle the duplicate file path 
for deletion.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to