amogh-jahagirdar opened a new pull request #4052:
URL: https://github.com/apache/iceberg/pull/4052


   Starting a draft PR for performing batch deletion for S3 objects. Related 
issue: https://github.com/apache/iceberg/issues/4012
   
   This can be useful for the expire snapshots and remove orphan files 
operations. In this PR we update the FileIO interface and add an S3 
implementation that performs the batch removal. 
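   
   To make the interface change concrete, here is a rough sketch of the shape 
it could take. The names `BatchDeletingFileIO`, `deleteFiles` and 
`FakeBulkStore` are hypothetical and only for illustration; they are not the 
actual Iceberg API or the code in this PR. The assumption is that batch 
deletion falls back to one delete per path unless an implementation overrides 
it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a FileIO-style interface; not the real Iceberg API.
interface BatchDeletingFileIO {
  void deleteFile(String path);

  // Assumed default: fall back to one delete call per path, so existing
  // implementations keep working without any changes.
  default void deleteFiles(List<String> paths) {
    for (String path : paths) {
      deleteFile(path);
    }
  }
}

// Hypothetical store that supports bulk removal. It only records the requests
// it would send; a real S3 implementation would issue a multi-object delete.
class FakeBulkStore implements BatchDeletingFileIO {
  final List<String> singleRequests = new ArrayList<>();
  final List<List<String>> bulkRequests = new ArrayList<>();

  @Override
  public void deleteFile(String path) {
    singleRequests.add(path);
  }

  @Override
  public void deleteFiles(List<String> paths) {
    // One request for the whole list instead of one request per path.
    bulkRequests.add(new ArrayList<>(paths));
  }
}
```

   With this shape, callers can always hand over a list of paths, and only 
stores that support bulk removal need to override the default.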
   
   In follow-up PRs we can tackle updating the actions for expiring snapshots 
and removing orphan files. My thinking there is that we would have a separate 
batching mechanism within the action implementation: for an action, a user 
would specify a batch size (defaulting to 1). Tasks looks generic, so we could 
partition the given input list into batches and then pass in a function that 
accepts a list of strings to perform the batch deletion. This would only be 
done when the batch size is greater than 1. Any thoughts?
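   
   A minimal sketch of the batching idea, independent of Tasks. The class 
`BatchDeleteSketch` and the methods `partition` and `deleteAll` are 
hypothetical names for illustration, not existing Iceberg code; the two 
consumers stand in for whatever single and batch delete functions the action 
would be given.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper; a sketch of the proposal, not part of Iceberg.
final class BatchDeleteSketch {
  private BatchDeleteSketch() {}

  // Split items into consecutive batches of at most batchSize elements.
  static <T> List<List<T>> partition(List<T> items, int batchSize) {
    if (batchSize < 1) {
      throw new IllegalArgumentException("batchSize must be >= 1: " + batchSize);
    }
    List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < items.size(); start += batchSize) {
      int end = Math.min(start + batchSize, items.size());
      batches.add(new ArrayList<>(items.subList(start, end)));
    }
    return batches;
  }

  // Use the batch function only when batchSize > 1; otherwise keep the
  // existing one-file-at-a-time behavior.
  static void deleteAll(
      List<String> paths,
      int batchSize,
      Consumer<String> deleteOne,
      Consumer<List<String>> deleteBatch) {
    if (batchSize <= 1) {
      paths.forEach(deleteOne);
      return;
    }
    for (List<String> batch : partition(paths, batchSize)) {
      deleteBatch.accept(batch);
    }
  }
}
```

   Keeping the default batch size at 1 means the current per-file path is 
untouched unless a user opts in to batching.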
   
   @szehon-ho @dramaticlly


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


