dungdm93 commented on PR #5375:
URL: https://github.com/apache/iceberg/pull/5375#issuecomment-1202890150

   @szehon-ho I'm also thinking about **retry** and **service pool** mechanisms for bulkDelete.
   There are two options:
   * The first option is to let the caller split the delete files into chunks and pass each chunk to the bulkDelete API, the same way as a normal delete:
       ```java
        Iterable<List<String>> deleteFileChunked = ....
        Tasks.foreach(deleteFileChunked)
            .retry(3)
            .executeWith(service)
            .run(io::deleteFiles);
        ```
       The drawbacks of this approach are:
          * chunking is computed twice: once in the caller and once in the FileIO (as in the S3FileIO implementation)
          * if a request fails, there is no way to retry only the failed files
   * The second option is for the bulkDelete API to return a `TaskLike` object, letting the caller decide how it should run (number of retries, executor service, error handler, etc.):
       ```java
        io.deleteFiles(files) // returns a TaskLike object
         .retry(3)
         .executeWith(service)
         .onFailure(...)
         .execute()  // now the task will be executed
       ```
   
   The second option introduces a breaking change to the bulkDelete API. However, I still prefer it because the API is not used anywhere yet.
   WDYT?
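   To make the second option concrete, here is a minimal sketch of what a deferred `TaskLike`-style object could look like. All names here (`DeleteTask`, `retry`, `onFailure`, `execute`) are hypothetical, not part of the actual FileIO API, and the executor-service part is omitted for brevity; the point is that nothing runs until `execute()` is called, and each retry pass only re-attempts the files that failed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical sketch of a TaskLike object: execution is deferred until
// execute() is called, so the caller controls retries and failure handling.
class DeleteTask {
  private final List<String> files;
  private int maxRetries = 0;
  private Consumer<String> failureHandler = f -> { };

  DeleteTask(List<String> files) {
    this.files = new ArrayList<>(files);
  }

  DeleteTask retry(int times) {
    this.maxRetries = times;
    return this;
  }

  DeleteTask onFailure(Consumer<String> handler) {
    this.failureHandler = handler;
    return this;
  }

  // deleteFn returns true on success; each retry pass re-attempts only
  // the files that failed in the previous pass.
  void execute(Predicate<String> deleteFn) {
    List<String> pending = new ArrayList<>(files);
    for (int attempt = 0; attempt <= maxRetries && !pending.isEmpty(); attempt++) {
      List<String> failed = new ArrayList<>();
      for (String file : pending) {
        if (!deleteFn.test(file)) {
          failed.add(file);
        }
      }
      pending = failed;
    }
    // files still pending after all retries are reported as failures
    pending.forEach(failureHandler);
  }
}
```

   This addresses the drawback of the first option: because the task tracks which files failed, a retry does not re-delete files that already succeeded.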


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

