dungdm93 commented on PR #5375:
URL: https://github.com/apache/iceberg/pull/5375#issuecomment-1202890150
@szehon-ho I'm also thinking about **retry** and **service pool** mechanisms
for bulkDelete.
There are two options:
* The first option is to let the caller split the delete files into chunks and pass
each chunk to the bulkDelete API, the same way as a normal delete:
```java
Iterable<List<String>> deleteFileChunks = ....
Tasks.foreach(deleteFileChunks)
    .retry(3)
    .executeWith(service)
    .run(io::deleteFiles);
```
The drawbacks of this approach are:
  * chunking is computed twice: once in the caller and once again inside the
FileIO (as in the S3FileIO implementation)
  * if a request fails, there is no way to retry only the failed files
* The second option is to have the bulkDelete API return a `TaskLike` object, and let
the caller decide how it should run (number of retries, executor service, error
handler, etc...):
```java
io.deleteFiles(files) // returns a TaskLike object
.retry(3)
.executeWith(service)
.onFailure(...)
.execute() // now the task will be executed
```
The second option introduces a breaking change to the bulkDelete API. However, I still
prefer it because the API is not used anywhere yet.
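For illustration, a `TaskLike` along those lines might look like the sketch below. All names are hypothetical (not the actual Iceberg API), and the real bulk-delete call is stubbed out with a predicate so the retry behavior is visible; the key property is that only the files that actually failed get retried, which the chunk-level retry in the first option cannot do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical sketch of the TaskLike idea: configuration (retries, pool,
// failure handler) accumulates on a builder, and nothing runs until execute().
public class TaskLikeSketch {

  static class DeleteTask {
    private final List<String> files;
    private final Predicate<String> deleteFn; // stand-in for the real delete; true = success
    private int retries = 0;
    private ExecutorService service = null;   // kept for API shape; this sketch runs sequentially
    private Consumer<String> failureHandler = f -> { };

    DeleteTask(List<String> files, Predicate<String> deleteFn) {
      this.files = files;
      this.deleteFn = deleteFn;
    }

    DeleteTask retry(int n) { this.retries = n; return this; }
    DeleteTask executeWith(ExecutorService svc) { this.service = svc; return this; }
    DeleteTask onFailure(Consumer<String> handler) { this.failureHandler = handler; return this; }

    // Only the files that failed in the previous attempt are retried.
    List<String> execute() {
      List<String> pending = new ArrayList<>(files);
      for (int attempt = 0; attempt <= retries && !pending.isEmpty(); attempt++) {
        List<String> failures = new ArrayList<>();
        for (String file : pending) {
          if (!deleteFn.test(file)) {
            failures.add(file);
          }
        }
        pending = failures;
      }
      pending.forEach(failureHandler);
      return pending; // files that still failed after all retries
    }
  }
}
```

The important design point is that `execute()` is the only side-effecting call, so the caller owns the retry and error-handling policy instead of the FileIO implementation.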
WDYT?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]