slfan1989 opened a new issue, #658: URL: https://github.com/apache/iceberg-cpp/issues/658
## Background

When running `ExpireSnapshots`, iceberg-cpp may need to clean up files that are no longer referenced by expired snapshots. These files can include:

- data files
- delete files
- manifest files
- manifest list files
- statistics files

Currently, file deletion in iceberg-cpp is primarily based on single-file deletion through:

```
FileIO::DeleteFile(...)
```

When a large number of files need to be deleted, deleting them one by one can be inefficient, especially for object stores or remote filesystems where each delete request may involve non-trivial network latency.

Java Iceberg already has a similar abstraction:

```
SupportsBulkOperations#deleteFiles
```

This allows cleanup logic to use bulk deletion when supported by the underlying FileIO, and fall back to regular per-file deletion otherwise.

iceberg-cpp should consider adding a similar mechanism.
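
To make the proposal concrete, here is a minimal C++ sketch of "use bulk deletion when supported, otherwise fall back to per-file deletion". Everything in it is an assumption made for illustration: the `FileIO` class is a simplified stand-in rather than the real iceberg-cpp interface (whose `DeleteFile` signature and return type may differ), and `SupportsBulkOperations`, `DeleteFiles`, `DeleteAll`, `PerFileIO`, and `BulkIO` are hypothetical names. The capability check uses `dynamic_cast`, which is only one possible design.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for a FileIO interface (NOT the real iceberg-cpp API).
class FileIO {
 public:
  virtual ~FileIO() = default;
  // Deletes a single file; returns true on success.
  virtual bool DeleteFile(const std::string& location) = 0;
};

// Hypothetical optional capability, loosely modeled on Java Iceberg's
// SupportsBulkOperations#deleteFiles.
class SupportsBulkOperations {
 public:
  virtual ~SupportsBulkOperations() = default;
  // Deletes all given files; returns the number of files that failed.
  virtual std::size_t DeleteFiles(const std::vector<std::string>& locations) = 0;
};

// Hypothetical helper for cleanup logic: prefers bulk deletion when the
// FileIO implementation offers it, otherwise deletes files one by one.
// Returns the number of files that could not be deleted.
inline std::size_t DeleteAll(FileIO& io,
                             const std::vector<std::string>& locations) {
  if (auto* bulk = dynamic_cast<SupportsBulkOperations*>(&io)) {
    return bulk->DeleteFiles(locations);
  }
  std::size_t failures = 0;
  for (const auto& location : locations) {
    if (!io.DeleteFile(location)) {
      ++failures;
    }
  }
  return failures;
}

// Illustrative in-memory FileIO without bulk support; it only counts calls.
class PerFileIO : public FileIO {
 public:
  bool DeleteFile(const std::string&) override {
    ++single_calls;
    return true;
  }
  int single_calls = 0;
};

// Illustrative in-memory FileIO that also offers bulk deletion.
class BulkIO : public FileIO, public SupportsBulkOperations {
 public:
  bool DeleteFile(const std::string&) override {
    ++single_calls;
    return true;
  }
  std::size_t DeleteFiles(const std::vector<std::string>& locations) override {
    ++bulk_calls;
    deleted += locations.size();
    return 0;
  }
  int single_calls = 0;
  int bulk_calls = 0;
  std::size_t deleted = 0;
};
```

With this shape, cleanup code such as `ExpireSnapshots` would call `DeleteAll(io, files)` once and would not need to know whether the underlying storage batches requests. A real design would likely report errors through the project's own status type rather than a failure count, and could decide separately how to handle partial failures within a batch.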
