slfan1989 opened a new issue, #658:
URL: https://github.com/apache/iceberg-cpp/issues/658

   ## Background
   
   When running `ExpireSnapshots`, iceberg-cpp may need to clean up files that 
were referenced only by the expired snapshots. These files can include:
   
   - data files
   - delete files
   - manifest files
   - manifest list files
   - statistics files
   
   Currently, file deletion in iceberg-cpp is primarily based on single-file 
deletion through:
   
   ```
   FileIO::DeleteFile(...)
   ```
   
   When a large number of files need to be deleted, deleting them one by one 
can be inefficient, especially for object stores or remote filesystems where 
each delete request may involve non-trivial network latency.
   
   Java Iceberg already has an abstraction for bulk deletion:
   
   ```
   SupportsBulkOperations#deleteFiles
   ```
   
   This allows cleanup logic to use bulk deletion when supported by the 
underlying FileIO, and fall back to regular per-file deletion otherwise.
   
   iceberg-cpp should consider adding a similar mechanism.
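
As a rough illustration of the bulk-with-fallback idea, here is a minimal, self-contained sketch. Everything in it is an assumption made for discussion: the `FileIO` class is a simplified stand-in (not the real iceberg-cpp interface, whose `DeleteFile` signature and return type may differ), and `SupportsBulkDelete`, `DeleteFiles`, `DeleteAll`, `SimpleIO`, and `BulkIO` are hypothetical names. The cleanup helper checks for the optional bulk interface at runtime and otherwise deletes files one at a time.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for a FileIO-like interface; NOT the real iceberg-cpp class.
class FileIO {
 public:
  virtual ~FileIO() = default;
  // Returns true if the file existed and was deleted.
  virtual bool DeleteFile(const std::string& location) = 0;
};

// Hypothetical opt-in interface, loosely modeled on Java's SupportsBulkOperations.
class SupportsBulkDelete {
 public:
  virtual ~SupportsBulkDelete() = default;
  // Deletes the given files in one request; returns how many were deleted.
  virtual std::size_t DeleteFiles(const std::vector<std::string>& locations) = 0;
};

// Hypothetical cleanup helper: one bulk call when the FileIO supports it,
// regular per-file deletion otherwise.
inline std::size_t DeleteAll(FileIO& io, const std::vector<std::string>& locations) {
  if (auto* bulk = dynamic_cast<SupportsBulkDelete*>(&io)) {
    return bulk->DeleteFiles(locations);
  }
  std::size_t deleted = 0;
  for (const auto& location : locations) {
    if (io.DeleteFile(location)) ++deleted;
  }
  return deleted;
}

// In-memory FileIO without bulk support; counts delete requests.
class SimpleIO : public FileIO {
 public:
  explicit SimpleIO(std::set<std::string> files) : files_(std::move(files)) {}
  bool DeleteFile(const std::string& location) override {
    ++requests;
    return files_.erase(location) > 0;
  }
  int requests = 0;

 protected:
  std::set<std::string> files_;
};

// In-memory FileIO with bulk support; a whole batch costs one request.
class BulkIO : public SimpleIO, public SupportsBulkDelete {
 public:
  using SimpleIO::SimpleIO;
  std::size_t DeleteFiles(const std::vector<std::string>& locations) override {
    ++requests;
    std::size_t deleted = 0;
    for (const auto& location : locations) deleted += files_.erase(location);
    return deleted;
  }
};
```

With these stand-ins, calling `DeleteAll` on a `SimpleIO` issues one request per file, while the same call on a `BulkIO` issues a single request. A runtime check like `dynamic_cast` keeps the bulk capability optional, so existing `FileIO` implementations would not need to change; a real design might instead add a virtual method with a default per-file implementation, and would need to decide how partial failures are reported.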


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
