ankitkpandey opened a new issue #2342: URL: https://github.com/apache/iceberg/issues/2342
Hi, I'm trying to use Spark along with Iceberg to capture differential data, using Spark SQL's MERGE INTO command. But I see around 200 files, each roughly 1 MB in size. Is there a configuration option I can use to reduce the number of files?

Also, my current use case doesn't require time travel or old snapshots, so is there a way to automatically delete them while merging the new data? Maybe just keeping the last snapshot. I have looked extensively through the docs but could only find methods using the Table and Actions APIs.

Any help would be appreciated.
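For reference, a minimal sketch of the kind of MERGE INTO statement described above (the table names `db.target` and the `updates` source, and the `id` join key, are hypothetical placeholders, not from my actual job):

```sql
-- Hypothetical example: `db.target` is the Iceberg table being maintained,
-- `updates` is a view/table holding the differential rows, joined on `id`.
MERGE INTO db.target t
USING updates u
ON t.id = u.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
```

Each run of a statement like this produces a new snapshot and a set of data files, which is where the many ~1 MB files show up.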
