vamsikarnika commented on PR #621: URL: https://github.com/apache/incubator-xtable/pull/621#issuecomment-2656450069
> Thanks for the PR @vamsikarnika Could you please provide more details about the motivation behind this change and which table formats it impacts? At the moment, it seems to be specific to Iceberg, and some more context would be helpful.

This change only impacts the Iceberg table format.

Currently, when a Hudi table is synced to Iceberg, the sync takes too long when retention periods are high. This is because the Iceberg cleaner spends a long time processing all the manifest files to find the data files it needs to delete. In the end, though, we filter the data files out of the list of deletable files to make sure we don't delete any of them, since cleaning data files is the responsibility of Hudi (the source).

In this approach, we've added a custom Iceberg cleaner that skips the manifest-processing stage. It also lets us configure the size of the thread pool used to process and delete metadata files.
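To make the thread-pool part concrete, here is a minimal sketch of deleting a known list of expired metadata files with a configurable pool size. It is not the code from the PR and does not use the Iceberg API: the class name `MetadataFileCleaner`, the method `deleteMetadataFiles`, and the `threadPoolSize` parameter are hypothetical, and the sketch assumes the caller already has the metadata file paths without scanning manifests for data files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper for illustration only; not the cleaner added in the PR. */
final class MetadataFileCleaner {

  private MetadataFileCleaner() {}

  /**
   * Deletes the given metadata files using a fixed-size thread pool.
   * Only the paths passed in are touched, so data files owned by the
   * source table are never deleted here.
   *
   * @return the number of files that were actually removed
   */
  static int deleteMetadataFiles(List<Path> metadataFiles, int threadPoolSize)
      throws InterruptedException {
    if (threadPoolSize < 1) {
      throw new IllegalArgumentException("threadPoolSize must be at least 1");
    }
    ExecutorService pool = Executors.newFixedThreadPool(threadPoolSize);
    try {
      // Submit one delete task per file; each task reports whether a file was removed.
      List<Future<Boolean>> results = new ArrayList<>();
      for (Path file : metadataFiles) {
        results.add(pool.submit(() -> Files.deleteIfExists(file)));
      }
      int deleted = 0;
      for (Future<Boolean> result : results) {
        try {
          if (result.get()) {
            deleted++;
          }
        } catch (ExecutionException e) {
          // A failed delete is skipped so one bad file does not abort the cleanup.
        }
      }
      return deleted;
    } finally {
      pool.shutdown();
    }
  }
}
```

A caller would pass the expired metadata paths and a pool size taken from configuration, for example `MetadataFileCleaner.deleteMetadataFiles(paths, 8)`. A larger pool mainly helps when each delete is a slow remote call.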