vamsikarnika commented on PR #621:
URL: https://github.com/apache/incubator-xtable/pull/621#issuecomment-2656450069

   > Thanks for the PR, @vamsikarnika. Could you please provide more details 
about the motivation behind this change and which table formats it impacts? At 
the moment, it seems to be specific to Iceberg, and some more context would 
be helpful.
   
   This change only impacts the Iceberg table format. Currently, when a Hudi 
table is synced to Iceberg, the sync takes too long when retention periods are 
high, because the Iceberg cleaner spends a long time processing all the 
manifest files to find the data files it needs to delete. In the end, however, 
we filter the data files out of the list of deletable files to make sure we 
don't delete any of them (cleaning data files is the responsibility of Hudi, 
i.e. the source). This PR adds a custom Iceberg cleaner that skips the 
manifest-processing stage. It also lets us configure the size of the thread 
pool used to process and delete metadata files.
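
   To illustrate the idea, here is a minimal sketch of the deletion step only. 
It is not the code in this PR and does not use the Iceberg API: the class name 
`MetadataOnlyCleanerSketch`, the method `deleteMetadataFiles`, and the 
`deleteFn` callback are all hypothetical, and the list of expired metadata 
files is assumed to be known already, without scanning manifests for data 
files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical sketch: delete already-known metadata files with a
// configurable thread pool. Data files are never looked up or touched.
public class MetadataOnlyCleanerSketch {

  static int deleteMetadataFiles(
      List<String> expiredMetadataFiles, int threadPoolSize, Consumer<String> deleteFn)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(threadPoolSize);
    AtomicInteger deleted = new AtomicInteger();
    try {
      List<Callable<Void>> tasks = new ArrayList<>();
      for (String path : expiredMetadataFiles) {
        tasks.add(() -> {
          deleteFn.accept(path);      // assumed to remove one metadata file
          deleted.incrementAndGet();  // counted only if deleteFn did not throw
          return null;
        });
      }
      pool.invokeAll(tasks);          // blocks until every task has finished
    } finally {
      pool.shutdown();
    }
    return deleted.get();
  }

  public static void main(String[] args) throws InterruptedException {
    // In-memory stand-in for table storage (illustrative paths only).
    Set<String> storage = ConcurrentHashMap.newKeySet();
    storage.add("metadata/snap-1.avro");
    storage.add("metadata/manifest-1.avro");
    storage.add("data/file-1.parquet");

    List<String> expired = List.of("metadata/snap-1.avro", "metadata/manifest-1.avro");
    int count = deleteMetadataFiles(expired, 2, storage::remove);
    System.out.println("deleted " + count + " metadata files; remaining: " + storage);
  }
}
```

   The thread pool size is a plain parameter here; in a real sync it would come 
from configuration, and the source (Hudi) would remain responsible for 
cleaning the data files.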
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@xtable.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
