WTa-hash commented on issue #2229: URL: https://github.com/apache/hudi/issues/2229#issuecomment-722808651
> Just want to make sure if you understood compaction vs cleaning in Hudi. Why do you want to wait for 30 days before running compaction? Do you mean cleaning the old versions 30 days back?

Hello. I was just giving an example of one of my concerns with the number-of-commits compaction trigger. Here is my scenario:

1) A Spark structured stream queries Kinesis.
2) Spark processes a batch.
3) The batch contains data from tables X, Y, and Z. My foreachBatch logic groups these records by table and runs the Hudi write three times in a foreach-table loop, so Hudi processes each table sequentially. Hudi has INLINE_COMPACT_NUM_DELTA_COMMITS_PROP set to 10.
4) The next 9 batches contain data for tables X and Y, but none of them contain data for table Z.

Does this mean compaction will run for tables X and Y, compacting the data from batches 1 to 10, while table Z will not be compacted?
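The concern in the scenario above can be illustrated with a toy simulation. This is not Hudi code and not its actual implementation; the `simulate` helper and the per-table counters are hypothetical, and only sketch the assumption that each table tracks its own delta commits, so a table that stops receiving data stops accruing commits toward the compaction threshold.

```python
# Toy simulation of a per-table inline compaction trigger.
# NOT Hudi code -- just illustrates independent per-table counting.

NUM_DELTA_COMMITS = 10  # assumed value of INLINE_COMPACT_NUM_DELTA_COMMITS_PROP

def simulate(batches, trigger=NUM_DELTA_COMMITS):
    """batches: list of sets of table names present in each micro-batch.
    Returns {table: number of times inline compaction fired}."""
    pending = {}      # delta commits since last compaction, per table
    compactions = {}  # compaction count, per table
    for batch_tables in batches:
        for table in batch_tables:  # foreachBatch writes each table in turn
            pending[table] = pending.get(table, 0) + 1
            if pending[table] >= trigger:
                compactions[table] = compactions.get(table, 0) + 1
                pending[table] = 0  # counter resets after compaction
    return compactions

# Batch 1 has X, Y, Z; batches 2-10 have only X and Y.
batches = [{"X", "Y", "Z"}] + [{"X", "Y"}] * 9
print(simulate(batches))
```

Under this (assumed) model, X and Y each accumulate 10 delta commits and compact once, while Z sits at 1 delta commit indefinitely and is never compacted, which matches the worry in the question.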
