nsivabalan commented on issue #3739:
URL: https://github.com/apache/hudi/issues/3739#issuecomment-932199950


   Awesome, good to know. So here is the thing. 
   Hudi has an active timeline and an archived timeline. Archival kicks in 
on every commit and moves some of the older commits from the active timeline 
to the archived one. This keeps the number of active commits (the contents of 
the .hoodie folder) within bounds. 
   All operations within Hudi (inserts, upserts, compaction, cleaning etc.) 
operate only on commits in the active timeline. 
   Since we have very aggressive archival configs, archival always keeps the 
number of commits in the active timeline low. So when the cleaner checks for 
commits eligible to be cleaned up, it never finds more than the configured 
number of commits to retain, and nothing gets cleaned. If we raise the max 
commits at which archival kicks in to, let's say, 20, archival will not kick 
in until there are 20 commits. The cleaner should then be able to find more 
commits than the configured value, and cleaning will trigger. 
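   As a rough sketch of what that could look like in the write options: the 
config keys below are the Hudi archival and cleaner settings, but the values 
(other than the max of 20 mentioned above) are only illustrative, and I am 
assuming the options are passed as a plain dict from PySpark. 

```python
# Illustrative values only; tune them for your own table.
hudi_options = {
    # Cleaner: number of recent commits whose file versions are retained.
    "hoodie.cleaner.commits.retained": "10",
    # Archival: once the active timeline grows past keep.max.commits,
    # older commits are archived until about keep.min.commits remain.
    "hoodie.keep.min.commits": "15",
    "hoodie.keep.max.commits": "20",
}

# Assumed usage with the Spark datasource writer (df and base_path are yours):
# df.write.format("hudi").options(**hudi_options).mode("append").save(base_path)
```

   The point is simply that the archival bounds sit above the cleaner's 
retained count, so the cleaner still sees enough commits in the active 
timeline. 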
   Maybe we can add a warning, or even throw an exception, if the archival 
configs are aggressive compared to the cleaner's, because otherwise cleaning 
silently becomes moot. 
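   To illustrate the kind of check I have in mind, here is a minimal sketch. 
`check_archival_vs_cleaner` is a hypothetical helper, not an existing Hudi 
API, and the rule it applies (archival's min commits must be greater than the 
cleaner's retained commits) is my assumption about what "too aggressive" 
should mean. 

```python
def check_archival_vs_cleaner(options):
    """Hypothetical guard: fail fast if archival can starve the cleaner."""
    retained = int(options["hoodie.cleaner.commits.retained"])
    keep_min = int(options["hoodie.keep.min.commits"])
    # Assumed rule: if archival trims the active timeline down to no more
    # commits than the cleaner retains, the cleaner never finds anything
    # eligible to clean.
    if keep_min <= retained:
        raise ValueError(
            "hoodie.keep.min.commits (%d) should be greater than "
            "hoodie.cleaner.commits.retained (%d), otherwise cleaning "
            "may never trigger" % (keep_min, retained)
        )


# Passes silently: archival keeps more commits than the cleaner retains.
check_archival_vs_cleaner({
    "hoodie.cleaner.commits.retained": "10",
    "hoodie.keep.min.commits": "15",
})
```

   A warning instead of an exception would work the same way, just logging 
the message rather than raising. 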
   
   I have filed a [jira](https://issues.apache.org/jira/browse/HUDI-2511) in 
this regard and will follow up.
   
   Let me know if you need any more details. If not, can we close this issue 
out? 
   
   
   



