hudi-bot opened a new issue, #16014:
URL: https://github.com/apache/hudi/issues/16014

   In CleanPlanner, the KEEP_LATEST_BY_HOURS policy sets earliestCommitToRetain 
by comparing commit timestamps directly. This introduces a bug when commits 
complete out of order, that is, when a commit with a lower timestamp completes 
much later than commits with higher timestamps.
   
   This policy's implementation needs to be revisited.
   
   It should store the timestamp up to which it has cleaned; call this t1. The 
next cleaner instant should then consider all partitions and files modified 
between t1 and (current time - x hours), and remove whichever files are no 
longer valid.
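
   A minimal sketch of the idea, not Hudi code: `Commit`, `partitionsInWindow`, 
and the millisecond timestamps are hypothetical names and units chosen for 
illustration. It also assumes that "modified from t1" is judged by a commit's 
completion time, which the proposal does not spell out. The sketch contrasts 
selecting by requested timestamp with selecting by completion time for a 
commit that finishes late.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class HourBasedCleanSketch {

    /** Hypothetical commit record (not a Hudi class); times are epoch millis. */
    static final class Commit {
        final long requestedTime;   // timestamp assigned when the commit started
        final long completionTime;  // time at which the commit actually finished
        final Set<String> partitions;

        Commit(long requestedTime, long completionTime, Set<String> partitions) {
            this.requestedTime = requestedTime;
            this.completionTime = completionTime;
            this.partitions = partitions;
        }
    }

    /**
     * Returns partitions touched by commits whose time falls in
     * (lastCleanedUpTo, now - retention]. The time used is the completion
     * time when useCompletionTime is true, else the requested timestamp.
     */
    static Set<String> partitionsInWindow(List<Commit> commits, long lastCleanedUpTo,
                                          long nowMillis, long retentionMillis,
                                          boolean useCompletionTime) {
        long upperBound = nowMillis - retentionMillis;
        Set<String> result = new TreeSet<>();
        for (Commit c : commits) {
            long t = useCompletionTime ? c.completionTime : c.requestedTime;
            if (t > lastCleanedUpTo && t <= upperBound) {
                result.addAll(c.partitions);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Commit A starts first (100) but completes last (900); B starts at 200, completes at 300.
        List<Commit> commits = List.of(
            new Commit(100L, 900L, Set.of("p1")),
            new Commit(200L, 300L, Set.of("p2")));
        long t1 = 400L; // hypothetical point up to which the previous clean ran
        System.out.println(partitionsInWindow(commits, t1, 1500L, 500L, false));
        System.out.println(partitionsInWindow(commits, t1, 1500L, 500L, true));
    }
}
```

   With these inputs, the timestamp-based selection skips commit A because its 
requested time is below t1, while the completion-time selection still picks up 
partition p1.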
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-6352
   - Type: Bug

