Ytimetravel opened a new issue, #11647:
URL: https://github.com/apache/hudi/issues/11647

   **Describe the problem you faced**
   
   Dear community, when using a COW table, I found that the clean operation 
can trigger an OOM error in the driver. I believe this is because the COW table 
rarely updates data, so there are usually no files that need to be cleaned. 
However, the clean operation is still invoked on every write, and once the 
number of files it has to list reaches a certain threshold, it may cause an OOM. 
   
   For example:
   instant1  instant2  instant3  instant4 (scans 1, 2, 3, 4 but cleans nothing)
   instant1  instant2  instant3  instant4  instant5 (scans 1, 2, 3, 4, 5 but cleans nothing)
   
   Would it be possible to write an empty clean instant as a marker, so that 
later clean runs do not rescan the same instants every time? Or is there a 
better approach? Looking forward to your reply~ 
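   
   To make the idea concrete, here is a minimal toy sketch in Python. It is not Hudi code: `plan_clean`, `simulate`, and the `marker` value are hypothetical names used only for illustration, and the model assumes an insert-only table where nothing ever becomes eligible for cleaning.

```python
from typing import List, Optional, Tuple


def plan_clean(timeline: List[int], last_marker: Optional[int]) -> Tuple[List[int], Optional[int]]:
    """Return (instants scanned by this clean run, new marker).

    Toy model only: `timeline` is a sorted list of completed instant ids and
    `last_marker` is the newest instant covered by a previous (possibly empty)
    clean instant, or None if no clean instant was ever written.
    """
    if last_marker is None:
        # No marker exists, so every instant has to be rescanned.
        scanned = list(timeline)
    else:
        # A marker exists, so only instants newer than it are scanned.
        scanned = [i for i in timeline if i > last_marker]
    new_marker = timeline[-1] if timeline else last_marker
    return scanned, new_marker


def simulate(num_writes: int, write_empty_marker: bool) -> List[int]:
    """Number of instants scanned by the clean run after each write."""
    timeline: List[int] = []
    marker: Optional[int] = None
    scanned_counts: List[int] = []
    for instant in range(1, num_writes + 1):
        timeline.append(instant)
        scanned, new_marker = plan_clean(timeline, marker)
        scanned_counts.append(len(scanned))
        # Nothing is cleaned in this model, so the marker only advances
        # if empty clean instants are allowed (assumption of the proposal).
        if write_empty_marker:
            marker = new_marker
    return scanned_counts


print(simulate(5, write_empty_marker=False))
print(simulate(5, write_empty_marker=True))
```

   In this model the scan grows with every write when no marker is written, and stays constant when an empty clean instant records how far the previous run got.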
   
   **Environment Description**
   
   * Hudi version: 0.14.0

