nsivabalan commented on issue #7910:
URL: https://github.com/apache/hudi/issues/7910#issuecomment-1454046187

   Is it a COW or MOR table? 
   COW:
   If you look at S3 directly, you might find older files too. After rewriting 
a base file to a newer version, Hudi does not delete the older file 
immediately; the cleaner will take care of it later. But your queries/readers 
will only read the latest version of the data file. 
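
To make the cleaner part concrete, here is a minimal sketch of the write options that control how long older file versions stick around. The config keys are Hudi cleaner configs; the helper name `cow_cleaner_options` and the value of 10 retained commits are only illustrative assumptions, so check them against the docs for your Hudi version.

```python
def cow_cleaner_options(commits_retained=10):
    """Build Hudi write options that govern cleanup of older file versions.

    Hypothetical helper for illustration; the retained-commit count is an
    assumed value, not a recommendation.
    """
    return {
        # Run the cleaner automatically after each commit.
        "hoodie.clean.automatic": "true",
        # Keep the file versions needed by the last N commits.
        "hoodie.cleaner.policy": "KEEP_LATEST_COMMITS",
        "hoodie.cleaner.commits.retained": str(commits_retained),
    }


opts = cow_cleaner_options(commits_retained=10)
# Assumed usage with an existing Spark DataFrame `df` and a table path:
# df.write.format("hudi").options(**opts).mode("append").save(base_path)
```

Until the cleaner removes them, the older base files stay visible on S3 even though readers ignore them.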
   
   But with a MOR table, it's more nuanced. 
   By default, only one file group (w/o any log files) is considered for small 
file bin packing. 
   If you want more file groups to be picked up, you can try tweaking 
https://hudi.apache.org/docs/configurations/#hoodiemergesmallfilegroupcandidateslimit
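
As a rough sketch of what that tweak could look like in a PySpark upsert: the key `hoodie.merge.small.file.group.candidates.limit` is the config the link above points to. The helper name `mor_upsert_options`, the table name, and the limit of 4 are assumptions for illustration only.

```python
def mor_upsert_options(table_name, candidates_limit=4):
    """Build Hudi write options for a MOR upsert that lets more small
    file groups be considered for bin packing.

    Hypothetical helper; the limit of 4 is an assumed example value.
    """
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.table.type": "MERGE_ON_READ",
        "hoodie.datasource.write.operation": "upsert",
        # Number of small file groups considered as candidates for new
        # records during upsert (MOR tables only).
        "hoodie.merge.small.file.group.candidates.limit": str(candidates_limit),
    }


opts = mor_upsert_options("my_table", candidates_limit=4)
# Assumed usage with an existing Spark DataFrame `df` and a table path:
# df.write.format("hudi").options(**opts).mode("append").save(base_path)
```

A real write also needs the usual record key and precombine settings, which are left out here.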
 
-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
