parisni commented on issue #6373:
URL: https://github.com/apache/hudi/issues/6373#issuecomment-1218559061

   > But as you can imagine, this is going to result in a huge number of file groups in general and puts a lot of pressure on the system.
   
   Do you mean pressure during cleaning, pressure during reading, or pressure in general?
   
   Also, insert produces the same number of file groups, since mine is an append-only table with no late-arriving data for the past.
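
   For reference, this is roughly the kind of write in question (a minimal sketch, not the actual job; the table name, fields, and paths are placeholders):

   ```scala
   // Minimal sketch of an append-only Hudi write. With "insert" (as with
   // "bulk_insert"), every commit lands in new file groups when no existing
   // records are updated. Table name, fields, and paths are placeholders.
   import org.apache.spark.sql.{SaveMode, SparkSession}

   val spark = SparkSession.builder()
     .appName("hudi-append-sketch")
     .master("local[*]")
     .getOrCreate()
   import spark.implicits._

   // Placeholder batch; in practice this is the incoming data.
   val df = Seq(("id-1", "2022-08-17T00:00:00Z", "payload"))
     .toDF("uuid", "ts", "data")

   df.write.format("hudi")
     .option("hoodie.table.name", "events")                     // placeholder
     .option("hoodie.datasource.write.recordkey.field", "uuid")
     .option("hoodie.datasource.write.precombine.field", "ts")
     .option("hoodie.datasource.write.operation", "insert")     // vs. "bulk_insert"
     .mode(SaveMode.Append)
     .save("/tmp/hudi/events")                                  // placeholder path
   ```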
   
   Anyway, cleaning is much faster without the metadata table, so it would help to be able to configure cleaning to work directly on disk (filesystem listing) only.
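
   Concretely, something along these lines (a sketch continuing the snippet above; `hoodie.metadata.enable=false` makes the writer fall back to direct file listing, and the retention values are illustrative only, not recommendations):

   ```scala
   // Sketch: cleaning driven by direct storage listing instead of the
   // metadata table. Same placeholder table as above.
   df.write.format("hudi")
     .option("hoodie.table.name", "events")
     .option("hoodie.datasource.write.recordkey.field", "uuid")
     .option("hoodie.datasource.write.precombine.field", "ts")
     .option("hoodie.datasource.write.operation", "insert")
     .option("hoodie.metadata.enable", "false")                 // list files on storage directly
     .option("hoodie.clean.automatic", "true")                  // clean inline after each commit
     .option("hoodie.cleaner.policy", "KEEP_LATEST_COMMITS")
     .option("hoodie.cleaner.commits.retained", "10")           // illustrative retention
     .mode(SaveMode.Append)
     .save("/tmp/hudi/events")
   ```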
   
   On August 17, 2022, 9:59:31 PM UTC, Sivabalan Narayanan wrote (https://github.com/apache/hudi/issues/6373#issuecomment-1218528854):
   > Regarding bulk_insert, I understand cleaning is not going to be of any use, because every new commit goes into new file groups. Hence there won't be any file groups with multiple file slices that might be eligible for cleaning. But as you can imagine, this is going to result in a huge number of file groups in general and puts a lot of pressure on the system.

