koldic commented on issue #2620:
URL: https://github.com/apache/hudi/issues/2620#issuecomment-1333488726

   Hi, I have the same problem with slow stages. At first the job runs well, 
but as more and more small files are inserted it slows down, and the `Getting 
Small files` stage together with the `Doing partition and writing data` stage 
can take as long as an hour to finish. 
   I tried lowering `hoodie.parquet.small.file.limit` to 1MB, the smallest 
value I could set, to limit the small files it collects, but that did not help. 
Setting it to 0 did help, since that value disables small-file collection and 
the stage that performs it. Is there any way to turn this setting back on 
without slowing down all jobs, or should I just use offline compaction instead?
   I also use a simple index, as the key is random.
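
   For reference, here is a minimal sketch of the two settings compared above, 
assuming a PySpark writer. The helper `hudi_write_options` and the table name 
`my_table` are hypothetical and only illustrate the option keys; the write call 
is left as a comment because it needs a real DataFrame and base path.

```python
ONE_MB = 1024 * 1024


def hudi_write_options(table_name, small_file_limit_bytes):
    """Build Hudi write options as a plain dict (hypothetical helper)."""
    if small_file_limit_bytes < 0:
        raise ValueError("small file limit must be >= 0")
    return {
        "hoodie.table.name": table_name,
        # Simple index, as used in the setup described above.
        "hoodie.index.type": "SIMPLE",
        # Setting this to 0 disables small-file collection.
        "hoodie.parquet.small.file.limit": str(small_file_limit_bytes),
    }


# 1MB limit: small-file collection still runs (the slow case reported above).
slow_options = hudi_write_options("my_table", ONE_MB)

# Limit of 0: small-file collection is disabled (the fast case reported above).
fast_options = hudi_write_options("my_table", 0)

# Assumed usage with an existing DataFrame `df` and path `base_path`:
# df.write.format("hudi").options(**fast_options).mode("append").save(base_path)
```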
   


