ad1happy2go commented on issue #12116:
URL: https://github.com/apache/hudi/issues/12116#issuecomment-2447201837

   @dataproblems 
   For Experiment 2, you can try increasing the executor memory overhead. You 
can also check the GC time under Stages to see whether that is a problem. I see 
you are already tuning your GC as described in this doc: 
https://hudi.apache.org/docs/tuning-guide/
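
   As a rough sketch of raising the memory overhead, the snippet below renders Spark settings as `spark-submit` arguments. `spark.executor.memory` and `spark.executor.memoryOverhead` are standard Spark properties, but the `8g`/`2g` values and the `to_submit_args` helper are assumptions for illustration only, not values taken from your job.

```python
# Hypothetical starting values; tune them for your own cluster and workload.
spark_conf = {
    "spark.executor.memory": "8g",
    "spark.executor.memoryOverhead": "2g",  # extra non-heap memory per executor
}


def to_submit_args(conf):
    """Render a conf dict as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Print the arguments so they can be pasted into a spark-submit command line.
print(" ".join(to_submit_args(spark_conf)))
```

   If the executors are being killed for exceeding memory limits, increasing the overhead in steps and re-checking the GC time per stage is probably the simplest way to see whether it helps.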
   
   For Experiment 3, I can clearly see that the problem is with parallelism. 
It is creating only 100 tasks, and they have been running for 1.6h. Can you 
try to increase the parallelism in this case? To do this, you have to increase 
the repartition factor used on the dataset, which is 100 in your case.
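
   One possible way to pick a larger repartition factor is to size it from the input volume. This is only a sketch: the `suggest_partitions` helper, the 128 MiB target per partition, and the example input size are all assumptions, not numbers from your experiment.

```python
import math


def suggest_partitions(total_bytes, target_partition_bytes=128 * 1024**2, current=100):
    """Return a partition count that keeps each partition near the target size.

    Never returns fewer partitions than the current setting.
    """
    needed = math.ceil(total_bytes / target_partition_bytes)
    return max(current, needed)


# Hypothetical input size of 100 GiB.
n = suggest_partitions(100 * 1024**3)
print(n)

# With a PySpark DataFrame named df (hypothetical), the value would be used as:
#   df = df.repartition(n)
```

   More partitions means more, smaller tasks, so the work can spread across all available executor cores instead of being capped at 100 long-running tasks.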
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
