ad1happy2go commented on issue #12116: URL: https://github.com/apache/hudi/issues/12116#issuecomment-2447201837
@dataproblems For Experiment 2, you can try increasing the executor memory overhead. You can also check the GC time under the stages to see whether that is a problem. I see you are already applying the GC tuning mentioned in this doc - https://hudi.apache.org/docs/tuning-guide/

For Experiment 3, I can clearly see that the problem is with parallelism. It is creating just 100 tasks, and they have been running for 1.6h. Can you try increasing the parallelism in this case? To do this, you have to increase the repartition factor you apply to the dataset, which is 100 in your case.
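As a rough illustration of the Experiment 3 suggestion, here is a minimal sketch of one way to pick a larger repartition factor. The helper name `suggest_repartition_factor`, the 128 MiB per-task target, and the 1 TiB input size are all assumptions made for illustration; they are not taken from this issue or from Hudi. The commented lines at the end show where the result would plug in, using Spark's `DataFrame.repartition` and the `spark.executor.memoryOverhead` setting, with a placeholder value.

```python
import math

MIB = 1024 ** 2


def suggest_repartition_factor(total_input_bytes,
                               target_partition_bytes=128 * MIB,
                               current_factor=100):
    """Hypothetical heuristic: size the partition count so that each task
    handles roughly target_partition_bytes, never going below the factor
    already in use (100 in this issue)."""
    if total_input_bytes < 0 or target_partition_bytes <= 0:
        raise ValueError("sizes must be non-negative, target must be positive")
    needed = math.ceil(total_input_bytes / target_partition_bytes)
    return max(current_factor, needed)


# Assumed input size of 1 TiB, purely for illustration.
factor = suggest_repartition_factor(1024 ** 4)
print(factor)

# Where this would be applied in a PySpark job (not executed here):
#   df = df.repartition(factor)
# For Experiment 2, the overhead is set at submit time; the value is a placeholder:
#   --conf spark.executor.memoryOverhead=4g
```

The right target size depends on your record size and executor memory, so treat the number as a starting point and check the task durations and GC time in the Spark UI after each change.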
