garyli1019 commented on issue #800: Performance tuning
URL: https://github.com/apache/incubator-hudi/issues/800#issuecomment-514332156

Sure, I can try that. The delta data was definitely very dirty (a lot of incoming old data requires rewriting existing parquet files). The task duration seems to increase exponentially with the shuffle read size. Also, this job does not release executors when their tasks finish. For example, I gave this job 100 executors; two tasks ran for 20 hours while the others finished in minutes, so the job held all 100 executors for 20 hours. Is it possible to improve this?
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
