HyukjinKwon commented on issue #28168: [SPARK-31395][CORE]reverse preferred location to make schedule more even
URL: https://github.com/apache/spark/pull/28168#issuecomment-611519035

What cluster mode do you use? If `xxx.93 (driver)` is the driver, and the problem is that the data is copied onto that node, you should either separate the driver from the HDFS cluster or use YARN cluster mode so that the load is distributed evenly in production.

What I am asking is how reversing the hosts can solve the problem. The last node, xxx.102 (an executor), can be a driver too.
