squito commented on issue #23951: [SPARK-13704][CORE][YARN] Re-implement RackResolver to reduce resolving time
URL: https://github.com/apache/spark/pull/23951#issuecomment-474950599

btw I happened to encounter this issue on a ~1200 node cluster, where this exact problem was causing a 3 minute slowdown (6 minutes in client mode because of the repeated rack resolution). I played around with the script a bit, using different numbers of arguments. I found that with the changes here, it would go down to about 3 seconds (6 seconds in client mode). Furthermore, the script spent almost the entire time in initialization, and took virtually the same amount of time whether it was given 1 argument or 1200. Adjusting the number of arguments to the script would bring the runtime down to ~0.2 seconds.

So your change would bring a huge improvement, and all the other stuff I was commenting about w/ the extra resolution in AMRMClientImpl etc. isn't really so important, and I'd just ignore it for now.
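
For a rough sense of why batching matters so much here, the numbers above can be sketched as a back-of-envelope model. This is only an illustration, not code from the PR: the function name, the assumption that each script invocation costs a fixed ~0.2 seconds regardless of argument count, and the batch size of 100 hosts per call are all assumptions picked to match the order of magnitude reported above.

```python
import math


def estimated_resolution_seconds(num_hosts, hosts_per_call, startup_seconds):
    """Estimate total script time when each call is dominated by a fixed startup cost.

    Hypothetical model: the per-argument cost is treated as zero, since the
    script reportedly took the same time for 1 argument as for 1200.
    """
    calls = math.ceil(num_hosts / hosts_per_call)
    return calls * startup_seconds


HOSTS = 1200      # cluster size mentioned in the comment
STARTUP = 0.2     # assumed fixed cost per script invocation, in seconds

# One host per invocation: on the order of minutes.
one_by_one = estimated_resolution_seconds(HOSTS, 1, STARTUP)

# Assumed batch size of 100 hosts per invocation: on the order of seconds.
batched = estimated_resolution_seconds(HOSTS, 100, STARTUP)

# All hosts in a single invocation: a single startup cost.
single_call = estimated_resolution_seconds(HOSTS, HOSTS, STARTUP)

print(f"one host per call: {one_by_one:.1f}s")
print(f"100 hosts per call: {batched:.1f}s")
print(f"all hosts in one call: {single_call:.1f}s")
```

Under these assumptions the total time scales with the number of invocations rather than the number of hosts, which is consistent with the minutes-to-seconds drop described above.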
