tgravescs commented on pull request #35858: URL: https://github.com/apache/spark/pull/35858#issuecomment-1072423356
> all the tasks in the active stage may finish off before a new executor can get allocated

This means the application master has to know task times to be able to judge this; if it judges wrong, you just wasted time waiting on allocation as well. It has to know the time to get a new node and the time for other applications to release containers. Yes, I'm sure there are cases where it makes sense, but there are a lot of factors that come into play.

> and Yarn needs to request new nodes

Are you running in some environment where the YARN cluster grows dynamically? Normally YARN doesn't get new nodes; you have a set of nodes and it allocates containers on them. I don't understand why starting a container on one node would take much longer than on another, other than downloading artifacts, the majority of which should be in the distributed cache, so the overhead should be small.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
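For context, the timing trade-off being discussed is governed by Spark's dynamic-allocation settings, which control how long the scheduler waits on a task backlog before requesting more executors from YARN. A minimal sketch follows; the property names are from Spark's documented configuration, while the values and `app.jar` are illustrative placeholders, not a recommendation from this thread:

```shell
# Sketch: enable dynamic allocation on YARN and tune how quickly
# Spark asks the ResourceManager for additional executor containers.
spark-submit \
  --master yarn \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.schedulerBacklogTimeout=1s \
  --conf spark.dynamicAllocation.sustainedSchedulerBacklogTimeout=1s \
  --conf spark.dynamicAllocation.executorIdleTimeout=60s \
  app.jar
```

If the backlog timeouts are short relative to how long YARN takes to hand back a container, short stages can finish before the new executor is ever used, which is the scenario the comment above is weighing.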
