tgravescs commented on pull request #35858:
URL: https://github.com/apache/spark/pull/35858#issuecomment-1072423356


   
   
   > all the tasks in the active stage may finish off before a new executor can 
get allocated
   
   This means the application master has to know task times to be able to judge
this; if it judges wrong, you just wasted time waiting to allocate as well. It
also has to know the time to get a new node and the time for other applications
to release containers. Yes, I'm sure there are cases where it makes sense, but
there are a lot of factors that come into play.
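   The tradeoff being described can be sketched as a simple heuristic. This is
purely illustrative: the function name and its inputs are assumptions for the
sake of the example, not anything Spark's allocation code actually exposes, and
the point of the comment is precisely that these estimates are hard to get
right.

   ```python
   # Hypothetical sketch of the judgment described above: requesting a new
   # executor only pays off if the remaining work in the active stage is
   # expected to outlast the total time needed to obtain the container.
   # All names and inputs here are illustrative assumptions.

   def worth_requesting_executor(est_remaining_task_secs: float,
                                 est_node_acquire_secs: float,
                                 est_container_release_secs: float) -> bool:
       """Return True if requesting a new executor is expected to pay off."""
       est_allocation_secs = est_node_acquire_secs + est_container_release_secs
       return est_remaining_task_secs > est_allocation_secs

   # If the estimates are wrong, the time spent waiting for allocation is
   # wasted: the stage's tasks may all finish before the executor arrives.
   ```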
   
   > and Yarn needs to request new nodes
   
   Are you running in some environment where the YARN cluster grows
dynamically? Normally YARN doesn't get new nodes; you have a set of nodes and
it allocates containers on them. I don't understand why starting a container on
one node would take much longer than on another, other than downloading
artifacts, the majority of which should be in the distributed cache, so the
overhead should be small.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

