cloud-fan commented on issue #26696: [WIP][SPARK-18886][CORE] Only reset scheduling delay timer if allocated slots are fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-561039532

This problem needs sufficient discussion.

AFAIK, the issue with delay scheduling is this: there is one timer per task set manager, and the timer is reset as soon as any task from that task set manager is scheduled on a preferred location. If the task duration is shorter than the locality wait time (3 seconds by default), a stage may keep waiting for locality and never use the other available nodes in the cluster.

A simple solution is to never reset the timer: once a stage has waited long enough for locality, it should stop waiting for locality. However, this may hurt performance if the last task is scheduled on a non-preferred location, a preferred location becomes available right after that task is scheduled, and locality would have brought a 50x speedup.

I don't have a good idea right now.

cc @JoshRosen @tgravescs @vanzin @jiangxb1987
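
To make the trade-off above concrete, here is a toy simulation of the two timer policies. It is only a sketch under strong assumptions: one task set, a single preferred slot, a fixed number of non-preferred slots, and identical task durations. It is not Spark's scheduler code, and `simulate` and all of its parameters are hypothetical names chosen for illustration. Times are in milliseconds.

```python
def simulate(num_tasks, task_ms, wait_ms, other_slots,
             reset_on_local_launch, non_local_slowdown=1):
    """Toy model of delay scheduling; returns (makespan_ms, non_local_tasks).

    Assumption: one preferred slot, `other_slots` non-preferred slots,
    and a single locality-wait timer for the whole task set.
    """
    now = 0
    timer_start = 0                    # when the locality timer last (re)started
    pending = num_tasks
    local_free_at = 0                  # time the preferred slot becomes free
    other_free_at = [0] * other_slots  # times the non-preferred slots become free
    non_local = 0
    makespan = 0

    while pending > 0:
        # Always take the preferred slot when it is free.
        if local_free_at <= now:
            local_free_at = now + task_ms
            makespan = max(makespan, local_free_at)
            pending -= 1
            if reset_on_local_launch:
                timer_start = now      # the behavior described in the comment
            continue

        # Fall back to non-preferred slots only once the timer has expired.
        if now - timer_start >= wait_ms:
            for i in range(other_slots):
                if pending == 0:
                    break
                if other_free_at[i] <= now:
                    other_free_at[i] = now + task_ms * non_local_slowdown
                    makespan = max(makespan, other_free_at[i])
                    pending -= 1
                    non_local += 1
        if pending == 0:
            break

        # Jump to the next event: a slot freeing up or the timer expiring.
        events = [local_free_at, timer_start + wait_ms] + other_free_at
        now = min(t for t in events if t > now)

    return makespan, non_local


# 10 one-second tasks, 3-second locality wait, 4 idle non-preferred slots.
print(simulate(10, 1000, 3000, 4, reset_on_local_launch=True))    # (10000, 0)
print(simulate(10, 1000, 3000, 4, reset_on_local_launch=False))   # (5000, 5)
# Same setup, but non-local tasks are assumed to run 50x slower.
print(simulate(10, 1000, 3000, 4, reset_on_local_launch=False,
               non_local_slowdown=50))                            # (53000, 4)
```

In this model, resetting the timer on every local launch runs all ten tasks serially on the preferred slot while four slots sit idle. Never resetting halves the makespan when locality does not matter, but is far worse when non-local tasks are 50x slower, which is the downside raised above.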
