tgravescs commented on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-562653311

Looking at the fair scheduler side of this brings me back to the real question: what is the definition of a task wait? If you look at Kay's example:

> This relates to your idea because of the following situation: suppose you have a cluster with 10 machines, the job has locality preferences for 5 of them (with ids 1, 2, 3, 4, 5), and fairness dictates that the job can only use 3 slots at a time (e.g., it's sharing equally with 2 other jobs). Suppose that for a long time, the job has been running tasks on slots 1, 2, and 3 (so local slots). At this point, the times for machines 6, 7, 8, 9, and 10 will have expired, because the job has been running for a while. But if the job is now offered a slot on one of those non-local machines (e.g., 6), the job hasn't been waiting long for non-local resources: until this point, it's been running it's full share of 3 slots at a time, and it's been doing so on machines that satisfy locality preferences. So, we shouldn't accept that slot on machine 6 – we should wait a bit to see if we can get a slot on 1, 2, 3, 4, or 5.

She is essentially saying that she wants the task to wait a bit when a slot becomes available and a task could be scheduled on it. This mostly comes into play when you have multiple tasksets of jobs being scheduled; if you only have a single job using FIFO with one taskset, it really doesn't matter.

In the case Kay mentions, if the next slot offer arrives after the delay has already elapsed and the task set has fallen all the way back to ANY locality, it would schedule on that slot immediately and wouldn't delay at all. If you go strictly by the definition of how long a task has been waiting, that would be fine, but if the wait is measured from when the task could have been scheduled, that is too aggressive.
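
To make the two definitions of "task wait" concrete, here is a minimal sketch in Python. It is not Spark's scheduler code: `TaskSetState`, `accept_by_task_age`, and `accept_by_unused_share` are hypothetical names invented for this illustration, and the 3-second wait and the timestamps are assumed values chosen to mirror Kay's example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskSetState:
    """Hypothetical, simplified view of one task set (not a Spark class)."""
    locality_wait: float                 # seconds to hold out for a local slot
    oldest_pending_since: float          # when the oldest pending task was submitted
    below_share_since: Optional[float]   # when the task set last dropped below its
                                         # fair share; None while the share is full


def accept_by_task_age(ts: TaskSetState, now: float) -> bool:
    """Definition 1: wait = how long a task has been pending."""
    return now - ts.oldest_pending_since >= ts.locality_wait


def accept_by_unused_share(ts: TaskSetState, now: float) -> bool:
    """Definition 2: wait = how long the task set could have run a task but did not."""
    if ts.below_share_since is None:
        # The full share is in use, so nothing has been waiting for a slot.
        return False
    return now - ts.below_share_since >= ts.locality_wait


# Assumed timeline: tasks pending since t=0, full share busy on local
# machines until a task finishes at t=60, non-local offer arrives at t=60.5.
state = TaskSetState(locality_wait=3.0,
                     oldest_pending_since=0.0,
                     below_share_since=60.0)
```

With these assumed values, `accept_by_task_age(state, 60.5)` returns `True`, so the offer on machine 6 is taken immediately, while `accept_by_unused_share(state, 60.5)` returns `False` and only becomes `True` once the free share has gone unused for the full 3 seconds (for example at `now = 63.5`).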
