[GitHub] [spark] bmarcott edited a comment on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
bmarcott edited a comment on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-571476226

@tgravescs @cloud-fan Thanks for taking a look!
[GitHub] [spark] bmarcott edited a comment on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
bmarcott edited a comment on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-568617329

I think I came up with a much better approach [here](https://github.com/apache/spark/compare/master...bmarcott:nmarcott-fulfill-slots-2?expand=1). It avoids trying to simulate scheduling logic like the previous approach, which had a lot of discrepancies as well as high time complexity.

This change makes `TaskSetManager.resourceOffer` return an explicit boolean indicating whether it rejected the resource due to delay scheduling. An `isAllFreeResources` boolean parameter was also added to `TaskSchedulerImpl.resourceOffers`, which tells the scheduler that the offers represent all free resources as opposed to a single resource. Timers are then reset only if no resources have been rejected due to delay scheduling since the last offer that included all free resources (see the sketch after this comment).

Example event sequence:
- offer 1 resource that was rejected - no timer reset
- offer all resources with no rejects - timer is reset
- offer 1 resource, no reject - timer is reset
- offer 1 resource that was rejected - no timer reset
- offer 1 resource, no reject - no timer reset because the previous offer was rejected

Here is a breakdown of when resources are offered (not changed by this PR):

Single executors are offered when:
- a task finishes
- a new executor is launched

All free resources are offered when:
- continually, every spark.scheduler.revive.interval seconds (default 1 second)
- on taskset submit
- when a task fails
- the speculation scheduler revives on a fixed delay if there are any speculative tasks
- an executor is lost

One remaining case that isn't handled: before any "all free resources" offer, all free resources are offered one by one and none are rejected. This case should reset the timer, but won't with the current implementation.

Thoughts, or do you know of any other issues with this approach?
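To make the reset rule above concrete, here is a minimal Scala sketch of the described behavior. It is not the actual PR diff; names such as `noRejectsSinceLastFullOffer`, `tsmRejectsForDelayScheduling`, and `resetDelaySchedulingTimer` are hypothetical stand-ins.

```scala
// Minimal sketch of the reset rule described above; not the actual PR diff.
// Names such as noRejectsSinceLastFullOffer are hypothetical.
class DelaySchedulingResetSketch {
  // True while nothing has been rejected because of delay scheduling since the
  // last offer round that contained all free resources.
  private var noRejectsSinceLastFullOffer = true

  /** offers: free resources being offered; isAllFreeResources: whether this
    * round contains every free resource rather than a single freed slot. */
  def resourceOffers(offers: Seq[String], isAllFreeResources: Boolean): Unit = {
    if (isAllFreeResources) {
      // Every "all free resources" round starts a fresh observation window.
      noRejectsSinceLastFullOffer = true
    }
    // In the PR, TaskSetManager.resourceOffer reports whether it rejected the
    // resource purely because of delay scheduling; this stands in for that.
    val rejections = offers.map(tsmRejectsForDelayScheduling)
    if (rejections.contains(true)) {
      noRejectsSinceLastFullOffer = false
    } else if (noRejectsSinceLastFullOffer) {
      // No rejects since the last full round: the task set is keeping up with
      // the slots it is given, so its locality wait timer starts over.
      resetDelaySchedulingTimer()
    }
  }

  // Placeholders so the sketch is self-contained.
  private def tsmRejectsForDelayScheduling(offer: String): Boolean = false
  private def resetDelaySchedulingTimer(): Unit = ()
}
```

Walking the example event sequence above through this sketch reproduces the stated resets: a rejected single offer flips the flag off, a clean full-resources round turns it back on and resets, and subsequent clean single offers keep resetting until another rejection occurs.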
[GitHub] [spark] bmarcott edited a comment on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
bmarcott edited a comment on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-568617329

[Here is an approach](https://github.com/apache/spark/compare/master...bmarcott:nmarcott-fulfill-slots-2?expand=1) which avoids trying to simulate any type of scheduling logic. This change introduces an `isAllFreeResources` boolean parameter for `TaskSchedulerImpl.resourceOffers`, which tells the scheduler that the offers represent all free resources as opposed to a single resource. Timers will be reset only if there were no resources rejected due to delay scheduling since the last offer which included all free resources.

Example case:
- offer 1 resource that was rejected - no timer reset
- offer all resources with no rejects - timer is reset
- offer 1 resource, no reject - timer is reset
- offer 1 resource that was rejected - no timer reset
- offer 1 resource, no reject - no timer reset because the previous offer was rejected

Here is a breakdown of when resources are offered (not changed by this PR):

Single executors are offered when:
- a task finishes
- a new executor is launched

All free resources are offered when:
- continually, every spark.scheduler.revive.interval seconds (default 1 second)
- on taskset submit
- when a task fails
- the speculation scheduler revives on a fixed delay if there are any speculative tasks
- an executor is lost

One remaining case that isn't handled: before any "all free resources" offer, all free resources are offered one by one and none are rejected. This case should reset the timer, but won't with the current implementation.

Thoughts, or do you know of any other issues with this approach?
[GitHub] [spark] bmarcott edited a comment on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
bmarcott edited a comment on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-568617329

[Here is an approach](https://github.com/apache/spark/compare/master...bmarcott:nmarcott-fulfill-slots-2?expand=1) which avoids trying to simulate any type of scheduling logic. This change resets locality timers only when no resources have been rejected due to delay scheduling.

The complication in this change is that resetting the locality timer depends on when `TaskSchedulerImpl.resourceOffers` is called with all resources, rather than a single resource. That is guaranteed to happen only as frequently as `spark.scheduler.revive.interval`, which defaults to 1s. This means the `locality.wait.time` wouldn't be completely respected. One way around this would be to always offer all free resources on calls to `TaskSchedulerImpl.resourceOffers`.

Thoughts, or do you know of any other issues with this approach?
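For reference, the revive interval mentioned above is an ordinary Spark configuration. A minimal sketch of setting it follows; the 1s value is just the documented default, not a tuning recommendation.

```scala
import org.apache.spark.SparkConf

// spark.scheduler.revive.interval controls how often the driver performs the
// periodic "offer all free resources" round. Because timer resets in this
// approach piggyback on that round, a reset can lag by up to this interval.
val conf = new SparkConf()
  .setAppName("delay-scheduling-revive-interval")
  .set("spark.scheduler.revive.interval", "1s")
```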
[GitHub] [spark] bmarcott edited a comment on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
bmarcott edited a comment on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-562798874

Inline responses for @tgravescs:

> arguably you should never reset back to a lower level.

Yeah, I was thinking the same thing, but I decided not to touch that part yet, since I have mostly thought about **when** to reset.

> Note on the fairscheduler side - I think you do have the issue Kay brought up in v2 of comment: https://issues.apache.org/jira/browse/SPARK-18886?focusedCommentId=15931009&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15931009

I do not believe this PR has that issue: once the non-greedy taskset has waited its locality wait time, it will increase its locality level, since it wasn't able to take advantage of its slots. Her proposed v1 solution had that problem because it only increased the locality level if there were slots not being used by **any taskset**.

> I assume it fixes the performance on #26633? thanks for adding more unit tests, did run any manual tests as well?

I haven't looked in depth at their problem, but it does seem it would solve most of it. I have not yet had the chance to run manual tests.

> so looking at the fairscheduler side of this it brings me back to the real question of what is the definition of a task wait. If you look at Kay's example:

I did not catch your point from the comment that started with the above quote. This PR handles the case she mentioned, since the locality level would not increase while the taskset is utilizing all its available slots (timers will keep getting reset). That means that when the taskset is offered the new non-local resource, it wouldn't immediately take advantage of it.
[GitHub] [spark] bmarcott edited a comment on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
bmarcott edited a comment on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-562798874

Inline responses for @tgravescs:

> arguably you should never reset back to a lower level.

Yeah, I was thinking the same thing, but I decided not to touch that part yet, since I have mostly thought about **when** to reset.

> Note on the fairscheduler side - I think you do have the issue Kay brought up in v2 of comment: https://issues.apache.org/jira/browse/SPARK-18886?focusedCommentId=15931009&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15931009

I do not believe this PR has that issue: once the non-greedy taskset has waited its locality wait time, it will increase its locality level, since it wasn't able to take advantage of its slots. Her proposed v1 solution had that problem because it only increased the locality level if there were slots not being used by **any job**.

> I assume it fixes the performance on #26633? thanks for adding more unit tests, did run any manual tests as well?

I haven't looked in depth at their problem, but it does seem it would solve most of it. I have not yet had the chance to run manual tests.

> so looking at the fairscheduler side of this it brings me back to the real question of what is the definition of a task wait. If you look at Kay's example:

I did not catch your point from the comment that started with the above quote. This PR handles the case she mentioned, since the locality level would not increase while the taskset is utilizing all its available slots (timers will keep getting reset). That means that when the taskset is offered the new non-local resource, it wouldn't immediately take advantage of it.
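For readers less familiar with the locality level escalation being discussed, here is a heavily simplified Scala sketch of how a delay-scheduling timer gates the move from more-local to less-local levels. It mirrors the general idea behind `TaskSetManager`'s locality waits but is not Spark's actual code; the names, the single shared wait, and the `resetTimer` hook are illustrative assumptions.

```scala
// Heavily simplified illustration of locality level escalation; not Spark's
// actual TaskSetManager, and the names are illustrative only.
object DelaySchedulingLevelSketch {
  // Locality levels from most to least local.
  private val levels = Seq("PROCESS_LOCAL", "NODE_LOCAL", "RACK_LOCAL", "ANY")

  private var currentLevelIndex = 0
  // The timer this PR is about: the proposal resets it whenever the task set
  // has been able to use all of the slots it was offered.
  private var lastTimerReset = System.currentTimeMillis()
  private val localityWaitMs = 3000L // e.g. spark.locality.wait = 3s

  /** Returns the least-local level the task set is currently allowed to use. */
  def allowedLevel(now: Long): String = {
    // Escalate to a less local level only after the locality wait has elapsed
    // since the last timer reset; frequent resets keep tasks local.
    if (currentLevelIndex < levels.length - 1 &&
        now - lastTimerReset >= localityWaitMs) {
      currentLevelIndex += 1
      lastTimerReset = now
    }
    levels(currentLevelIndex)
  }

  /** Called when the task set is fully utilizing its slots (per this PR's rule). */
  def resetTimer(now: Long): Unit = {
    lastTimerReset = now
  }
}
```

Under this model, a taskset that keeps its timer reset (because it is using every slot it is offered) stays at the more local level, which is the behavior the comment above relies on when arguing the fair scheduler case is handled.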