[GitHub] [spark] bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling.
bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling. URL: https://github.com/apache/spark/pull/27207#issuecomment-591813279 No problem. I believe what's remaining is just some tests on TaskSetManager. Will remove WIP once I add. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling.
bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling. URL: https://github.com/apache/spark/pull/27207#issuecomment-590155884 checking back in, who is up for another round of review :)? @tgravescs I added the check for pending speculative tasks, but haven't added a test for it yet. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling.
bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling. URL: https://github.com/apache/spark/pull/27207#issuecomment-583947229 > Is it possible to centralize the delay scheduling code? Now it's in both `TaskSchedulerImpl` and `TaskSetManager`, which makes it a bit hard to understand as you need to think about the interactions between them. I am not sure a good way to centralize because 1. The TSM is called multiple times with various offers and we need to keep track of what happened across those calls 2. TSM today doesn't differentiate whether it didn't launch a task due to blacklisting or due to delay scheduling, hence the new boolean returned. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling.
bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling. URL: https://github.com/apache/spark/pull/27207#issuecomment-581758371 @Ngone51 thanks for taking a look. I wouldn't say it puts one thing before another. The idea is to move closer to what a reasonable definition of scheduling **delay** (locality.wait time) is: how long you want to sacrifice "fairness" (using your fair share of resources) in favor of data locality. I'll do the touch ups you mentioned after there is enough support on the design. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling.
bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling. URL: https://github.com/apache/spark/pull/27207#issuecomment-581102206 @tgravescs I updated with the new config to switch to legacy behavior as well as some doc/variable renaming. After reading more through `SchedulerBackend` impls: 1. decided to assume resource offers are full resource by default, to match previous behavior for any `SchedulerBackend`s not touched in this PR. 2. I only found `MesosFineGrainedSchedulerBackend` which doesn't use `CoarseGrainedSchedulerBackend` and decided not to touch it since it is deprecated and from the design looks like "all free resources" can't be tracked. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling.
bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling. URL: https://github.com/apache/spark/pull/27207#issuecomment-580842627 yea, I'll try to take a look this weekend. Thanks for checking in. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling.
bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling. URL: https://github.com/apache/spark/pull/27207#issuecomment-578006160 @cloud-fan thanks for the input @tgravescs yep, that sequence and explanation matches with my understanding > I was also wondering about leaving the old logic in there but configured off by default. While I don't like it, it would be safer if we haven't thought of some corner case and user could turn it back on if necessary, thoughts? I'm not opposed to adding a new config that disables new way and enables old way, but is this standard practice in spark for the more risky changes? I would feel more comfortable knowing there was a fallback too ;) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling.
bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling. URL: https://github.com/apache/spark/pull/27207#issuecomment-576995950 That may be reasonable to do, but I'd like to avoid adding more tracking/accounting if possible. I already don't like the boolean map I added. What do you think about adding back the old condition of "must have launched a task". so the new condition for reset would be must have no rejects **and launch a task** on an all resource offer This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling.
bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling. URL: https://github.com/apache/spark/pull/27207#issuecomment-575011886 @tgravescs Thanks for the comments. > so please update the description with information from the other PR Which one of my snippets from the previous PR was most clear to you? I can put that one in the description. > One thing I don't think I like Really good point on this scenario. It's bad even for the same scenario that you described but where the all resource offer has only 1 executor, and the first taskset accepts it. Let me know if you have any good ideas here This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org