[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-76289582 @li-zhihui I didn't realize the `spark.scheduler.minRegisteredResourcesRatio` defaulted to 0 in standalone mode. Given that I think it's safe to remove support for it, since it will only affect people who explicitly set this. For 1.4 we can deprecate it in standalone mode and for 1.5+ we can remove it. Given that would you mind closing this? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user li-zhihui closed the pull request at: https://github.com/apache/spark/pull/1462
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user li-zhihui commented on a diff in the pull request: https://github.com/apache/spark/pull/1462#discussion_r25405036
--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala ---
@@ -62,6 +62,11 @@ private[spark] class MesosSchedulerBackend(
   var classLoader: ClassLoader = null
+  if (!sc.getConf.getOption("spark.scheduler.minRegisteredResourcesRatio").isEmpty) {
--- End diff --
Done, thanks.
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user li-zhihui commented on a diff in the pull request: https://github.com/apache/spark/pull/1462#discussion_r25405040
--- Diff: docs/configuration.md ---
@@ -831,7 +831,7 @@ Apart from these, the following properties are also available, and may be useful
   <td>0</td>
   <td>
     The minimum ratio of registered resources (registered resources / total expected resources)
-    (resources are executors in yarn mode, CPU cores in standalone mode)
+    (resources are executors in yarn mode, CPU cores in standalone mode and coarse mesos mode)
--- End diff --
Done, thanks.
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user li-zhihui commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-76123507 Added some new commits to fix code conflicts and a few other issues.
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1462#discussion_r25022847
--- Diff: docs/configuration.md ---
@@ -831,7 +831,7 @@ Apart from these, the following properties are also available, and may be useful
   <td>0</td>
   <td>
     The minimum ratio of registered resources (registered resources / total expected resources)
-    (resources are executors in yarn mode, CPU cores in standalone mode)
+    (resources are executors in yarn mode, CPU cores in standalone mode and coarse mesos mode)
--- End diff --
coarse-grained mesos mode
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-75130879 @pwendell @kayousterhout what is the verdict on this? Should we just remove the ratio altogether? What about backward compatibility?
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1462#discussion_r25022830
--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala ---
@@ -62,6 +62,11 @@ private[spark] class MesosSchedulerBackend(
   var classLoader: ClassLoader = null
+  if (!sc.getConf.getOption("spark.scheduler.minRegisteredResourcesRatio").isEmpty) {
--- End diff --
`sc.conf.contains(...)`
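The review suggestion above replaces a negated `getOption(...).isEmpty` with a direct `contains(...)` check. The sketch below illustrates why the two forms are interchangeable. It is only an illustration: a plain Scala `Map` stands in for the Spark configuration object, and the object and method names (`ConfContainsSketch`, `viaGetOption`, `viaContains`) are hypothetical, not part of the PR.

```scala
// Minimal sketch: a plain Map stands in for the Spark configuration (an assumption
// made so the example runs without Spark on the classpath).
object ConfContainsSketch {
  val key = "spark.scheduler.minRegisteredResourcesRatio"

  // Shape of the original check: look the key up as an Option and negate isEmpty.
  def viaGetOption(settings: Map[String, String]): Boolean =
    !settings.get(key).isEmpty

  // Shape of the suggested check: ask directly whether the key is present.
  def viaContains(settings: Map[String, String]): Boolean =
    settings.contains(key)

  def main(args: Array[String]): Unit = {
    val unset = Map.empty[String, String]
    val explicitlySet = Map(key -> "0.8")
    // Both forms agree whether or not the key has been set.
    println(viaGetOption(unset) == viaContains(unset))
    println(viaGetOption(explicitlySet) == viaContains(explicitlySet))
  }
}
```

The `contains` form states the intent (was the key explicitly set?) without the double negative.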
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-74011208 This PR has gone stale. Do we want to update it?
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user li-zhihui commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-74014949 @pwendell Do we need this feature in Mesos mode? I'd be happy to update it.
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-54694587 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user li-zhihui commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-53229624 Rolled back the old commits and added a new commit based on the latest code. @pwendell @tgravescs @kayousterhout @tnachen
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-51024346 @pwendell I created https://github.com/apache/spark/pull/1762 for your judgment of what the right thing to do here is!
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-51026034 Okay let me run it by some more people tomorrow and figure it out.
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user li-zhihui commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-51019363 @pwendell Removing support for this in standalone mode just means keeping totalExpectedExecutors at zero: https://github.com/li-zhihui/spark/commit/fa5af15d982e86c880302e8b9ef38645944be13f I think the feature simply makes Spark easier to use. (And sometimes users aren't aware of the problem unless we point it out in the docs or the configuration.) Anyway, I think you are the authority on how to make the tradeoff.
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user tnachen commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-50404462 Are we trying to figure out the top level issue of the race before we get this in?
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user li-zhihui commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-50423094 @tnachen I opened a new PR to try to fix the issue: https://github.com/apache/spark/pull/1525
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-49616178 cc @kayousterhout as I think she is more familiar with standalone mode and scheduler details.
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-49643818 If we change the name of the config you'll need to upmerge as https://github.com/apache/spark/pull/634 set some defaults on the yarn side.
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-49661602 @tgravescs I actually mentioned this race condition in the previous PR: https://github.com/apache/spark/pull/900#diff-for-comment-14205738 . In the future we should try to be more careful about merging things that have un-replied to comments (I'm about to send an email to the dev list about this). @li-zhihui if someone points out a problem in a pull request you submit, the expectation is that it will be fixed when you reply to the comment. Can you please submit a new pull request that fixes the race condition with standalone mode, before we proceed with adding this functionality to Mesos mode?
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-49669188 Sorry @kayousterhout I totally missed it, I didn't read close enough and thought it had been addressed.
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-49669448 No worries I should have just made a top-level comment -- the code-level comments are easy to miss once they get compressed because the code is out of date.
On Mon, Jul 21, 2014 at 2:35 PM, Tom Graves notificati...@github.com wrote:
> Sorry @kayousterhout I totally missed it, I didn't read close enough and thought it had been addressed.
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user li-zhihui commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-49690527 Sorry @tgravescs @kayousterhout, I wasn't aware of how serious the issue was at the time. Thanks @kayousterhout for the guidance.
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user li-zhihui commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-49572335 @tgravescs I tested it on a cluster with mesos-0.18.1 (fine-grained and coarse-grained), and it works well. I think you are right. In fact, users have no notion of expected executors in Mesos mode (or standalone mode); they only expect CPU cores (`spark.cores.max`). So we need to compare the total cores of the registered executors against `spark.cores.max` to judge whether the SchedulerBackend is ready, and rename `spark.scheduler.minRegisteredExecutorsRatio` to `spark.scheduler.minRegisteredResourcesRatio`. What do you think?
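The proposal above, judging readiness by registered CPU cores against `spark.cores.max` rather than by an executor count, might look roughly like the following. This is a sketch under stated assumptions, not the code from the PR: the object `CoresRatioSketch`, the function `sufficientCoresRegistered`, and its parameter names are hypothetical, and the numbers are made up for illustration.

```scala
// Minimal sketch of a cores-based readiness check. All names here are hypothetical.
object CoresRatioSketch {
  // Ready once the cores of registered executors reach the configured fraction
  // of the maximum cores the application asked for (the spark.cores.max value).
  def sufficientCoresRegistered(
      registeredCores: Int,
      maxCores: Int,
      minRegisteredResourcesRatio: Double): Boolean =
    registeredCores >= maxCores * minRegisteredResourcesRatio

  def main(args: Array[String]): Unit = {
    val maxCores = 40   // assumed value of spark.cores.max
    val ratio = 0.5     // assumed value of spark.scheduler.minRegisteredResourcesRatio
    println(sufficientCoresRegistered(8, maxCores, ratio))   // below the threshold of 20 cores
    println(sufficientCoresRegistered(24, maxCores, ratio))  // above the threshold of 20 cores
  }
}
```

Because the target (`maxCores`) is fixed by configuration before any executor registers, this form does not depend on how many offers have arrived so far.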
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-49434048 Did you test it on a cluster? Unfortunately I don't have access to one and am not an expert on Mesos. Is there a race condition between when the scheduler backend increments totalExpectedExecutors and when we actually check whether we have enough? In this case we increment it as executors come in as resource offers: we start the executor, it registers, and then we do the check. So once we get the first one, it's possible that totalExpectedExecutors (1) * minRegisteredRatio (say 100%) == executorActor.size() (1), even though we really expect, say, 10 to come in. I think the same thing actually applies in standalone mode too, but I missed it in the previous PR.
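The race described in the comment above can be illustrated with a small model. This is a simplified sketch, not Spark's scheduler code: `ReadinessRaceSketch` and `isReady` are hypothetical names, and the figure of 10 eventually expected executors is the example value from the comment.

```scala
// Minimal model of the readiness check; only the arithmetic from the comment is reproduced.
object ReadinessRaceSketch {
  // Ready when the number of registered executors reaches expected * ratio.
  def isReady(registered: Int, totalExpected: Int, minRegisteredRatio: Double): Boolean =
    registered >= totalExpected * minRegisteredRatio

  def main(args: Array[String]): Unit = {
    val ratio = 1.0          // 100%
    val reallyExpected = 10  // executors the application eventually expects

    // Racy variant: the expected count is incremented as offers arrive, so after
    // the first executor registers it is still 1 and the check already passes.
    val racy = isReady(registered = 1, totalExpected = 1, minRegisteredRatio = ratio)

    // Fixed-target variant: the expected count is known up front, so one
    // registered executor out of ten does not pass the check.
    val fixedTarget = isReady(registered = 1, totalExpected = reallyExpected, minRegisteredRatio = ratio)

    println(s"racy=$racy fixedTarget=$fixedTarget")
  }
}
```

The check itself is the same in both cases; the difference is whether the expected total is final at the time the check runs.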
[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...
Github user li-zhihui commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-49284642 @tgravescs