Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/1525#issuecomment-50950169
Hey all, @kayousterhout asked me to look at this. It doesn't make semantic
sense to expose `spark.scheduler.minRegisteredExecutorsRatio` in standalone
mode, because applications in standalone mode do not request a fixed number of
executors a priori. So my proposal is to remove this feature in standalone mode
altogether. The semantics of `spark.cores.max` are that it is just a maximum.
In some cases users run jobs with it set well above the number of available
cores (because they decided to run on a smaller cluster), and that is fully
supported. If we enforce a minimum, all jobs will hang for those users.
If users in standalone mode want to wait, for now they should add their own
code to sleep until the desired number of executors appears. They can do this
by calling `sc.getExecutorStorageStatus.size()`, or we could add an API
called `sc.getNumExecutors` that does this.
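To sketch what that user-side wait could look like, here is a minimal, hedged example of a generic polling helper in Scala. The supplier function is abstract so the snippet is self-contained; in a real Spark application you might pass something like `() => sc.getExecutorStorageStatus.size` (the accessor mentioned above). The helper name, parameters, and the timeout/poll-interval values are illustrative assumptions, not part of any Spark API:

```scala
// Hypothetical helper: poll a count supplier until it reaches a minimum
// or a timeout expires. Returns true if the minimum was reached in time.
object ExecutorWait {
  def waitForCount(getCount: () => Int,
                   minCount: Int,
                   timeoutMs: Long,
                   pollMs: Long = 100L): Boolean = {
    val deadline = System.currentTimeMillis() + timeoutMs
    while (System.currentTimeMillis() < deadline) {
      if (getCount() >= minCount) return true
      Thread.sleep(pollMs)
    }
    // One final check in case the count arrived right at the deadline.
    getCount() >= minCount
  }
}

// Illustrative use in a Spark driver (names assumed, not checked here):
//   ExecutorWait.waitForCount(() => sc.getExecutorStorageStatus.size,
//                             minCount = 8, timeoutMs = 60000L)
```

This keeps the wait policy entirely in user code, which is the point of the proposal: standalone mode itself never blocks on an executor count.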
Also curious what @mateiz and @aarondav think about this. I haven't been
following this patch previously.