Github user tgravescs commented on a diff in the pull request:
https://github.com/apache/spark/pull/900#discussion_r14823260
--- Diff: docs/configuration.md ---
@@ -699,6 +699,22 @@ Apart from these, the following properties are also available, and may be useful
     (in milliseconds)
   </td>
 </tr>
+<tr>
+  <td><code>spark.scheduler.minRegisteredExecutorsRatio</code></td>
+  <td>0</td>
+  <td>
+    Submit tasks only after (registered executors / total expected executors)
+    is equal to at least this value, which is double between 0 and 1.
+  </td>
+</tr>
+<tr>
+  <td><code>spark.scheduler.maxRegisteredExecutorsWaitingTime</code></td>
+  <td>30000</td>
+  <td>
+    Whatever (registered executors / total expected executors) is reached
--- End diff --
I think we should clarify both of these a bit, because scheduling really starts
when either condition is hit, so adding a reference to
maxRegisteredExecutorsWaitingTime in the description of
minRegisteredExecutorsRatio would be good.
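To make the either-or behavior concrete, here is a rough sketch of the kind of
check I mean. This is only an illustration, not the actual scheduler backend
code; the class and field names below are placeholders.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical illustration of the readiness check; names are placeholders,
// not the real Spark internals.
class ReadinessCheck(
    totalExpectedExecutors: Int,
    minRegisteredRatio: Double,       // spark.scheduler.minRegisteredExecutorsRatio
    maxRegisteredWaitingTime: Long) { // spark.scheduler.maxRegisteredExecutorsWaitingTime (ms)

  private val createTime = System.currentTimeMillis()
  val registeredExecutors = new AtomicInteger(0)

  // Scheduling begins as soon as EITHER the minimum ratio of registered
  // executors is reached OR the maximum waiting time has elapsed.
  def isReady(): Boolean = {
    val ratioReached =
      registeredExecutors.get() >= totalExpectedExecutors * minRegisteredRatio
    val waitedLongEnough =
      System.currentTimeMillis() - createTime >= maxRegisteredWaitingTime
    ratioReached || waitedLongEnough
  }
}
```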
How about something like the below? Note I'm not a doc writer, so I'm fine with
changing it.
for spark.scheduler.minRegisteredExecutorsRatio:
The minimum ratio of registered executors (registered executors / total
expected executors) to wait for before scheduling begins. Specified as a double
between 0 and 1. Regardless of whether the minimum ratio of executors has been
reached, the maximum amount of time it will wait before scheduling begins is
controlled by config
<code>spark.scheduler.maxRegisteredExecutorsWaitingTime</code>.
Then for spark.scheduler.maxRegisteredExecutorsWaitingTime:
Maximum amount of time to wait for executors to register before scheduling
begins (in milliseconds).
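For reference, here is a sketch of how a user would set the two properties
together from an application, assuming the property names stay as they are in
this PR (the values are just examples, not recommendations):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Example only: wait for 80% of the expected executors to register,
// but never wait longer than 30 seconds before scheduling begins.
val conf = new SparkConf()
  .setAppName("RegistrationWaitExample")
  .set("spark.scheduler.minRegisteredExecutorsRatio", "0.8")
  .set("spark.scheduler.maxRegisteredExecutorsWaitingTime", "30000")

val sc = new SparkContext(conf)
```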