Github user tgravescs commented on a diff in the pull request:
https://github.com/apache/spark/pull/19881#discussion_r174126562
--- Diff: docs/configuration.md ---
@@ -1795,6 +1796,19 @@ Apart from these, the following properties are also available, and may be useful
Lower bound for the number of executors if dynamic allocation is enabled.
</td>
</tr>
+<tr>
+ <td><code>spark.dynamicAllocation.fullParallelismDivisor</code></td>
+ <td>1</td>
+ <td>
+ By default, the dynamic allocation will request enough executors to maximize the
+ parallelism according to the number of tasks to process. While this minimizes the
+ latency of the job, with small tasks this setting wastes a lot of resources due to
+ executor allocation overhead, as some executor might not even do any work.
+ This setting allows to set a divisor that will be used to reduce the number of
+ executors w.r.t. full parallelism
+ Defaults to 1.0
--- End diff --
I think we should state explicitly that maxExecutors trumps this setting.
If I have 10000 tasks and a divisor of 2, I would expect 5000 executors, but if
max executors is 1000, then 1000 is all I get.
We should add a test for this interaction as well.
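To make the expected interaction concrete, here is a rough sketch of the arithmetic in Python. It is not Spark's actual implementation: the function name and all parameter names are hypothetical, and it assumes one task per executor unless `tasks_per_executor` says otherwise.

```python
import math

def target_executors(pending_tasks, divisor=1.0, tasks_per_executor=1,
                     min_executors=0, max_executors=None):
    """Illustrative only: executors to request under the proposed divisor.

    All names here are hypothetical. max_executors=None means "no cap".
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    # Executors needed to run every pending task at the same time.
    full_parallelism = math.ceil(pending_tasks / tasks_per_executor)
    # Scale the request down by the divisor.
    reduced = math.ceil(full_parallelism / divisor)
    # The upper bound wins over the divisor; the lower bound still applies.
    if max_executors is not None:
        reduced = min(reduced, max_executors)
    return max(min_executors, reduced)

print(target_executors(10000, divisor=2))                      # no cap
print(target_executors(10000, divisor=2, max_executors=1000))  # capped
```

A test for the interaction would check both cases: the divisor alone halves the request, and the cap overrides the divisor whenever the divided value is still above it.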