Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/2746#issuecomment-59869868
I think the main objection @pwendell and I have to this policy is not
that it is complicated in terms of implementation, but that it is complicated
in terms of its resulting behavior. The main source of this complexity is that
the policy does not take into account the running time of each task.
In Spark, the most common execution model is a stage with many
short-running tasks. Under this policy, we will try to add as many executors as
possible to run all remaining pending tasks in parallel. However, once we do
this, each task likely finishes promptly and we end up with many executors
sitting idle. In other words, for this kind of workload the policy is
basically equivalent to the "add-to-max" policy.
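To make that concrete, here is a rough sketch (not the code in this PR; `tasksPerExecutor`, `maxExecutors`, and the numbers are just illustrative assumptions) of how a policy that sizes its request from the pending-task count alone jumps straight to the cap for a stage of many short tasks:
```scala
object AddPolicySketch {
  // Illustrative numbers; not taken from the actual PR.
  val tasksPerExecutor = 4   // e.g. executor cores / cores per task
  val maxExecutors = 50

  // Request enough executors to run every remaining pending task in parallel.
  def executorsToAdd(pendingTasks: Int, currentExecutors: Int): Int = {
    val desired = math.ceil(pendingTasks.toDouble / tasksPerExecutor).toInt
    math.max(0, math.min(maxExecutors, desired) - currentExecutors)
  }

  def main(args: Array[String]): Unit = {
    // A typical stage: 1000 short tasks with 10 executors already running.
    // The request jumps straight to the cap, just like "add-to-max", and
    // once the short tasks finish most of those executors sit idle.
    println(executorsToAdd(pendingTasks = 1000, currentExecutors = 10)) // 40
  }
}
```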
Even for longer-running tasks, we still have a problem. If the add timer
expires multiple times while these tasks are running, we may end up
double-counting the number of executors needed, because each round of addition
assumes the remaining pending tasks have not already been accounted for. In
that case, the semantics of how many executors are added relative to how many
pending tasks are outstanding are less clear to me.
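A minimal sketch of that double-counting concern (again with hypothetical names, not the actual state the PR tracks): if each timer tick computes its request from the pending-task count only, executors requested on an earlier tick that have not yet registered get counted again:
```scala
object DoubleCountSketch {
  val tasksPerExecutor = 4

  // Executors requested so far that have not yet registered.
  var outstandingRequests = 0

  // Naive policy: size each request from the pending task count alone,
  // ignoring executors already requested on a previous tick.
  def onAddTimer(pendingTasks: Int): Int = {
    val request = math.ceil(pendingTasks.toDouble / tasksPerExecutor).toInt
    outstandingRequests += request
    request
  }

  def main(args: Array[String]): Unit = {
    // The stage still has 100 pending tasks when the timer fires twice,
    // because the running tasks are long and no requested executor has
    // registered yet.
    println(onAddTimer(100))     // 25
    println(onAddTimer(100))     // 25 again
    println(outstandingRequests) // 50 executors requested for work that needs 25
  }
}
```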