Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/2746#issuecomment-59956545
> Even for longer-running tasks, we still have a problem. If the add timer
> expires multiple times while these tasks are being run, we may end up double
> counting the number of executors needed.
My assumption was that we would keep the "target" number of executors equal
to or less than the number of tasks within the currently running stages, which
would solve this problem, IIUC.
That said, I'll stop arguing for Kay's approach. The exponential approach
sounds very reasonable to me as well.
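For concreteness, here is a minimal sketch of the kind of exponential ramp-up being discussed (all names here are hypothetical, not the actual patch): each time the add timer fires with tasks still pending, the request size doubles, and it resets once demand is met.

```scala
// Hypothetical sketch of exponential ramp-up, not the actual patch.
// Each time the add timer fires with work still pending, double the
// number of executors requested; reset once demand is satisfied.
var numExecutorsToAdd = 1

def onAddTimer(tasksPending: Boolean): Int =
  if (tasksPending) {
    val toRequest = numExecutorsToAdd
    numExecutorsToAdd *= 2 // ramp up exponentially
    toRequest
  } else {
    numExecutorsToAdd = 1 // demand met; reset the ramp
    0
  }
```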
What I do feel really strongly about is that we shouldn't add new
configuration knobs that are required to get decent utilization for the average
app. Most of the users I've spoken to find a config as simple as
`--executor-cores` difficult to reason about. And even if they completely
grasp how to set it, it's just one other thing they have to think about that
distracts them from the job they're actually trying to complete. This is one
of the core complaints from Alex Rubinsteyn's infamous [blog
post](http://blog.explainmydata.com/2014/05/spark-should-be-better-than-mapreduce.html).
I think some simple rules could remove this burden from the user. For
example, we could cap the number of outstanding executor requests at the number
that would be able to handle all the pending tasks - there's no reason it
should ever need to exceed this. A simple way to explain this to a user would
be that Spark will never try to acquire more resources than it would need to
run all the work it's ready to run at this moment.
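As a rough sketch of that cap (hypothetical names, not Spark's actual internals), assuming we can observe the number of pending tasks and how many tasks an executor can run concurrently:

```scala
// Hypothetical sketch of the proposed cap, not Spark's actual code.
// Enough executors to run every pending task at once; requesting more
// than this can never improve utilization.
def maxExecutorsNeeded(pendingTasks: Int, tasksPerExecutor: Int): Int =
  (pendingTasks + tasksPerExecutor - 1) / tasksPerExecutor // ceiling division

// Bound a new request so (outstanding + granted) never exceeds the cap.
def boundedRequest(desired: Int, outstanding: Int,
                   pendingTasks: Int, tasksPerExecutor: Int): Int = {
  val cap = maxExecutorsNeeded(pendingTasks, tasksPerExecutor)
  math.max(0, math.min(desired, cap - outstanding))
}
```

Under a rule like this, the exponential ramp-up above would be truncated automatically once it reached the cap, with no new knob for the user to tune.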