Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/663#issuecomment-42353112
Ah, I misunderstood how ThreadPoolFactory worked.
To take a step back: we expect a burst of executors (at most a few
thousand) to start at the beginning of an application. After that, the thread
pool is only used when executors fail and we need to start replacements.
So the behavior we want is a pool that can scale up to a large number of
threads and then shrink once the initial burst is done. We'd probably want to
cap the thread count at some limit while continuing to queue up containers to
start.
I think the right solution is probably to create a pool with a large core
size (500?) and then, after the initial containers are launched, bring it
down to something more manageable, like 10.
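A minimal sketch of that idea, using the standard `java.util.concurrent.ThreadPoolExecutor` API (the class name, task counts, and sizes here are illustrative, not taken from the actual PR): start the pool with a large core size so the initial burst runs with high parallelism, then call `setCorePoolSize` to shrink it afterward. Excess idle threads are reclaimed once they finish their work, while queued tasks still drain through the smaller pool.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ExecutorLaunchPool {
    public static void main(String[] args) throws InterruptedException {
        // Large core size so the initial burst of container launches can run
        // in parallel; the unbounded queue holds any overflow beyond that.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            500, 500, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());

        // Simulate the initial burst of container-launch tasks.
        for (int i = 0; i < 50; i++) {
            pool.submit(() -> { /* launch a container */ });
        }

        // After the burst, shrink the core size; threads above the new core
        // size are terminated as they become idle, but queued tasks still run.
        pool.setCorePoolSize(10);

        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(pool.getCorePoolSize());
    }
}
```

Note that `setCorePoolSize` only affects how many threads are kept alive; it does not cancel queued work, which is exactly the "keep queueing up containers" behavior described above.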