GitHub user squito commented on the issue:
https://github.com/apache/spark/pull/17854
It looks to me like this is actually making 2 behavior changes:
1) throttle the requests for new containers, as you describe in your description
2) drop newly received containers if they are over the limit (not in the description).

Is that correct? Did you find that (2) was necessary as well?
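To make sure I'm reading it right, here's a minimal sketch of the two changes as I understand them -- all names and numbers (`maxPendingRequests`, `handleAllocated`, the cap of 500, etc.) are illustrative assumptions, not the actual `YarnAllocator` API:

```scala
// Illustrative sketch only -- names, fields, and limits are assumptions,
// not the actual YarnAllocator code.
object AllocatorSketch {
  // (1) hypothetical cap on outstanding container requests per cycle
  val maxPendingRequests = 500

  var targetNumExecutors = 2500
  var numExecutorsRunning = 0
  var numPendingRequests = 0

  // (1) Throttle: ask YARN for at most enough containers to stay under
  // the cap, rather than the full remaining target in one shot.
  def containersToRequest(): Int = {
    val missing = targetNumExecutors - numExecutorsRunning - numPendingRequests
    math.max(0, math.min(missing, maxPendingRequests - numPendingRequests))
  }

  // (2) Drop: if YARN hands back more containers than we still need,
  // release the excess instead of launching executors on them.
  def handleAllocated(containers: Seq[String]): Unit = {
    numPendingRequests = math.max(0, numPendingRequests - containers.size)
    val needed = math.max(0, targetNumExecutors - numExecutorsRunning)
    val (toUse, excess) = containers.splitAt(needed)
    toUse.foreach { c => numExecutorsRunning += 1; launchExecutor(c) }
    excess.foreach(releaseContainer)
  }

  private def launchExecutor(containerId: String): Unit =
    println(s"launching executor on $containerId")

  private def releaseContainer(containerId: String): Unit =
    println(s"releasing excess container $containerId")
}
```

It's that `releaseContainer` path in (2) that I want to be sure about, since it isn't mentioned in the description.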
I understand the problem you are describing, but I'm surprised this really
helps the driver scale up to more executors. Maybe this will let the executors
start, but won't it just lead to the driver getting swamped when you've got
2500 executors sending heartbeats and task updates? I'm not saying it's bad to
make this improvement, just trying to understand. I'd feel better about just
doing (1) -- if you found (2) is necessary, I would want to think through the
implications a bit more.