GitHub user tgravescs commented on the issue:
https://github.com/apache/spark/pull/17854
To slow down launching, you could just set
spark.yarn.containerLauncherMaxThreads to a smaller value. That isn't
guaranteed to help, but neither is this change, really. It's just an
alternative, or something you can do immediately.
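As a rough sketch of that workaround, the setting could go in
spark-defaults.conf (or be passed with --conf on spark-submit). The value 10
is only an illustrative assumption, not a recommendation; the documented
default is 25, so anything lower reduces how many containers are launched
concurrently.

```
# spark-defaults.conf (fragment)
# Lower the number of threads the YARN ApplicationMaster uses to launch
# executor containers. Default is 25; 10 is an arbitrary example value.
spark.yarn.containerLauncherMaxThreads   10
```

The right value depends on cluster size and on how quickly executors
register, so treat it as something to tune rather than a fixed answer.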
I don't see any reason to drop the containers YARN gives you unless you are
really slowing things down so much that it wastes a lot of resources; dropping
them will just cause more overhead. Asking for fewer containers to start with
could be OK, although again it's just going to slow down the entire thing. How
long is it taking you to launch these?
Also, can you add some more details about exactly what you are seeing? I
assume it's getting timeout exceptions? Exactly where is it timing out, and
why? It would be nice to really fix or improve that as well, longer term. What
is your timeout set to? I want to see details so we can determine whether other
things should be done, like making the registration retry more, checking
whether the current timeouts are sufficient, etc.