GitHub user tgravescs commented on the issue:
https://github.com/apache/spark/pull/17238
> Actually, to play devil's advocate, the problem @morenn520 is describing
> is a little more involved. You have a driver running, which has its own view of
> what the cluster topology is, and then the cluster topology changes underneath
> it. Deploying a new configuration on the new nodes being added does not fix the
> driver, unless your "topology discovery script" is fully dynamic and always
> goes to a central location to figure out what's the current topology.
Maybe I'm misunderstanding what you are saying, but the only way the AM
gets a bad topology is if it's wrong in the first place. Or are you just
saying the app starts with a host in one rack, then the host gets moved to
another rack and brought back up? I guess that is possible, but I'm not really
sure it applies to this case here anyway with default_rack. Any existing
executors on it would have gone away when the host was moved, so YARN should
re-resolve when it gets a new container anyway. If your script isn't dynamic
enough to handle that, then it's also a configuration issue to update all the
other hosts, and you should do that before bringing the host back up. Again,
unless you aren't using HDFS, rack resolution affects more than just Spark on
YARN here. It's going to affect HDFS block placement and other types of apps.
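
For context, here is a minimal sketch of the kind of "fully dynamic" topology
script being discussed, one that goes to a central location on every lookup.
Hadoop runs whatever executable is configured via `net.topology.script.file.name`
with one or more hostnames/IPs as arguments and expects one rack path per line
on stdout. The central inventory endpoint below is a hypothetical stand-in, not
anything from this PR:

```python
#!/usr/bin/env python3
# Sketch of a dynamic topology script: resolve each host's rack by asking a
# central inventory service instead of a static, per-node mapping file.
# INVENTORY_URL is a hypothetical endpoint used only for illustration.
import sys
import urllib.request

INVENTORY_URL = "http://inventory.example.com/rack"

def resolve(host):
    try:
        with urllib.request.urlopen(f"{INVENTORY_URL}?host={host}", timeout=2) as resp:
            rack = resp.read().decode().strip()
            # Rack paths must start with "/"; otherwise fall back to the
            # same default YARN's RackResolver would use.
            return rack if rack.startswith("/") else "/default-rack"
    except Exception:
        return "/default-rack"

if __name__ == "__main__":
    # Hadoop passes hostnames/IPs as arguments; print one rack per line.
    for host in sys.argv[1:]:
        print(resolve(host))
```

Note this is exactly why the script matters beyond Spark: HDFS and anything
else doing rack-aware placement calls through the same mapping.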