Github user juanrh commented on the issue:
https://github.com/apache/spark/pull/19267
Hi Tom, thanks for your answer.
Regarding use cases for the Spark admin command, I think it would be a good
fit for cloud environments, where single-job clusters are common because
creating and destroying clusters is easy. To also cover clusters running many
applications, we could add an option to the admin command that decommissions
the specified nodes for all Spark applications running in the cluster, and
implement it using the configured cluster manager's facilities for discovering
Spark applications. For example, on YARN we could use
[`YarnClient.getApplications`](https://hadoop.apache.org/docs/r2.7.3/api/org/apache/hadoop/yarn/client/api/YarnClient.html#getApplications())
to discover the application master / Spark driver address of every running
Spark application, using
[`ApplicationReport.getApplicationType`](https://hadoop.apache.org/docs/r2.7.3/api/org/apache/hadoop/yarn/api/records/ApplicationReport.html#getApplicationType())
to consider only applications with type `"SPARK"`.
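For illustration, here is a rough Scala sketch of that discovery step, assuming the admin command runs with the cluster's Hadoop configuration on its classpath; the object name and the printed output are just placeholders, not part of any proposed API:
```scala
import java.util.{EnumSet => JEnumSet}
import scala.collection.JavaConverters._

import org.apache.hadoop.yarn.api.records.YarnApplicationState
import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration

// Sketch: list the application master host of every running YARN
// application submitted with application type "SPARK".
object ListRunningSparkApps {
  def main(args: Array[String]): Unit = {
    val yarnClient = YarnClient.createYarnClient()
    yarnClient.init(new YarnConfiguration())
    yarnClient.start()
    try {
      // Restrict the query to running applications of type "SPARK".
      val reports = yarnClient.getApplications(
        Set("SPARK").asJava,
        JEnumSet.of(YarnApplicationState.RUNNING)).asScala
      reports.foreach { report =>
        // report.getHost is the application master host; in cluster deploy
        // mode this is also where the Spark driver runs.
        println(s"${report.getApplicationId} ${report.getName} -> ${report.getHost}")
      }
    } finally {
      yarnClient.stop()
    }
  }
}
```
The admin command could then contact each discovered driver (or application master) to ask it to decommission the given nodes, instead of requiring the user to pass a driver address per application.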
It would be nice not to have to support a temporary command, but an advantage
of doing so is that it lets us get this feature into Spark without having to
wait for it to become available in the cluster manager. Also, for Spark
Standalone mode I don't see any option other than an admin command, along the
lines of existing commands like `./sbin/start-master.sh`, because in that case
Spark also provides the cluster manager.