Github user skonto commented on a diff in the pull request:
https://github.com/apache/spark/pull/20945#discussion_r178952630
--- Diff:
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala
---
@@ -506,6 +506,10 @@ private[spark] class MesosClusterScheduler(
      options ++= Seq("--class", desc.command.mainClass)
    }
+    desc.conf.getOption("spark.mesos.proxyUser").foreach { v =>
+      options ++= Seq("--proxy-user", v)
--- End diff ---
There is a separation. The container launch and how DC/OS manages it are
irrelevant to Spark. From the point that the dispatcher launches the driver
onward, this is Spark's concern. The OS user DC/OS uses to launch the
container is not related to the DC/OS user who submits the job or to the
Hadoop user. The DC/OS user who submits the Spark job in cluster mode owns the
superuser credentials, and these are not leaked. Users here are not like YARN
users.
If I use the DC/OS user who owns the superuser's credentials all the way
inside the container, then effectively I am ok. There is no requirement that
impersonation should be a cluster-wide characteristic. We are not solving the
DC/OS impersonation issue here.
There are limitations under DC/OS, by the way, but this is how it is right
now; see here:
https://docs.mesosphere.com/services/spark/2.3.0-2.2.1-2/limitations.
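
For context, a minimal sketch (not the actual SparkSubmit code; the object and
method names are illustrative) of what the forwarded --proxy-user flag
typically amounts to on the driver side: Hadoop-facing calls run under a proxy
UGI for the target user, while the submitting user's own (superuser)
credentials stay with the submitter.

import java.security.PrivilegedExceptionAction

import org.apache.hadoop.security.UserGroupInformation

// Rough sketch only: run a block of work as a Hadoop proxy user.
// The submitter's credentials back the proxy UGI but are not handed out.
object ProxyUserSketch {
  def runAs[T](proxyUser: String)(body: => T): T = {
    val ugi = UserGroupInformation.createProxyUser(
      proxyUser, UserGroupInformation.getCurrentUser)
    ugi.doAs(new PrivilegedExceptionAction[T] {
      override def run(): T = body
    })
  }
}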
---