[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963559#comment-15963559 ]
Marcelo Vanzin commented on SPARK-16742:
----------------------------------------

bq. But in Spark, this isn't currently derived from the Kerberos principal. It's configured by the user.

That sounds problematic. The way YARN works is that it actually authenticates the user. Are you saying that Mesos doesn't do user authentication?

The overarching point I'm trying to make with my comments is that for Kerberos support to be properly secure, the cluster manager needs to be secure. That means running applications from different users in a way that doesn't allow them to hack each other. YARN does that by doing authentication when users request applications to run, and by running the containers as the requested user.

The exact way in which YARN achieves that seems kinda tangential to the actual question, which is: What is the story for Mesos?

Basically, the way in which you support Kerberos will depend on how your cluster manager does security. If Mesos behaves more like Spark Standalone than it does like YARN, then any solution that requires distributing user credentials is a non-starter, because it just becomes a security liability.

bq. It would be a vulnerability, for example, if the Linux user for the executors is simply derived from that of the driver, because two human users running as the same Linux user, but logged in via different Kerberos principals, would be able to see each others' tokens.

Are you saying that for YARN or Mesos?

When YARN runs in Kerberos mode, Kerberos dictates the user. That's how the user is authenticating to YARN. There's a requirement that an OS user exists matching that particular user, but that's just a configuration detail. The security comes from the fact that the user is authenticating to the KDC.

bq. You're right that we could implement cluster mode in some form, but I'd rather keep the initial PR small. I hope that's acceptable.

The main point I'm trying to convey here is that running things in client and cluster mode should be exactly the same from the point of view of distributing tokens. The use case you mention ("user starting an application in cluster mode with no kerberos credentials") sounds actually worrying, because what's authenticating the user?

> Kerberos support for Spark on Mesos
> -----------------------------------
>
>                 Key: SPARK-16742
>                 URL: https://issues.apache.org/jira/browse/SPARK-16742
>             Project: Spark
>          Issue Type: New Feature
>          Components: Mesos
>            Reporter: Michael Gummelt
>
> We at Mesosphere have written Kerberos support for Spark on Mesos. We'll be
> contributing it to Apache Spark soon.
> Mesosphere design doc:
> https://docs.google.com/document/d/1xyzICg7SIaugCEcB4w1vBWp24UDkyJ1Pyt2jtnREFqc/edit#heading=h.tdnq7wilqrj6
> Mesosphere code:
> https://github.com/mesosphere/spark/commit/73ba2ab8d97510d5475ef9a48c673ce34f7173fa

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org