[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963559#comment-15963559
 ] 

Marcelo Vanzin commented on SPARK-16742:
----------------------------------------

bq. But in Spark, this isn't currently derived from the Kerberos principal. 
It's configured by the user. 

That sounds problematic. The way YARN works is that it actually authenticates 
the user. Are you saying that Mesos doesn't do user authentication?

The overarching point I'm trying to make with my comments is that for Kerberos 
support to be properly secure, the cluster manager itself needs to be secure. 
That means running applications from different users in a way that doesn't 
allow them to compromise each other. YARN does that by authenticating users when they 
request applications to run, and by running the containers as the requested 
user. The exact way in which YARN achieves that seems kinda tangential to the 
actual question, which is:

What is the story for Mesos?

Basically, the way in which you support Kerberos will depend on how your 
cluster manager does security. If Mesos behaves more like Spark Standalone than 
it does like YARN, then any solution that requires distributing user 
credentials is a non-starter, because it just becomes a security liability.

bq. It would be a vulnerability, for example, if the Linux user for the 
executors is simply derived from that of the driver, because two human users 
running as the same Linux user, but logged in via different Kerberos 
principals, would be able to see each other's tokens.

Are you saying that about YARN or Mesos? When YARN runs in Kerberos mode, 
Kerberos dictates the user. That's how the user authenticates to YARN. 
There's a requirement that an OS user exists matching that particular user, but 
that's just a configuration detail. The security comes from the fact that the 
user is authenticating to the KDC.

bq. You're right that we could implement cluster mode in some form, but I'd 
rather keep the initial PR small. I hope that's acceptable.

The main point I'm trying to convey here is that running things in client and 
cluster mode should be exactly the same from the point of view of distributing 
tokens. The use case you mention ("user starting an application in cluster mode 
with no Kerberos credentials") actually sounds worrying, because what's 
authenticating the user?

> Kerberos support for Spark on Mesos
> -----------------------------------
>
>                 Key: SPARK-16742
>                 URL: https://issues.apache.org/jira/browse/SPARK-16742
>             Project: Spark
>          Issue Type: New Feature
>          Components: Mesos
>            Reporter: Michael Gummelt
>
> We at Mesosphere have written Kerberos support for Spark on Mesos.  We'll be 
> contributing it to Apache Spark soon.
> Mesosphere design doc: 
> https://docs.google.com/document/d/1xyzICg7SIaugCEcB4w1vBWp24UDkyJ1Pyt2jtnREFqc/edit#heading=h.tdnq7wilqrj6
> Mesosphere code: 
> https://github.com/mesosphere/spark/commit/73ba2ab8d97510d5475ef9a48c673ce34f7173fa


