[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-08-25 Thread Arthur Rand (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16142337#comment-16142337
 ] 

Arthur Rand commented on SPARK-16742:
-

Gotcha, https://issues.apache.org/jira/browse/SPARK-21842 tracks that work. 

> Kerberos support for Spark on Mesos
> ---
>
> Key: SPARK-16742
> URL: https://issues.apache.org/jira/browse/SPARK-16742
> Project: Spark
>  Issue Type: New Feature
>  Components: Mesos
>Reporter: Michael Gummelt
>Assignee: Arthur Rand
> Fix For: 2.3.0
>
>
> We at Mesosphere have written Kerberos support for Spark on Mesos.  We'll be 
> contributing it to Apache Spark soon.
> Mesosphere design doc: 
> https://docs.google.com/document/d/1xyzICg7SIaugCEcB4w1vBWp24UDkyJ1Pyt2jtnREFqc/edit#heading=h.tdnq7wilqrj6
> Mesosphere code: 
> https://github.com/mesosphere/spark/commit/73ba2ab8d97510d5475ef9a48c673ce34f7173fa



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-08-24 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16140293#comment-16140293
 ] 

Marcelo Vanzin commented on SPARK-16742:


Both renewal and creating new tickets after the TTL (those are different 
things).




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-08-24 Thread Arthur Rand (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16140111#comment-16140111
 ] 

Arthur Rand commented on SPARK-16742:
-

Hello [~vanzin], I'm assuming you're talking about automatic ticket renewal, 
correct? I was just starting to look into that w.r.t. Mesos; I'll create a 
ticket. 




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-07-30 Thread Arthur Rand (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106501#comment-16106501
 ] 

Arthur Rand commented on SPARK-16742:
-

Hello [~vanzin], I addressed the comments for the second PR 
(https://github.com/apache/spark/pull/18519). It is ready for final review. 




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-07-03 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072970#comment-16072970
 ] 

Apache Spark commented on SPARK-16742:
--

User 'mgummelt' has created a pull request for this issue:
https://github.com/apache/spark/pull/18519




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-17 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15971897#comment-15971897
 ] 

Apache Spark commented on SPARK-16742:
--

User 'mgummelt' has created a pull request for this issue:
https://github.com/apache/spark/pull/17665




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-14 Thread Michael Gummelt (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969341#comment-15969341
 ] 

Michael Gummelt commented on SPARK-16742:
-

[~jerryshao] No, but you can look at our solution here: 
https://github.com/mesosphere/spark/commit/0a2cc4248039ca989e177e96e92a594a025661fe#diff-79391110e9f26657e415aa169a004998R129




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-13 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15968545#comment-15968545
 ] 

Saisai Shao commented on SPARK-16742:
-

[~mgummelt], do you have a design doc for the Kerberos support for Spark on 
Mesos, so that my work on SPARK-19143 could be based on yours?




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Michael Gummelt (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963601#comment-15963601
 ] 

Michael Gummelt commented on SPARK-16742:
-

bq. So, assuming that Mesos is configured properly, then it should be OK for 
Spark code to distribute user credentials.

Right.  It's just a matter of the cluster admin syncing Mesos credentials and 
kerberos credentials properly.  In summary, it's simpler in YARN because YARN 
is Kerberos-aware, whereas Mesos isn't.

bq. That sounds like you might need the current code that distributes keytabs 
and logs in the cluster to make even client mode work in this setup.

Since client mode requires network access to the Mesos master, we generally 
assume that the user is on the same network as their datacenter, and can thus 
kinit against the KDC.





[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963592#comment-15963592
 ] 

Marcelo Vanzin commented on SPARK-16742:


bq. It authenticates the Mesos principal, and this principal is allowed to 
launch processes only as certain Linux users. It's up to the cluster admin to 
set up this mapping appropriately.

Ok, that sounds similar then. Basically, you *can* set up Mesos so that it can 
do this securely, which was my initial question. (Being able to set things up 
in an insecure way is not actually that interesting; I just wanted to make sure 
there *is* a way to set things up securely.)

So, assuming that Mesos is configured properly, then it should be OK for Spark 
code to distribute user credentials.

bq. I actually said a "user might not be kinit'd". They may, however, have 
access to the keytab.

That sounds like you might need the current code that distributes keytabs and 
logs in the cluster to make even client mode work in this setup.




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Michael Gummelt (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963583#comment-15963583
 ] 

Michael Gummelt commented on SPARK-16742:
-

bq. That sounds problematic. The way YARN works is that it actually 
authenticates the user. Are you saying that Mesos doesn't do user 
authentication?

AFAICT, YARN doesn't authenticate the Linux user.  The KDC authenticates the 
kerberos principal, and YARN maps this principal to a Linux user via 
{{hadoop.security.auth_to_local}}.  So if a user authenticated to the KDC via a 
principal "Joe", and the {{auth_to_local}} rule maps "Joe" to "root", then 
"Joe" can launch processes as "root", even though he never provided "root" 
credentials.  It's up to the cluster administrator to properly set up this 
Kerberos -> Linux mapping.
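
To make the mapping concrete, here is a minimal sketch in Python.  It is a 
simplified model only, not Hadoop's actual {{auth_to_local}} rule syntax or 
implementation; the function name map_principal and the overrides table are 
hypothetical names made up for this illustration.

```python
def map_principal(principal, overrides=None):
    """Map a Kerberos principal such as 'joe@EXAMPLE.COM' to a local Linux user.

    Simplified model: an explicit override wins; otherwise the short name
    (the part before any '/' or '@') is used as the Linux user.
    """
    overrides = overrides or {}
    if principal in overrides:
        return overrides[principal]
    return principal.split("@", 1)[0].split("/", 1)[0]


# Default behaviour: the principal maps to the Linux user of the same name.
default_user = map_principal("joe@EXAMPLE.COM")

# A careless rule: "joe" now launches processes as "root" even though he
# never presented root credentials.
risky_user = map_principal("joe@EXAMPLE.COM", {"joe@EXAMPLE.COM": "root"})
```

The point of the sketch is that the Linux user is decided by the mapping 
table, which the administrator owns, and not by Kerberos alone.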

It's a similar story with Mesos.  Mesos doesn't authenticate the Linux user.  
It authenticates the Mesos principal, and this principal is allowed to launch 
processes only as certain Linux users.  It's up to the cluster admin to set up 
this mapping appropriately.

The big difference is that, by default, YARN will map the kerberos principal to 
the linux user with the same name, so there's no problem.  Mesos, in contrast, 
will allow the driver to launch executors as any Linux user that its Mesos 
principal is allowed to launch processes as.  So it's up to the admin to give 
each user consistent Mesos and Kerberos credentials.
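
As a rough illustration of what "consistent" means here, the Python sketch 
below audits a set of users.  Everything in it is assumed for the example: 
mesos_acl, kerberos_to_linux and identities are hypothetical in-memory tables, 
not a real Mesos ACL format or a Hadoop API.

```python
def inconsistent_users(mesos_acl, kerberos_to_linux, identities):
    """Find users whose Mesos principal may launch processes as a Linux user
    other than the one their Kerberos principal maps to.

    identities:        {human_user: (mesos_principal, kerberos_principal)}
    mesos_acl:         {mesos_principal: set of Linux users it may launch as}
    kerberos_to_linux: {kerberos_principal: Linux user}
    """
    problems = {}
    for human, (mesos_principal, kerberos_principal) in identities.items():
        allowed = mesos_acl.get(mesos_principal, set())
        expected = kerberos_to_linux.get(kerberos_principal)
        extra = allowed - {expected}
        if extra:
            problems[human] = sorted(extra)
    return problems


identities = {
    "alice": ("alice-mesos", "alice@EXAMPLE.COM"),
    "bob": ("bob-mesos", "bob@EXAMPLE.COM"),
}
mesos_acl = {"alice-mesos": {"alice"}, "bob-mesos": {"bob", "alice"}}
kerberos_to_linux = {"alice@EXAMPLE.COM": "alice", "bob@EXAMPLE.COM": "bob"}

# Bob's Mesos principal can also launch as "alice", so his executors could
# run next to Alice's tokens: that is the inconsistency to look for.
report = inconsistent_users(mesos_acl, kerberos_to_linux, identities)
```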

bq. Are you saying that for YARN or Mesos? When YARN runs in Kerberos mode, 
Kerberos dictates the user.

I'm talking about YARN.  See the above comment.  If {{auth_to_local}} is used 
like I think it is, then that's what ultimately determines the Linux user, not 
just Kerberos.

bq.  The use case you mention ("user starting an application in cluster mode 
with no kerberos credentials") sounds actually worrying

I actually said a "user might not be kinit'd".  They may, however, have access 
to the keytab.  But since they're not on the same network as the KDC, they 
can't authenticate directly.  But they do have the creds.





[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963559#comment-15963559
 ] 

Marcelo Vanzin commented on SPARK-16742:


bq. But in Spark, this isn't currently derived from the Kerberos principal. 
It's configured by the user. 

That sounds problematic. The way YARN works is that it actually authenticates 
the user. Are you saying that Mesos doesn't do user authentication?

The overarching point I'm trying to make with my comments is that for kerberos 
support to be properly secure, the cluster manager needs to be secure. That 
means running applications from different users in a way that doesn't allow 
them to hack each other. YARN does that by doing authentication when users 
request applications to run, and by running the containers as the requested 
user. The exact way in which YARN achieves that seems kinda tangential to the 
actual question, which is:

What is the story for Mesos?

Basically, the way in which you support Kerberos will depend on how your 
cluster manager does security. If Mesos behaves more like Spark Standalone than 
it does like YARN, then any solution that requires distributing user 
credentials is a non-starter, because it just becomes a security liability.

bq. It would be a vulnerability, for example, if the Linux user for the 
executors is simply derived from that of the driver, because two human users 
running as the same Linux user, but logged in via different Kerberos 
principals, would be able to see each others' tokens.

Are you saying that for YARN or Mesos? When YARN runs in Kerberos mode, 
Kerberos dictates the user. That's how the user is authenticating to YARN. 
There's a requirement that an OS user exists matching that particular user, but 
that's just a configuration detail. The security comes from the fact that the 
user is authenticating to the KDC.

bq. You're right that we could implement cluster mode in some form, but I'd 
rather keep the initial PR small. I hope that's acceptable.

The main point I'm trying to convey here is that running things in client and 
cluster mode should be exactly the same from the point of view of distributing 
tokens. The use case you mention ("user starting an application in cluster mode 
with no kerberos credentials") sounds actually worrying, because what's 
authenticating the user?




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Michael Gummelt (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963469#comment-15963469
 ] 

Michael Gummelt commented on SPARK-16742:
-

[~jerryshao] Great! The current RPC used in Mesos is very simple.  The executor 
just periodically requests the latest credentials from the driver, which uses 
the keytab to periodically renew.  We can swap in a different mechanism once 
that exists.
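
A toy model of that polling scheme, in Python, might look like the sketch 
below.  It is not the Spark or Mesosphere code: the Driver and Executor 
classes, the version counter, and the direct method call standing in for the 
RPC are all assumptions made for illustration.

```python
import itertools


class Driver:
    """Holds the current credentials and refreshes them from the keytab."""

    def __init__(self):
        self._versions = itertools.count(1)
        self.credentials = None
        self.renew()

    def renew(self):
        # Stand-in for logging in with the keytab and obtaining fresh tokens.
        self.credentials = {"version": next(self._versions)}

    def latest_credentials(self):
        # Stand-in for the RPC endpoint that executors call.
        return dict(self.credentials)


class Executor:
    def __init__(self, driver):
        self.driver = driver
        self.credentials = None

    def poll(self):
        """One polling step: fetch and adopt the driver's latest credentials."""
        self.credentials = self.driver.latest_credentials()
        return self.credentials["version"]


driver = Driver()
executor = Executor(driver)
first = executor.poll()    # executor picks up the initial credentials
driver.renew()             # driver renews on its own schedule
second = executor.poll()   # the next poll sees the renewed credentials
```

In this model the credentials only ever live in process memory on both sides, 
which is why a different transport can be swapped in later without changing 
the executor's view.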

I left a comment on your design doc.




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Michael Gummelt (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963446#comment-15963446
 ] 

Michael Gummelt commented on SPARK-16742:
-

bq. The most basic feature needed for any kerberos-related work is user 
isolation (different users cannot mess with each others' processes). I was 
under the impression that Mesos supported that.

Mesos of course supports configuring the Linux user that process runs as.  But 
in Spark, this isn't currently derived from the Kerberos principal.  It's 
configured by the user, and the *Mesos* principal of the scheduler, along with 
ACLs configured in Mesos, is what determines which Linux users are allowed.  
That's why I was asking about {{hadoop.security.auth_to_local}}, to understand 
how YARN determines what Linux user to run executors as.  It would be a 
vulnerability, for example, if the Linux user for the executors is simply 
derived from that of the driver, because two human users running as the same 
Linux user, but logged in via different Kerberos principals, would be able to 
see each others' tokens.

bq. I don't know where this notion that cluster mode requires you to distribute 
keytabs comes from

As you said, it's mostly the renewal use case that requires distributing the 
keytab, but that's not all.  In many Mesos setups, and certainly in DC/OS, the 
submitting user might not already be kinit'd.  They may be running from outside 
the datacenter entirely, without network access to the KDC.




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963136#comment-15963136
 ] 

Marcelo Vanzin commented on SPARK-16742:


bq. The problem is then that a kerberos-authenticated user submitting their job 
would be unaware that their credentials are being leaked to other users.

That's the gist of it, yes. But note that it isn't restricted to files. If all 
the user processes are running as the same user, one can just dump the other's 
heap, or connect using JVMTI, and get the credentials. Same problem.

The most basic feature needed for any kerberos-related work is user isolation 
(different users cannot mess with each others' processes). I was under the 
impression that Mesos supported that.

bq. I'm assuming that hadoop.security.auth_to_local is what maps the Kerberos 
user to the Unix user...

I'm not exactly familiar with all the YARN settings but yes, the result you get 
is that the submitting user runs YARN containers as their own user (not as some 
generic, shared user). Without that, you shouldn't even bother thinking about 
inserting Kerberos in the picture, IMO.

bq. We avoid the shared-file problem for keytabs entirely

See my first comment above, that's not enough.

bq. We're probably going to punt on cluster mode for now

You don't need to punt on cluster mode. I don't know where this notion that 
cluster mode requires you to distribute keytabs comes from; Spark works just 
fine in YARN cluster mode without distributing keytabs. All you need to 
distribute are delegation tokens. Keytabs aren't even necessary to log in and 
submit the app at all (you can use passwords with kinit, after all).

The only thing distributing keytabs buys you is running applications for longer 
than the delegation tokens' max lifetime (normally 7 days by default).
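
The difference between renewing a token and needing a new one can be sketched 
in Python as follows.  This is a simplified model, not Hadoop's 
implementation: the DelegationToken class is hypothetical, and the 24-hour 
renew interval is an assumed value chosen for the example (only the 7-day max 
lifetime comes from the comment above).

```python
DAY = 24 * 60 * 60  # seconds


class DelegationToken:
    def __init__(self, issued_at, renew_interval=DAY, max_lifetime=7 * DAY):
        self.renew_interval = renew_interval
        self.max_expiry = issued_at + max_lifetime
        self.expiry = min(issued_at + renew_interval, self.max_expiry)

    def renew(self, now):
        """Extend the expiry, but never past the max lifetime.

        Returns False once the token has expired; at that point only a
        brand-new token helps, which needs a fresh login (e.g. via a keytab).
        """
        if now >= self.expiry:
            return False
        self.expiry = min(now + self.renew_interval, self.max_expiry)
        return True


# Renew every 12 hours until renewal stops working.
token = DelegationToken(issued_at=0)
now = 0
renewals = 0
while token.renew(now + DAY // 2):
    now += DAY // 2
    renewals += 1
```

Renewal keeps working for a while, but the expiry is capped at the max 
lifetime, so a job that outlives it has to obtain new tokens, and that is the 
only case where having the keytab in the cluster is needed.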

bq. If you see any blockers

Lack of user isolation is always a blocker; without that there's no way to 
prevent one user from seeing another's credentials. But I've asked this in the 
past and the answer I got is that Mesos supports it...




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-09 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962455#comment-15962455
 ] 

Saisai Shao commented on SPARK-16742:
-

Hi [~mgummelt], I'm working on the design of SPARK-19143. Looking at your 
comments, I think part of our work overlaps, especially the RPC part that 
propagates credentials. Here is my current WIP design 
(https://docs.google.com/document/d/1Y8CY3XViViTYiIQO9ySoid0t9q3H163fmroCV1K3NTk/edit?usp=sharing).
 In my current design I offer a standard RPC solution to support different 
cluster managers.

It would be great if we could collaborate toward the same goal. My main concern 
is that if Mesos's implementation is quite different from YARN's, it will take 
more effort to align the different cluster managers. If your proposal is 
similar to what I proposed here, then my work can be based on yours.




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-09 Thread Michael Gummelt (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962450#comment-15962450
 ] 

Michael Gummelt commented on SPARK-16742:
-

Also, note that the above Mesos implementation is not dependent on Mesos in any 
way.  It just uses Spark's existing RPC mechanisms to transmit delegation 
tokens.  I see that there's a related effort here to standardize this RPC 
mechanism: https://issues.apache.org/jira/browse/SPARK-19143.  We'd be more 
than happy to adopt that standard once it exists.  But hopefully our one-off 
RPC that we're currently using is acceptable in the interim.




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-09 Thread Michael Gummelt (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962440#comment-15962440
 ] 

Michael Gummelt commented on SPARK-16742:
-

Hi [~vanzin],

[~ganger85] and Strat.io are pulling back their Mesos Kerberos implementation 
for now, and we at Mesosphere are about to submit a PR to upstream our 
implementation.  I have a few questions I'd like to run by you to make sure 
that PR goes smoothly.

1) I've been following your comments on this Spark Standalone Kerberos PR: 
https://github.com/apache/spark/pull/17530.  It looks like your concern is that 
in *cluster mode*, the keytab is written to a file on the host running the 
driver, and is owned by the user of the Spark Worker, which will be the same 
for each job.  So jobs submitted by multiple users will be able to read each 
other's keytabs.  In *client mode*, it looks like the delegation tokens are 
written to a file (HADOOP_TOKEN_FILE_LOCATION) on the host running the 
executor, which suffers from the same problem as the keytab in cluster mode.

The problem is then that a kerberos-authenticated user submitting their job 
would be unaware that their credentials are being leaked to other users.  Is 
this an accurate description of the issue?  
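
For readers following along, the shared-file problem can be sketched in Python 
like this.  It is a deliberately simplified permission check (it ignores 
groups, ACLs and root), and the can_read helper and the user names are 
hypothetical.

```python
import stat


def can_read(file_owner, file_mode, process_user):
    """Simplified POSIX read check: ignores groups, ACLs and root."""
    if process_user == file_owner:
        return bool(file_mode & stat.S_IRUSR)
    return bool(file_mode & stat.S_IROTH)


# Standalone-style setup: every job runs as the worker's Linux user "spark",
# so Alice's keytab is written as a file owned by "spark" with mode 0600.
keytab_owner, keytab_mode = "spark", 0o600
bob_job_user = "spark"
leak = can_read(keytab_owner, keytab_mode, bob_job_user)

# With per-user isolation the same mode keeps the file private.
isolated = can_read("alice", 0o600, "bob")
```

Restrictive file modes do not help when both jobs run as the same Linux user, 
which is why the question is really about which user the cluster manager runs 
each job as.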

2) I understand that YARN writes delegation tokens via 
{{amContainer.setTokens()}}, which ultimately results in the delegation token 
being written to a file owned by the submitting user.  However, since the 
"submitting user" is a Kerberos user, not a Unix user, I'm assuming that 
{{hadoop.security.auth_to_local}} is what maps the Kerberos user to the Unix 
user who runs the ApplicationMaster and owns that file.  Is that correct?

To avoid the shared-file problem for delegation tokens, our Mesos 
implementation currently has the Executor issue an RPC call to fetch the 
delegation token from the driver.  There is therefore no need for at-rest 
encryption, and if in-motion encryption is in the user's threat model, they can 
run Spark with SSL.

We avoid the shared-file problem for keytabs entirely, because there's no need 
to distribute the keytab, at least in client mode.  Unlike YARN, the driver and 
the equivalent of the "ApplicationMaster" in Mesos are one and the same.  They 
both exist in the same process, the {{spark-submit}} process.

We're probably going to punt on cluster mode for now, just for simplicity, but 
we should be able to solve this in cluster mode as well, because unlike 
standalone, and much like YARN, Mesos controls what user the driver runs as.

What do you think of the above approach?  If you see any blockers, I would very 
much appreciate teasing those out now rather than during the PR.  Thanks!




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-02-15 Thread Abel Rincón (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867916#comment-15867916
 ] 

Abel Rincón commented on SPARK-16742:
-

Hi all, we recently pushed our new implementation, and you can take a look at 
the code in the PR.
I'm writing a short doc to explain the solution.

In the meantime, some highlights:

The standard principal and keytab args are supported, and a proxy user can be 
used on top of the real principal with the --proxy-user arg.
The driver side uses Kerberos authentication.
The DAGScheduler obtains the relevant Hadoop delegation tokens using Kerberos 
authentication and creates each task with those tokens.
The executor side uses Hadoop token authentication.
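
A rough sketch of the flow in the highlights above, under the assumption that 
tokens are opaque bytes keyed by service name.  The Task type and the three 
functions are hypothetical and are not taken from the PR:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Task:
    task_id: int
    tokens: Dict[str, bytes]  # service name -> opaque token bytes


def obtain_tokens(principal: str, services: List[str]) -> Dict[str, bytes]:
    """Driver side: stand-in for a Kerberos-authenticated token request."""
    return {s: f"token-for-{principal}-on-{s}".encode() for s in services}


def create_tasks(count: int, tokens: Dict[str, bytes]) -> List[Task]:
    """Scheduler side: every task is created with the same set of tokens."""
    return [Task(task_id=i, tokens=tokens) for i in range(count)]


def run_task(task: Task, service: str) -> str:
    """Executor side: authenticate with the task's token, never with a keytab."""
    if service not in task.tokens:
        raise PermissionError(f"task {task.task_id} has no token for {service}")
    return f"task {task.task_id} accessed {service}"


tokens = obtain_tokens("alice@EXAMPLE.COM", ["hdfs"])
for task in create_tasks(2, tokens):
    print(run_task(task, "hdfs"))
```

Only the driver ever holds the Kerberos credentials; executors see tokens 
scoped to the services the job needs.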




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-02-14 Thread Abel Rincón (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867374#comment-15867374
 ] 

Abel Rincón commented on SPARK-16742:
-

Hi all, we are working on a solution that uses Hadoop delegation tokens and no 
proxy users. I hope you will be able to take a look at the new code today.




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-02-13 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15863371#comment-15863371
 ] 

Saisai Shao commented on SPARK-16742:
-

The proposed solution is quite different from what exists in Spark on YARN. 
IIUC, this solution doesn't use delegation tokens and instead wraps every HDFS 
operation with {{executeSecure}}. I suspect this approach requires other 
components, such as SQL and Streaming, to also know about these APIs and wrap 
their own calls. Also, if newly added code omits this wrapper, it will lead to 
errors. From my understanding, it is quite intrusive.
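
A small sketch of the concern, assuming a hypothetical execute_secure stand-in 
that grants credentials only for the duration of one call.  It is not the 
PR's {{executeSecure}}, and read_hdfs is a fake operation used only to show 
the failure mode:

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class ProcessCredentials:
    """Stand-in for the credentials visible to code running in one process."""

    def __init__(self) -> None:
        self.has_credentials = False


def read_hdfs(creds: ProcessCredentials, path: str) -> str:
    """Fake HDFS read that fails when no credentials are in scope."""
    if not creds.has_credentials:
        raise PermissionError(f"unauthenticated access to {path}")
    return f"contents of {path}"


def execute_secure(creds: ProcessCredentials, action: Callable[[], T]) -> T:
    """Hypothetical wrapper: credentials exist only while the action runs."""
    creds.has_credentials = True
    try:
        return action()
    finally:
        creds.has_credentials = False


wrapped = ProcessCredentials()
print(execute_secure(wrapped, lambda: read_hdfs(wrapped, "/data/a")))

# Code added later in another component that forgets the wrapper fails.
try:
    read_hdfs(wrapped, "/data/b")
except PermissionError as err:
    print("failed:", err)

# Token style: credentials are installed once, and every call site sees them.
token_style = ProcessCredentials()
token_style.has_credentials = True
print(read_hdfs(token_style, "/data/b"))
```

With the wrapper, correctness depends on every call site; with credentials 
installed once per process, new code needs no knowledge of the mechanism.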




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-02-03 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851539#comment-15851539
 ] 

Apache Spark commented on SPARK-16742:
--

User 'arinconstrio' has created a pull request for this issue:
https://github.com/apache/spark/pull/16788




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-01-12 Thread Jorge Lopez-Malla (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820530#comment-15820530
 ] 

Jorge Lopez-Malla commented on SPARK-16742:
---

At Stratio we had a very busy end of the year releasing our product, and we 
are now resuming development.

In fact, if anyone is going to Spark Summit East, Abel and I will be giving a 
talk about Stratio's Kerberos integration solution for Spark.




[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-01-10 Thread Mohammad Kamrul Islam (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15816742#comment-15816742
 ] 

Mohammad Kamrul Islam commented on SPARK-16742:
---

Any update on this effort?





[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2016-09-14 Thread Abel Rincón (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490657#comment-15490657
 ] 

Abel Rincón commented on SPARK-16742:
-

We at Stratio are working on this issue.

Stratio design doc:
https://docs.google.com/document/d/1h9UvLCQ5e6s8L9jqRAuPowJAom_We1f5LFSPvrimDqM/edit?usp=sharing

