[ 
https://issues.apache.org/jira/browse/SPARK-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wendell updated SPARK-5158:
-----------------------------------
    Description: 
There have been a handful of patches for allowing access to Kerberized HDFS 
clusters in standalone mode. The main reason we haven't accepted these patches 
has been that they rely on insecure distribution of token files from the 
driver to the other components.

As a simpler solution, I wonder if we should just provide a way to have the 
Spark driver and executors independently log in and acquire credentials using a 
keytab. This would work for users who have dedicated, single-tenant Spark 
clusters (i.e. they are willing to have a keytab on every machine running Spark 
for their application). It wouldn't address all possible deployment scenarios, 
but if it's simple I think it's worth considering.

This would also work for Spark Streaming jobs, which often run on dedicated 
hardware since they are long-running services.

  was:
There have been a handful of patches for allowing access to Kerberized HDFS 
clusters in standalone mode. The main reason we haven't accepted these patches 
have been that they rely on insecure distribution of token files from the 
driver to the other components.

As a simpler solution, I wonder if we should just provide a way to have the 
Spark driver and executors independently log in and acquire credentials using a 
keytab. This would work for users who are build dedicated, single-tenant, Spark 
clusters (i.e. they are willing to have a keytab on every machine running Spark 
for their application). It wouldn't address all possible deployment scenarios, 
but if it's simple I think it's worth considering.

This would also work for Spark streaming jobs, which often run on dedicated 
hardware since they are long-running services.


> Allow for keytab-based HDFS security in Standalone mode
> -------------------------------------------------------
>
>                 Key: SPARK-5158
>                 URL: https://issues.apache.org/jira/browse/SPARK-5158
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>            Reporter: Patrick Wendell
>            Assignee: Matthew Cheah
>            Priority: Critical
>
> There have been a handful of patches for allowing access to Kerberized HDFS 
> clusters in standalone mode. The main reason we haven't accepted these 
> patches has been that they rely on insecure distribution of token files from 
> the driver to the other components.
> As a simpler solution, I wonder if we should just provide a way to have the 
> Spark driver and executors independently log in and acquire credentials using 
> a keytab. This would work for users who have dedicated, single-tenant 
> Spark clusters (i.e. they are willing to have a keytab on every machine 
> running Spark for their application). It wouldn't address all possible 
> deployment scenarios, but if it's simple I think it's worth considering.
> This would also work for Spark Streaming jobs, which often run on dedicated 
> hardware since they are long-running services.



