Github user liyinan926 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21669#discussion_r223433719
--- Diff: docs/security.md ---
@@ -722,7 +722,84 @@ with encryption, at least.
The Kerberos login will be periodically renewed using the provided credentials, and new delegation
tokens for supported services will be created.
+## Secure Interaction with Kubernetes
+
+When talking to Hadoop-based services behind Kerberos, Spark needs to obtain delegation tokens
+so that non-local processes can authenticate. In Kubernetes, these delegation tokens are stored in Secrets that are
+shared by the Driver and its Executors. As such, there are three ways of submitting a Kerberos job:
+
+In all cases you must define the environment variable `HADOOP_CONF_DIR`, as well as either
+`spark.kubernetes.kerberos.krb5.location` or `spark.kubernetes.kerberos.krb5.configMapName`.
+
+It is also important to note that the KDC needs to be visible from inside the containers if the user uses a local
+krb5 file.
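+For example, KDC visibility can be sanity-checked from inside a running pod (the pod name and KDC host below are
+placeholders, and this assumes `nc` is available in the container image; 88 is the standard Kerberos port):
+```bash
+# Verify that the KDC is reachable from inside the Driver container
+kubectl exec -it <driver-pod-name> -- nc -zv <kdc-host> 88
+```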
+
+If a user wishes to use a remote HADOOP_CONF directory containing the Hadoop configuration files, this can be
+achieved by mounting a pre-defined ConfigMap in the desired location and pointing to it via the appropriate configs.
+This method is useful for those who do not wish to rebuild their Docker images, but would rather point to a ConfigMap
+that they can modify. This strategy is supported via the pod-template feature.
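+As a sketch, such a ConfigMap could be created from an existing configuration directory (the ConfigMap name
+`hadoop-conf` is an arbitrary example):
+```bash
+# Bundle the local Hadoop configuration files into a ConfigMap
+# that a pod template can then mount at the desired location.
+kubectl create configmap hadoop-conf --from-file=$HADOOP_CONF_DIR
+```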
+
+1. Submitting with `kinit`, which stores a TGT in the local ticket cache:
+```bash
+/usr/bin/kinit -kt <keytab_file> <username>/<krb5 realm>
+/opt/spark/bin/spark-submit \
+ --deploy-mode cluster \
+ --class org.apache.spark.examples.HdfsTest \
+ --master k8s://<KUBERNETES_MASTER_ENDPOINT> \
+ --conf spark.executor.instances=1 \
+ --conf spark.app.name=spark-hdfs \
+ --conf spark.kubernetes.container.image=spark:latest \
+ --conf spark.kubernetes.kerberos.krb5.location=/etc/krb5.conf \
+ local:///opt/spark/examples/jars/spark-examples_<VERSION>-SNAPSHOT.jar \
+ <HDFS_FILE_LOCATION>
+```
+2. Submitting with a local keytab and principal:
+```bash
+/opt/spark/bin/spark-submit \
+ --deploy-mode cluster \
+ --class org.apache.spark.examples.HdfsTest \
+ --master k8s://<KUBERNETES_MASTER_ENDPOINT> \
+ --conf spark.executor.instances=1 \
+ --conf spark.app.name=spark-hdfs \
+ --conf spark.kubernetes.container.image=spark:latest \
+ --conf spark.kerberos.keytab=<KEYTAB_FILE> \
+ --conf spark.kerberos.principal=<PRINCIPAL> \
+ --conf spark.kubernetes.kerberos.krb5.location=/etc/krb5.conf \
+ local:///opt/spark/examples/jars/spark-examples_<VERSION>-SNAPSHOT.jar \
+ <HDFS_FILE_LOCATION>
+```
+3. Submitting with pre-populated Secrets containing the delegation token, already existing within the namespace:
+```bash
+/opt/spark/bin/spark-submit \
+ --deploy-mode cluster \
+ --class org.apache.spark.examples.HdfsTest \
+ --master k8s://<KUBERNETES_MASTER_ENDPOINT> \
+ --conf spark.executor.instances=1 \
+ --conf spark.app.name=spark-hdfs \
+ --conf spark.kubernetes.container.image=spark:latest \
+ --conf spark.kubernetes.kerberos.tokenSecret.name=<SECRET_TOKEN_NAME> \
+ --conf spark.kubernetes.kerberos.tokenSecret.itemKey=<SECRET_ITEM_KEY> \
+ --conf spark.kubernetes.kerberos.krb5.location=/etc/krb5.conf \
+ local:///opt/spark/examples/jars/spark-examples_<VERSION>-SNAPSHOT.jar \
+ <HDFS_FILE_LOCATION>
+```
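+Such a Secret could, for example, be pre-populated from a delegation token file obtained out of band (the token file
+path below is a placeholder; the Secret and item-key names must match the `tokenSecret.*` configs above):
+```bash
+# Store an existing delegation token file in a Secret within the namespace;
+# the key name becomes the item key referenced by tokenSecret.itemKey.
+kubectl create secret generic <SECRET_TOKEN_NAME> \
+  --from-file=<SECRET_ITEM_KEY>=/path/to/delegation-token
+```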
+
+3b. Submitting as in (3), but specifying a pre-created krb5 config map:
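+A sketch of this variant, assuming a krb5 ConfigMap has first been created in the namespace (the name `krb5-conf` is
+an arbitrary example):
+```bash
+# Create the krb5 ConfigMap once, then reference it via
+# spark.kubernetes.kerberos.krb5.configMapName instead of krb5.location.
+kubectl create configmap krb5-conf --from-file=krb5.conf
+/opt/spark/bin/spark-submit \
+ --deploy-mode cluster \
+ --class org.apache.spark.examples.HdfsTest \
+ --master k8s://<KUBERNETES_MASTER_ENDPOINT> \
+ --conf spark.executor.instances=1 \
+ --conf spark.app.name=spark-hdfs \
+ --conf spark.kubernetes.container.image=spark:latest \
+ --conf spark.kubernetes.kerberos.tokenSecret.name=<SECRET_TOKEN_NAME> \
+ --conf spark.kubernetes.kerberos.tokenSecret.itemKey=<SECRET_ITEM_KEY> \
+ --conf spark.kubernetes.kerberos.krb5.configMapName=krb5-conf \
+ local:///opt/spark/examples/jars/spark-examples_<VERSION>-SNAPSHOT.jar \
+ <HDFS_FILE_LOCATION>
+```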
--- End diff --
Nit: `config map` -> `ConfigMap`.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]