GitHub user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22911#discussion_r231256397
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala ---
    @@ -123,7 +126,11 @@ private[spark] class KubernetesClusterSchedulerBackend(
       }
     
       override def createDriverEndpoint(properties: Seq[(String, String)]): DriverEndpoint = {
    -    new KubernetesDriverEndpoint(rpcEnv, properties)
    +    new KubernetesDriverEndpoint(sc.env.rpcEnv, properties)
    +  }
    +
    +  override protected def createTokenManager(): Option[HadoopDelegationTokenManager] = {
    +    Some(new HadoopDelegationTokenManager(conf, sc.hadoopConfiguration))
    --- End diff --
    
    I'm not sure I follow your train of thought here so I'll comment on what I 
understand.
    
    First, the code that creates the secret is in 
`KerberosConfDriverFeatureStep`. As far as I know, that class is not used in 
client mode. In client mode the keytab stays on the client machine, with the 
driver, and the driver just sends delegation tokens (DTs) to executors. So the 
whole discussion about secrets is irrelevant in that case.
    
    In cluster mode, you need the driver to have access to the keytab for this 
feature to work. There are a few ways to achieve that:
    
    - the current YARN approach, where the keytab lives on the submission host 
and is distributed with the application. In k8s this would amount to what I 
have here: the submission code creates a secret for the driver pod and stashes 
the keytab in it.
    
    - add the ability to store the keytab in an external place (like HDFS or an 
HTTP server). That has drawbacks (e.g. people probably wouldn't like keeping 
keytabs there, and there's a chicken & egg problem with HDFS, since you'd still 
need a Kerberos TGT to bootstrap things).
    
    - add a k8s-specific feature of mounting a pre-defined secret in the driver 
pod. I believe this is what you're suggesting?
    
    I think supporting the first is easy as this change shows, and keeps 
feature parity with what's already supported in YARN. Unless there's a glaring 
issue with using secrets that I'm not aware of, I don't see a reason for not 
doing it.
    
    The third option (pre-defined secret) could also be added. My hope is that 
you could do it with pre-existing configs (`spark.kubernetes.driver.secrets.` & 
company), but I don't know how you'd set the `spark.kerberos.keytab` and 
`spark.kerberos.principal` configs only in the driver and not in the 
submission client. So it seems we'd need at least a little bit of code here to 
support that scenario.
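
    To make the pre-defined secret scenario concrete, here is a rough sketch of 
how the configs would have to be split. It is not code from this PR: the 
object, the helper names, the secret name, the mount path, the keytab file name 
and the principal are all made up for illustration. It uses plain `Map`s rather 
than `SparkConf`, and it does not answer how the driver-only part would 
actually get applied.

```scala
// Hypothetical sketch: none of these helpers exist in Spark.
object PredefinedSecretSketch {
  // Assumed values, for illustration only.
  val secretName = "my-keytab-secret"     // a secret that already exists in the namespace
  val mountPath  = "/mnt/secrets/kerberos" // where it would be mounted in the driver pod
  val keytabFile = "user.keytab"           // key inside the secret

  // What the submission client would set: just mount the existing secret
  // in the driver pod via the spark.kubernetes.driver.secrets.* prefix.
  def submissionConf: Map[String, String] = Map(
    s"spark.kubernetes.driver.secrets.$secretName" -> mountPath
  )

  // What only the driver should see: the keytab path inside the mounted
  // secret, plus the principal. Setting these on the submission side would
  // make the client look for a local keytab, which is the open problem.
  def driverOnlyConf(principal: String): Map[String, String] = Map(
    "spark.kerberos.keytab"    -> s"$mountPath/$keytabFile",
    "spark.kerberos.principal" -> principal
  )

  def main(args: Array[String]): Unit = {
    // Effective driver-side view: submission configs plus driver-only ones.
    val driverConf = submissionConf ++ driverOnlyConf("user@EXAMPLE.COM")
    driverConf.toSeq.sortBy(_._1).foreach { case (k, v) => println(s"$k=$v") }
  }
}
```

    The gap is the second map: with existing configs alone there is no obvious 
place to inject it so that only the driver picks it up.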

