Github user echarles commented on the issue:

    https://github.com/apache/zeppelin/pull/2637
  
    @matyix I have tested your last commits and was able to make it work in my env (with Zeppelin running both inside and outside the k8s cluster).
    
    Your PR implements a new (spark-k8s-specific) launcher and remote executor. In another local branch, I tried to stick as closely as possible to the current Zeppelin paradigm (thrift servers on both sides of the interpreter processes with CallbackInfo, and two parameters (host, port) for interpreter.sh). I still have issues with the callback, so I now think the approach you propose is good and does the job.
    
    My feedback:
    
    + The branch as-is needs basic updates: I had to fix compilation issues with the new classes (`SparkK8sInterpreterLauncher` and `SparkK8sRemoteIntepreterManagedProcess`) and had to add `${ZEPPELIN_SPARK_CONF}` to the `interpreter.sh` script.
    + To find the running driver pod, you currently poll on a regular basis. Ideally we would be notified when the pod is ready (I am not sure whether the k8s client supports this; see the watch sketch after this list). That would closely mirror the current mechanism of the thrift notification via CallbackInfo, but with a pure k8s mechanism. It could also be extended to other interpreters we may want to run in k8s.
    + We need to set `spark.app.name` to a value starting with `zri-`. If you don't set this in the interpreter settings, the k8s client will not find the driver pod. I wonder if we can make this more configurable, e.g. using metadata, or simply using the InterpreterContext, which contains a `properties` attribute with all the given props; the launcher could retrieve it and search for a pod starting with a dynamic prefix rather than this hardcoded one (the sketch below takes the prefix as a parameter for that reason).
    + The current vanilla Zeppelin supports spark-k8s `client-mode` out of the box (assuming you are using https://github.com/apache/zeppelin/pull/2637). The condition to use the `SparkK8sInterpreterLauncher` needs to check that `spark.submit.deployMode` is `cluster`, and keep using the normal ManagedProcess for `client` (see the second sketch below).
    + On the documentation level, we should certainly mention that the app name must start with `zri-`. Also, relying on the kubespark docker image would be preferable, to ensure nothing special needs to be added to the docker image.
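
    Regarding the pod-ready notification: the fabric8 kubernetes-client (which spark-on-k8s itself uses) does expose a `watch()` API, so we may not need to poll at all. A minimal sketch of what I have in mind is below; the class and method names are hypothetical (not code from your PR), readiness is approximated by the `Running` phase, and the pod-name prefix is a parameter so it could come from the interpreter `properties` instead of the hardcoded `zri-`:

    ```java
    import io.fabric8.kubernetes.api.model.Pod;
    import io.fabric8.kubernetes.client.DefaultKubernetesClient;
    import io.fabric8.kubernetes.client.KubernetesClient;
    import io.fabric8.kubernetes.client.KubernetesClientException;
    import io.fabric8.kubernetes.client.Watch;
    import io.fabric8.kubernetes.client.Watcher;

    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.TimeUnit;

    public class DriverPodWatcher {

      /**
       * Blocks until a pod whose name starts with podNamePrefix reaches the
       * Running phase, or the timeout expires. Returns the pod name, or null
       * on timeout. Hypothetical helper for illustration only.
       */
      public static String awaitDriverPod(String namespace, String podNamePrefix,
                                          long timeoutSec) throws InterruptedException {
        final CountDownLatch ready = new CountDownLatch(1);
        final String[] podName = new String[1];

        try (KubernetesClient client = new DefaultKubernetesClient();
             Watch watch = client.pods().inNamespace(namespace)
                 .watch(new Watcher<Pod>() {
                   @Override
                   public void eventReceived(Action action, Pod pod) {
                     String name = pod.getMetadata().getName();
                     if (name.startsWith(podNamePrefix) && pod.getStatus() != null
                         && "Running".equals(pod.getStatus().getPhase())) {
                       podName[0] = name;
                       ready.countDown();
                     }
                   }

                   @Override
                   public void onClose(KubernetesClientException cause) {
                     // Watch closed; a non-null cause means the connection dropped.
                   }
                 })) {
          return ready.await(timeoutSec, TimeUnit.SECONDS) ? podName[0] : null;
        }
      }
    }
    ```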
    
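    For the launcher selection, a quick sketch of the condition I have in mind (the class and helper names are hypothetical; `master` and `spark.submit.deployMode` are the standard Spark properties):

    ```java
    import java.util.Properties;

    public class LauncherSelection {

      /**
       * True only for spark-on-k8s in cluster deploy mode; client mode (and
       * every other master) keeps the existing ManagedProcess path.
       */
      static boolean useK8sLauncher(Properties props) {
        String master = props.getProperty("master", "");
        String deployMode = props.getProperty("spark.submit.deployMode", "client");
        return master.startsWith("k8s://") && "cluster".equals(deployMode);
      }
    }
    ```
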
    WDYT?
    
    Would you prefer me to submit a PR against your PR, or will you make another push?


