Github user echarles commented on the issue: https://github.com/apache/zeppelin/pull/2637

@matyix I have tested your last commits and was able to make it work in my env (with Zeppelin both `in-` and `out` of the k8s cluster). You implement a new launcher and remote executor (specific to spark-k8s). In another local branch, I have tried to stick as closely as possible to the current Zeppelin paradigm (thrift servers on both sides of the interpreter processes, with CallbackInfo) and two parameters (host, port) for interpreter.sh - I still have issues with the callback, so I finally think the approach you propose is good and does the job. My feedback:

+ The branch as such needs basic updates: I had to fix compilation issues with the new classes (`SparkK8sInterpreterLauncher` and `SparkK8sRemoteIntepreterManagedProcess`) and had to add `${ZEPPELIN_SPARK_CONF}` to the `interpreter.sh` script.
+ To find the running driver pod, you currently poll on a regular basis. Ideally we would be notified when the pod is ready (not sure if the k8s client supports this). That would closely mirror the current mechanism of the thrift notification via CallbackInfo, but with a pure k8s mechanism. This could also be extended to other interpreters we would want to run on k8s.
+ We need to set `spark.app.name` to a value starting with `zri-` - if you don't set this in the interpreter settings, the k8s client will not find the driver pod. I wonder if we can make this more configurable, say using metadata, or simply using the InterpreterContext, which contains a `properties` attribute with all the given props - the launcher could retrieve this and search for a pod starting with a dynamic prefix rather than this hardcoded one.
+ The current vanilla Zeppelin supports the spark-k8s `client` mode out of the box (assuming you are using https://github.com/apache/zeppelin/pull/2637). The condition to use the `SparkK8sInterpreterLauncher` needs to check for `spark.submit.deployMode` being `cluster` and continue to use the normal ManagedProcess for `client`.
+ On the documentation level, certainly mention that the app name must start with `zri-`. Also, relying on the kubespark docker image would be better, to ensure nothing special is added to the docker image.

WDYT? Do you prefer me to submit a PR on your PR, and will you then make another push?
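To illustrate the launcher-selection condition I have in mind, here is a minimal sketch. The class and method names are hypothetical (not Zeppelin's actual launcher API); only the two property keys, `master` and `spark.submit.deployMode`, come from Spark itself:

```java
import java.util.Properties;

// Hypothetical sketch: pick a launcher based on the interpreter properties.
// LauncherSelector/selectLauncher are illustrative names, not Zeppelin APIs.
public class LauncherSelector {

  /** Returns the launcher to use for a given interpreter setting. */
  public static String selectLauncher(Properties props) {
    String master = props.getProperty("master", "");
    // Spark defaults to client deploy mode when the property is unset.
    String deployMode = props.getProperty("spark.submit.deployMode", "client");
    if (master.startsWith("k8s://") && "cluster".equals(deployMode)) {
      // Cluster mode on k8s: the driver runs in a pod, so use the
      // k8s-aware launcher that locates the driver pod.
      return "SparkK8sInterpreterLauncher";
    }
    // Client mode (or a non-k8s master): keep the normal managed process.
    return "SparkInterpreterLauncher";
  }

  public static void main(String[] args) {
    Properties p = new Properties();
    p.setProperty("master", "k8s://https://example:6443");
    p.setProperty("spark.submit.deployMode", "cluster");
    System.out.println(selectLauncher(p));
  }
}
```

This keeps `client` mode on the existing code path and only routes k8s `cluster` mode through the new launcher.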
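On making the `zri-` prefix configurable: a small sketch of what the launcher-side lookup could look like, assuming a hypothetical property key `zeppelin.k8s.driver.pod.prefix` read from the interpreter properties (the key name and the helper class are my invention, only the `zri-` default comes from the PR):

```java
import java.util.List;
import java.util.Optional;
import java.util.Properties;

// Hypothetical sketch: locate the driver pod by a configurable name prefix
// instead of the hardcoded "zri-". The property key is an assumption.
public class DriverPodFinder {

  /** Reads the prefix from the interpreter properties, defaulting to "zri-". */
  public static String podNamePrefix(Properties props) {
    return props.getProperty("zeppelin.k8s.driver.pod.prefix", "zri-");
  }

  /** Returns the first pod name matching the prefix, if any. */
  public static Optional<String> findDriverPod(List<String> podNames, String prefix) {
    return podNames.stream()
        .filter(name -> name.startsWith(prefix))
        .findFirst();
  }
}
```

The launcher would list pod names via the k8s client and pass them through `findDriverPod` with the prefix taken from the InterpreterContext properties, so users are no longer forced to name their app `zri-*`.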
---