Hi, I deployed JupyterHub and Enterprise Gateway on a Kubernetes cluster. I also have a remote YARN cluster (Cloudera) that is not part of my Kubernetes cluster.
I'm trying to launch the "Spark - Python (YARN Cluster Mode)" kernel (after configuring EG_YARN_ENDPOINT). The problem is that Enterprise Gateway listens on an internal IP with a random port and waits for the remote kernel to connect back to it.

From the log (note: 10.244.1.195 is the IP of the Enterprise Gateway pod inside the Kubernetes cluster):

"Response socket launched on '10.244.1.195:38997' using 5.0s timeout"

"++ exec /usr/hdp/current/spark2-client/bin/spark-submit --master yarn --deploy-mode cluster --name 6c20ed44-0394-4548-9f88-bddb6e5752e8 --conf spark.yarn.submit.waitAppCompletion=false --conf spark.yarn.appMasterEnv.PYTHONUSERBASE=/home/jovyan/.local --conf spark.yarn.appMasterEnv.PYTHONPATH=/.local/lib/python3.7/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip --conf spark.yarn.appMasterEnv.PATH=/opt/conda/bin:/opt/conda/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin /usr/local/share/jupyter/kernels/spark_python_yarn_cluster/scripts/launch_ipykernel.py --RemoteProcessProxy.kernel-id 6c20ed44-0394-4548-9f88-bddb6e5752e8 --RemoteProcessProxy.response-address 10.244.1.195:38997 --RemoteProcessProxy.port-range 0..0 --RemoteProcessProxy.spark-context-initialization-mode lazy"

It then copies launch_ipykernel.py to the YARN cluster:

"INFO yarn.Client: Uploading resource file:/usr/local/share/jupyter/kernels/spark_python_yarn_cluster/scripts/launch_ipykernel.py -> hdfs://master1.cluster2.local:8020/user/hdfs/.sparkStaging/application_1590301476347_0115/launch_ipykernel.py"

On the YARN cluster, launch_ipykernel.py runs and tries to connect to the address it received (10.244.1.195:38997), but fails. See the return_connection_info method in:
https://github.com/jupyter/enterprise_gateway/blob/v2.0.0/etc/kernel-launchers/python/scripts/launch_ipykernel.py

I thought of adding a Kubernetes Service with an external IP (LoadBalancer) and setting EG_RESPONSE_IP to it, but because EG uses a random port, I can't do that.

Can anyone advise what I can do?

Thanks
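
To make the handshake described above concrete, here is a rough Python sketch of the pattern: one side binds a response socket to an OS-chosen port, and the launcher side connects back to the advertised "ip:port" and sends its connection info. This is not Enterprise Gateway's actual code (the real return_connection_info in launch_ipykernel.py does more than this). The names open_response_socket, send_connection_info and demo, and the payload {"shell_port": 12345}, are made up for illustration, and the demo uses 127.0.0.1 so that it runs on a single machine.

```python
import json
import socket


def open_response_socket(ip="127.0.0.1", timeout=5.0):
    """Gateway side (hypothetical): listen on an OS-chosen port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((ip, 0))  # port 0 asks the OS for a random free port
    sock.listen(1)
    sock.settimeout(timeout)  # accept() gives up after `timeout` seconds
    host, port = sock.getsockname()
    return sock, "%s:%d" % (host, port)


def send_connection_info(response_address, info, timeout=5.0):
    """Launcher side (hypothetical): connect back and send JSON."""
    host, port = response_address.rsplit(":", 1)
    # Raises OSError (e.g. a timeout) if the address is not reachable
    # from the machine this runs on.
    with socket.create_connection((host, int(port)), timeout=timeout) as conn:
        conn.sendall(json.dumps(info).encode("utf-8"))


def demo():
    listener, address = open_response_socket()
    send_connection_info(address, {"shell_port": 12345})
    conn, _ = listener.accept()
    with conn:
        received = json.loads(conn.recv(4096).decode("utf-8"))
    listener.close()
    return address, received
```

In the setup from the post, the address handed to the launcher is the pod's internal 10.244.1.195:38997, so the equivalent of create_connection on a YARN node would only succeed if that pod IP and port are routable from the YARN cluster.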
