Varad Karmarkar created ZEPPELIN-4804:
-----------------------------------------

             Summary: Unable to start Spark Interpreter on Kubernetes
                 Key: ZEPPELIN-4804
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-4804
             Project: Zeppelin
          Issue Type: Bug
          Components: Kubernetes, spark
            Reporter: Varad Karmarkar


Hi team,

I'm trying to install Zeppelin on AWS EKS. When I try spinning up a Spark 
Interpreter pod (running just sc.version), it fails and says that there are no 
interpreters running. The pod's log includes the command:

/opt/spark/bin/spark-submit --class 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 
--driver-class-path 
":/zeppelin/interpreter/spark/*::/zeppelin/interpreter/zeppelin-interpreter-shaded-0.9.0-preview1.jar:/zeppelin/interpreter/spark/spark-interpreter-0.9.0-preview1.jar"
 --driver-java-options " -Dfile.encoding=UTF-8 
-Dlog4j.configuration=file:///zeppelin/conf/log4j.properties 
-Dzeppelin.log.file='/zeppelin/logs/zeppelin-interpreter-spark-shared_process--spark-refljc.log'"
 --conf spark.jars.ivy=/tmp/.ivy --master 
k8s://https://kubernetes.default.svc --deploy-mode client --driver-memory 1g 
--conf spark.kubernetes.namespace=default conf spark.executor.instances=1 
--conf spark.kubernetes.driver.pod.name=spark-refljc --conf 
spark.kubernetes.container.image=stx-app-docker-prod-local.artifactory.dbgcloud.io/spark:2.4.5
 --conf spark.driver.bindAddress=0.0.0.0 --conf 
spark.driver.host=spark-refljc.default.svc --conf spark.driver.port=22321 
--conf spark.blockManager.port=22322 
/zeppelin/interpreter/spark/spark-interpreter-0.9.0-preview1.jar 
zeppelin-server-695446f7c6-cxd59.default.svc 12320 "spark-shared_process" 
12321:12321

Note the "--master k8s://https://kubernetes.default.svc" argument in the 
command: the default master address is being passed in, but it shouldn't be. 
I've set the MASTER env variable to my cluster's API server, and it is being 
overridden. I followed the stack trace into the code and found that 
the function buildSparkSubmitOptions 
(https://github.com/apache/zeppelin/blob/master/zeppelin-plugins/launcher/k8s-standard/src/main/java/org/apache/zeppelin/interpreter/launcher/K8sRemoteInterpreterProcess.java#L337)
is being called, where the line 

options.append(" --master k8s://https://kubernetes.default.svc");

appears to hardcode this value. Is this assumption correct? If so, it would 
explain why I'm not able to run Spark: the Zeppelin pod can't find the API 
server because the hardcoded default is used instead of the address I set. 

 

I believe that to fix the issue myself, I would need to remove the line where 
the master is being set and rebuild Zeppelin, right? 
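
As an alternative to deleting the line outright, here is a rough sketch of the 
behaviour I would expect: use MASTER when it is set and fall back to the 
in-cluster default otherwise. This is not Zeppelin's actual code. The class 
MasterOptionSketch and the helper buildMasterOption are hypothetical names, 
and I am assuming the launcher can read MASTER from its own environment.

```java
import java.util.Map;

public class MasterOptionSketch {
    // In-cluster default, identical to the value in the hardcoded line quoted above.
    static final String DEFAULT_MASTER = "k8s://https://kubernetes.default.svc";

    // Hypothetical helper: prefer MASTER from the supplied environment map,
    // and fall back to the default only when it is missing or blank.
    static String buildMasterOption(Map<String, String> env) {
        String master = env.get("MASTER");
        if (master == null || master.trim().isEmpty()) {
            master = DEFAULT_MASTER;
        }
        return " --master " + master.trim();
    }

    public static void main(String[] args) {
        StringBuilder options = new StringBuilder();
        // Assumption: MASTER, if set, is visible in this process's environment.
        options.append(buildMasterOption(System.getenv()));
        options.append(" --deploy-mode client");
        System.out.println(options);
    }
}
```

With MASTER unset this yields the same option as today, so existing 
deployments would not change.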



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
