So after adding the quotes in both SparkInterpreterLauncher and interpreter.sh, the interpreter is still failing with the same "Unrecognized option" error. The weird thing is that if I copy the command supposedly executed by Zeppelin (as it is printed to the log) and run it directly in a shell, the interpreter process runs properly. So my guess is that the forked process command that is created is not really identical to the one that is logged.
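A minimal sketch of why a logged command can work when pasted into a terminal yet fail when forked (the option names below are hypothetical stand-ins, not the real config): quotes are shell syntax, so a shell strips them and keeps a spaced value as one argument, while an unquoted expansion is word-split and spark-submit would see the second option as a standalone, unrecognized argument:

```shell
#!/bin/sh
# Hypothetical stand-in for a spark.driver.extraJavaOptions value
# containing two JVM options separated by a space.
OPTS='-DmyParam=1 -DmyOtherParam=2'

# Unquoted expansion: the shell word-splits on the space, so the value
# becomes two separate arguments (3 args total including --conf).
set -- --conf spark.driver.extraJavaOptions=$OPTS
echo "unquoted: $# args"   # unquoted: 3 args

# Quoted expansion: the quotes are stripped by the shell and the value
# stays a single argument (2 args total), which is what spark-submit expects.
set -- --conf "spark.driver.extraJavaOptions=$OPTS"
echo "quoted: $# args"     # quoted: 2 args
```

The flip side is the symptom above: if the launcher forks the process directly (not through a shell), quote characters embedded in an argument are never stripped or re-parsed, so a command line that looks correct in the log can behave differently from the same text pasted into a terminal.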
This is what my cmd looks like (censored a bit):

/usr/local/spark/bin/spark-submit --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
--driver-class-path :/zeppelin/local-repo/spark/*:/zeppelin/interpreter/spark/*:::/zeppelin/interpreter/zeppelin-interpreter-shaded-0.10.0-SNAPSHOT.jar:/zeppelin/interpreter/spark/spark-interpreter-0.10.0-SNAPSHOT.jar:/etc/hadoop/conf
*--driver-java-options "-DSERVICENAME=zeppelin_docker -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///zeppelin/conf/log4j.properties -Dlog4j.configurationFile=file:///zeppelin/conf/log4j2.properties -Dzeppelin.log.file=/var/log/zeppelin/zeppelin-interpreter-spark-shared_process--zeppelin-test-spark3-7d74d5df4-2g8x5.log"*
--conf spark.driver.host=10.135.120.245
--conf "spark.dynamicAllocation.minExecutors=1"
--conf "spark.shuffle.service.enabled=true"
--conf "spark.sql.parquet.int96AsTimestamp=true"
--conf "spark.ui.retainedTasks=10000"
--conf "spark.executor.heartbeatInterval=600s"
--conf "spark.ui.retainedJobs=100"
--conf "spark.sql.ui.retainedExecutions=10"
--conf "spark.hadoop.cloneConf=true"
--conf "spark.debug.maxToStringFields=200000"
--conf "spark.executor.memory=70g"
--conf "spark.executor.extraClassPath=../mysql-connector-java-8.0.18.jar:../guava-19.0.jar"
--conf "spark.hadoop.fs.permissions.umask-mode=000"
--conf "spark.memory.storageFraction=0.1"
--conf "spark.scheduler.mode=FAIR"
--conf "spark.sql.adaptive.enabled=true"
--conf "spark.master=mesos://zk://zk003:2181,zk004:2181,zk006:2181,/mesos-zeppelin"
--conf "spark.driver.memory=15g"
--conf "spark.io.compression.codec=lz4"
--conf "spark.executor.uri=https://artifactory.company.com/artifactory/static/spark/spark-dist/spark-3.1.2.2-hadoop-2.7-zulu"
--conf "spark.ui.retainedStages=500"
--conf "spark.mesos.uris=https://artifactory.company.com/artifactory/static/spark/spark-executor/jars/mysql-connector-java-8.0.18.jar,https://artifactory.company.com/artifactory/static/spark/spark-executor/jars/guava-19.0.jar"
--conf "spark.driver.maxResultSize=8g"
*--conf "spark.executor.extraJavaOptions=-DSERVICENAME=Zeppelin -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=2015 -XX:-OmitStackTraceInFastThrow -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=55745 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -verbose:gc -Dlog4j.configurationFile=/etc/config/log4j2-executor-config.xml -XX:+UseG1GC -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintFlagsFinal -XX:+PrintReferenceGC -XX:+PrintGCDetails -XX:+PrintAdaptiveSizePolicy -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark -XX:+PrintStringDeduplicationStatistics -XX:+UseStringDeduplication -XX:InitiatingHeapOccupancyPercent=35 -Dhttps.proxyHost=proxy.service.consul -Dhttps.proxyPort=3128"*
--conf "spark.dynamicAllocation.enabled=true"
--conf "spark.default.parallelism=1200"
--conf "spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2"
--conf "spark.hadoop.fs.AbstractFileSystem.gs.impl=com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS"
--conf "spark.app.name=zeppelin_docker_spark3"
--conf "spark.shuffle.service.port=7337"
--conf "spark.memory.fraction=0.75"
--conf "spark.mesos.coarse=true"
--conf "spark.ui.port=4041"
--conf "spark.dynamicAllocation.executorIdleTimeout=60s"
--conf "spark.sql.shuffle.partitions=1200"
--conf "spark.sql.parquet.outputTimestampType=TIMESTAMP_MILLIS"
--conf "spark.dynamicAllocation.cachedExecutorIdleTimeout=120s"
--conf "spark.network.timeout=1200s"
--conf "spark.cores.max=600"
--conf "spark.hadoop.fs.gs.impl=com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem"
--conf "spark.worker.timeout=150000"
*--conf "spark.driver.extraJavaOptions=-Dhttps.proxyHost=proxy.service.consul -Dhttps.proxyPort=3128 -Dlog4j.configuration=file:/usr/local/spark/conf/log4j.properties -Djavax.jdo.option.ConnectionDriverName=com.mysql.cj.jdbc.Driver -Djavax.jdo.option.ConnectionPassword=2eebb22277
-Djavax.jdo.option.ConnectionURL=jdbc:mysql://proxysql-backend.service.consul.company.com:6033/hms?useSSL=false&databaseTerm=SCHEMA&nullDatabaseMeansCurrent=true -Djavax.jdo.option.ConnectionUserName=hms_rw"*
--conf "spark.files.overwrite=true"
/zeppelin/interpreter/spark/spark-interpreter-0.10.0-SNAPSHOT.jar 10.135.120.245 36419 spark-shared_process

And the error:

*Error: Unrecognized option: -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=2015*

Will continue tackling it...

On Thu, Jul 8, 2021 at 4:49 PM Jeff Zhang <zjf...@gmail.com> wrote:

> Thanks Lior for the investigation.
>
> Lior Chaga <lio...@taboola.com> wrote on Thu, Jul 8, 2021 at 8:31 PM:
>
>> Ok, I think I found the issue. It's not only that the quotation marks
>> are missing from the --conf params; they are also missing from
>> --driver-java-options, which is concatenated into
>> INTERPRETER_RUN_COMMAND in interpreter.sh.
>>
>> I will fix it in my build, but I would like confirmation that this is
>> indeed the issue (and that I'm not missing anything), so that I can
>> open a pull request.
>>
>> On Thu, Jul 8, 2021 at 3:05 PM Lior Chaga <lio...@taboola.com> wrote:
>>
>>> I'm trying to run Zeppelin using the local Spark interpreter.
>>> Basically everything works, but if I try to set
>>> `spark.driver.extraJavaOptions` or `spark.executor.extraJavaOptions`
>>> to a value containing several arguments, I get an exception.
>>> For instance, for `-DmyParam=1 -DmyOtherParam=2`, I'd get:
>>> Error: Unrecognized option: -DmyOtherParam=2
>>>
>>> I noticed that the spark-submit command looks as follows:
>>>
>>> spark-submit --class
>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
>>> --driver-class-path
>>> ....
>>> *--conf spark.driver.extraJavaOptions=-DmyParam=1 -DmyOtherParam=2*
>>>
>>> So I tried to patch SparkInterpreterLauncher to add quotation marks
>>> (like in the example from the Spark documentation:
>>> https://spark.apache.org/docs/latest/configuration.html#dynamically-loading-spark-properties
>>> ).
>>>
>>> I see that the quotation marks were added: *--conf
>>> "spark.driver.extraJavaOptions=-DmyParam=1 -DmyOtherParam=2"*
>>> But I still get the same error.
>>>
>>> Any idea how I can make it work?
>>
>
> --
> Best Regards
>
> Jeff Zhang