Hello there! I'm trying to get SparkR running on Zeppelin. I'm using Spark 2.0.2 (built with Scala 2.11), R 3.4.0, and Zeppelin 0.6.2, on a MapR cluster, and having very little success. I'm having a difficult time googling the errors, as it seems a lot of people get these errors when the SPARK_HOME isn't set, but in my case it is.
The error in question is as follows:

org.apache.commons.exec.ExecuteException: Execution failed (Exit value: -559038737. Caused by java.io.IOException: Cannot run program "R" (in directory "."): error=2, No such file or directory)
    at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:205)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Cannot run program "R" (in directory "."): error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
    at java.lang.Runtime.exec(Runtime.java:617)
    at org.apache.commons.exec.launcher.Java13CommandLauncher.exec(Java13CommandLauncher.java:61)
    at org.apache.commons.exec.DefaultExecutor.launch(DefaultExecutor.java:279)
    at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:336)
    at org.apache.commons.exec.DefaultExecutor.access$200(DefaultExecutor.java:48)
    at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:200)
    ... 1 more
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
    at java.lang.ProcessImpl.start(ProcessImpl.java:130)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
    ... 7 more

I have r-base and r-base-dev installed across the cluster, on bare metal (running Ubuntu 14.04). If I fire up just a sparkR shell, it works fine, and I can see the distributed job working in YARN.
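For reference, error=2 (ENOENT) from ProcessBuilder means the JVM could not find an executable for the bare command name "R" when it searched the PATH of the process doing the launching. The following is only a minimal Java sketch of that kind of lookup, useful for checking what a given PATH value would resolve to. The class and method names (RPathCheck, findOnPath) are hypothetical and not part of Zeppelin or commons-exec, and the sketch assumes the interpreter launches "R" by name rather than by absolute path, which is what the message suggests.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper class, for illustration only.
final class RPathCheck {

    // Returns the first executable regular file called `program` found in the
    // directories of a PATH-style string, or Optional.empty() if none exists.
    // Simplification: empty PATH entries (which POSIX treats as the current
    // directory) are skipped here.
    static Optional<Path> findOnPath(String program, String pathValue) {
        if (pathValue == null || pathValue.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, program);
            if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Uses the PATH of whichever process runs this check, which may differ
        // from the PATH of an interactive login shell.
        String path = System.getenv("PATH");
        System.out.println("PATH=" + path);
        Optional<Path> r = findOnPath("R", path);
        System.out.println(r.map(p -> "R found at " + p).orElse("R not found on PATH"));
    }
}
```

Run from the same user and environment that starts the Zeppelin interpreter process, this would show whether "R" resolves there, as opposed to in the shell where the sparkR shell works.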
Here's the env output:

SPARK_EXECUTOR_TIMEOUT=300
HOSTNAME=zeppelin-test5285.platform-dev.company.com.au
SPARK_HOME=/apps/spark/spark-2.0.2-bin-mapr5.1.0_yarn_fat_j7_2.11
ZEPPELIN_MEM=-Xms1024m -Xmx1024m -XX:MaxPermSize=512m
TERM=unknown
HOST=node.company.com.au.local
ZEPPELIN_INTERPRETER_DIR=/zeppelin/interpreter
JAVA_INTP_OPTS= -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///projects/platform-dev/apps/platform-dev-zeppelin-test5285/conf/log4j.properties -Dzeppelin.log.file=/zeppelin/logs/zeppelin-interpreter-sh-rsherman-zeppelin-test5285.platform-dev.company.com.au.log
HADOOP_HOME=/opt/mapr/hadoop/hadoop-2.7.0
SPARK_DRIVER_MEMORY=6g
PORT0=31399
APP_NAME=platform-dev-zeppelin-test5285
MESOS_TASK_ID=platform-dev-zeppelin-test5285.ce9d43e1-466a-11e7-8ff4-0242ac1f3802
ZEPPELIN_INTP_MEM=-Xms1024m -Xmx1024m -XX:MaxPermSize=512m
SPARK_CONF_DIR=/projects/platform-dev/apps/platform-dev-zeppelin-test5285/conf/spark
SPARK_EXECUTOR_MEMORY=12g
JAVA_OPTS= -Dfile.encoding=UTF-8 -Xms1024m -Xmx1024m -XX:MaxPermSize=512m -Dlog4j.configuration=file:///projects/platform-dev/apps/platform-dev-zeppelin-test5285/conf/log4j.properties -Dzeppelin.log.file=/zeppelin/logs/zeppelin-rsherman-zeppelin-test5285.platform-company.com.au.log -Dfile.encoding=UTF-8 -Xms1024m -Xmx1024m -XX:MaxPermSize=512m -Dlog4j.configuration=file:///projects/platform-dev/apps/platform-dev-zeppelin-test5285/conf/log4j.properties
USER=rsherman
SUDO_USER=root
ZEPPELIN_NOTEBOOK_DIR=/projects/platform-dev/apps/platform-dev-zeppelin-test5285/data/notebook
SUDO_UID=0
INITRD=no
APP_USERNAME=rsherman
SCALABLE=false
MAPR_TICKETFILE_LOCATION=/home/rsherman/.maprticket
ZEPPELIN_HOME=/zeppelin
ZEPPELIN_RUNNER=java
ZEPPELIN_WAR=/zeppelin/zeppelin-web/dist
ZEPPELIN_PID_DIR=/zeppelin/run
USERNAME=rsherman
APP_REVISION=10

Here are the flags set at build time:

$make_distro --mvn ${MAVEN_HOME}/bin/mvn --name ${ARTIFACT_NAME} --tgz -Pyarn -Phadoop-${HADOOP_VERSION} -Phive -Psparkr -Phive-thriftserver -Pspark-ganglia-lgpl
${HADOOP_PROVIDED_OPTION} -Dhadoop.version=2.7.0-mapr-1602 -Dyarn.version=2.7.0-mapr-1602 -Dzookeeper.version=3.4.5-mapr-1503 -Dscala-${SCALA_VERSION} -DskipTests -e

And finally, here is the full stacktrace leading up to the failure:

17/06/01 02:22:06 INFO SchedulerFactory: Job paragraph_1496283119069_1363557346 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpretershared_session900607537
17/06/01 02:22:06 INFO Paragraph: run paragraph 20170601-021159_529838667 using spark.r org.apache.zeppelin.interpreter.LazyOpenInterpreter@52a4904f
17/06/01 02:22:06 INFO SchedulerFactory: Job remoteInterpretJob_1496283726813 started by scheduler org.apache.zeppelin.spark.SparkRInterpreter936853935
17/06/01 02:22:06 INFO ZeppelinR: File /tmp/zeppelin_sparkr-2889736965928624810.R created
17/06/01 02:22:06 ERROR ZeppelinR: Execution failed (Exit value: -559038737. Caused by java.io.IOException: Cannot run program "R" (in directory "."): error=2, No such file or directory)
org.apache.commons.exec.ExecuteException: Execution failed (Exit value: -559038737. Caused by java.io.IOException: Cannot run program "R" (in directory "."): error=2, No such file or directory)
    at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:205)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Cannot run program "R" (in directory "."): error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
    at java.lang.Runtime.exec(Runtime.java:617)
    at org.apache.commons.exec.launcher.Java13CommandLauncher.exec(Java13CommandLauncher.java:61)
    at org.apache.commons.exec.DefaultExecutor.launch(DefaultExecutor.java:279)
    at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:336)
    at org.apache.commons.exec.DefaultExecutor.access$200(DefaultExecutor.java:48)
    at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:200)
    ... 1 more
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
    at java.lang.ProcessImpl.start(ProcessImpl.java:130)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
    ... 7 more
17/06/01 02:22:07 ERROR Job: Job failed
org.apache.zeppelin.interpreter.InterpreterException: sparkr is not responding
    at org.apache.zeppelin.spark.ZeppelinR.waitForRScriptInitialized(ZeppelinR.java:295)
    at org.apache.zeppelin.spark.ZeppelinR.request(ZeppelinR.java:235)
    at org.apache.zeppelin.spark.ZeppelinR.eval(ZeppelinR.java:183)
    at org.apache.zeppelin.spark.ZeppelinR.open(ZeppelinR.java:172)
    at org.apache.zeppelin.spark.SparkRInterpreter.open(SparkRInterpreter.java:85)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
17/06/01 02:22:07 INFO SchedulerFactory: Job remoteInterpretJob_1496283726813 finished by scheduler org.apache.zeppelin.spark.SparkRInterpreter936853935
17/06/01 02:22:07 INFO ZeppelinR: File /tmp/zeppelin_sparkr-5775729319721689200.R created
17/06/01 02:22:07 INFO NotebookServer: Job 20170601-021159_529838667 is finished
17/06/01 02:22:07 ERROR ZeppelinR: Execution failed (Exit value: -559038737. Caused by java.io.IOException: Cannot run program "R" (in directory "."): error=2, No such file or directory)
org.apache.commons.exec.ExecuteException: Execution failed (Exit value: -559038737. Caused by java.io.IOException: Cannot run program "R" (in directory "."): error=2, No such file or directory)
    at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:205)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Cannot run program "R" (in directory "."): error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
    at java.lang.Runtime.exec(Runtime.java:617)
    at org.apache.commons.exec.launcher.Java13CommandLauncher.exec(Java13CommandLauncher.java:61)
    at org.apache.commons.exec.DefaultExecutor.launch(DefaultExecutor.java:279)
    at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:336)
    at org.apache.commons.exec.DefaultExecutor.access$200(DefaultExecutor.java:48)
    at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:200)
    ... 1 more
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
    at java.lang.ProcessImpl.start(ProcessImpl.java:130)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
    ... 7 more
17/06/01 02:22:07 INFO SchedulerFactory: Job paragraph_1496283119069_1363557346 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpretershared_session900607537
17/06/01 02:22:08 ERROR TThreadPoolServer: Error occurred during processing of message.
org.apache.zeppelin.interpreter.InterpreterException: sparkr is not responding
    at org.apache.zeppelin.spark.ZeppelinR.waitForRScriptInitialized(ZeppelinR.java:295)
    at org.apache.zeppelin.spark.ZeppelinR.request(ZeppelinR.java:235)
    at org.apache.zeppelin.spark.ZeppelinR.eval(ZeppelinR.java:183)
    at org.apache.zeppelin.spark.ZeppelinR.open(ZeppelinR.java:172)
    at org.apache.zeppelin.spark.SparkRInterpreter.open(SparkRInterpreter.java:85)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getProgress(LazyOpenInterpreter.java:110)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.getProgress(RemoteInterpreterServer.java:404)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:1509)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:1494)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
17/06/01 02:22:08 ERROR JobProgressPoller: Can not get or update progress
org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getProgress(RemoteInterpreter.java:373)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getProgress(LazyOpenInterpreter.java:111)
    at org.apache.zeppelin.notebook.Paragraph.progress(Paragraph.java:237)
    at org.apache.zeppelin.scheduler.JobProgressPoller.run(JobProgressPoller.java:51)
Caused by: org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_getProgress(RemoteInterpreterService.java:296)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.getProgress(RemoteInterpreterService.java:281)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getProgress(RemoteInterpreter.java:370)
    ... 3 more

It's probably obvious that I don't have much experience with Zeppelin, so if I've left out anything relevant, just let me know and I'll be happy to add it. If anyone could point me in the right direction, it would be greatly appreciated.

Best Regards,
Roger Sherman