Moon,

It is to support an architecture where Zeppelin does not need to run on the same machine/cluster where Spark/Hadoop is running.
Right now it is not possible to achieve this in yarn-client mode, as in that case the Spark driver needs to have access to all data nodes/slave nodes. One can achieve the same with a remote Spark standalone cluster, but in that case I cannot use YARN for workload management.

Regards,
Sourav

> On Oct 11, 2015, at 12:25 PM, moon soo Lee <m...@apache.org> wrote:
>
> My apologies, I missed the most important part of the question: yarn-cluster mode. Zeppelin is not expected to work with yarn-cluster mode at the moment.
>
> Is there any special reason you need to use yarn-cluster mode instead of yarn-client mode?
>
> Thanks,
> moon
>
>> On Sun, Oct 11, 2015 at 8:41 PM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>> Hi Moon,
>>
>> Yes, I have checked the same.
>>
>> I have put some debug statements in interpreter.sh to see what exactly is getting passed when I set SPARK_HOME in zeppelin-env.sh.
>>
>> The debug statements do show that it is using the spark-submit utility from the bin folder of the SPARK_HOME which I have set in zeppelin-env.sh.
>>
>> Regards,
>> Sourav
>>
>>> On Sun, Oct 11, 2015 at 2:55 AM, moon soo Lee <m...@apache.org> wrote:
>>> Could you make sure your zeppelin-env.sh has SPARK_HOME exported?
>>>
>>> Zeppelin (0.6.0-SNAPSHOT) uses the spark-submit command when SPARK_HOME is defined, but your error says "please use spark-submit".
>>>
>>> Thanks,
>>> moon
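For reference, the configuration moon is asking about amounts to a couple of exports in conf/zeppelin-env.sh. A minimal sketch, with illustrative placeholder paths rather than values taken from this thread:

    # conf/zeppelin-env.sh -- paths are illustrative placeholders
    export JAVA_HOME=/usr/java/default
    # A local Spark install; when set, interpreter.sh launches the Spark
    # interpreter via ${SPARK_HOME}/bin/spark-submit instead of plain java
    export SPARK_HOME=/usr/iop/current/spark-client
    # Lets spark-submit locate the YARN ResourceManager and HDFS
    export HADOOP_CONF_DIR=/etc/hadoop/conf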
>>>> On Thu, Oct 8, 2015 at 9:14 PM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>>> Hi Deepak/Moon,
>>>>
>>>> After seeing the stack trace of the error and the code in org.apache.zeppelin.spark.SparkInterpreter.java, I think this is surely a bug in the Spark interpreter code.
>>>>
>>>> The SparkInterpreter code always calls the constructor of org.apache.spark.SparkContext to create a new SparkContext whenever the SparkInterpreter class is loaded by org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer, hence this error.
>>>>
>>>> I'm not sure whether the check for yarn-cluster is newly added in SparkContext.
>>>>
>>>> Attaching here the complete stack trace for your ease of reference.
>>>>
>>>> Regards,
>>>> Sourav
>>>>
>>>> org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.
>>>>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:378)
>>>>     at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:339)
>>>>     at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
>>>>     at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:465)
>>>>     at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
>>>>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
>>>>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
>>>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
>>>>     at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
>>>>     at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>>>>     at java.util.concurrent.FutureTask.run(Unknown Source)
>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source)
>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>>>     at java.lang.Thread.run(Unknown Source)
>>>>
>>>>> On Mon, Oct 5, 2015 at 12:57 PM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>>>> I could execute the following without any issue:
>>>>>
>>>>> spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 1 --driver-memory 512m --executor-memory 512m --executor-cores 1 lib/spark-examples.jar 10
>>>>>
>>>>> Regards,
>>>>> Sourav
>>>>>
>>>>>> On Mon, Oct 5, 2015 at 12:04 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>>>> Did you try a test job with yarn-cluster (outside Zeppelin)?
>>>>>>
>>>>>>> On Mon, Oct 5, 2015 at 11:48 AM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>>>>>> Yes, I have them set up appropriately.
>>>>>>>
>>>>>>> Where I am lost is that I can see the interpreter running spark-submit, but at some point it switches to creating a SparkContext.
>>>>>>>
>>>>>>> Maybe, as you rightly mentioned, it is not able to run the driver on the YARN cluster because of some permission issue. But I'm not able to figure out what that issue/required configuration is.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Sourav
>>>>>>>
>>>>>>>> On Mon, Oct 5, 2015 at 11:38 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>>>>>> Do you have these settings configured in zeppelin-env.sh?
>>>>>>>>
>>>>>>>> export JAVA_HOME=/usr/src/jdk1.7.0_79/
>>>>>>>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>>>>>>>
>>>>>>>> Most likely you have this, as you're able to run with yarn-client.
>>>>>>>>
>>>>>>>> Looks like the issue is not being able to run the driver program on the cluster.
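The behavior being probed here is, roughly, that bin/interpreter.sh picks its launch path based on SPARK_HOME. A simplified sketch of that branch, not the verbatim script; the variable names and jar path are illustrative:

    # Sketch of the SPARK_HOME branch in bin/interpreter.sh (simplified)
    if [[ -n "${SPARK_HOME}" ]]; then
      # Launch the interpreter through spark-submit so Spark's own launcher
      # resolves the master, YARN config files, and classpath
      exec "${SPARK_HOME}/bin/spark-submit" \
        --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer \
        "${ZEPPELIN_HOME}/interpreter/spark/zeppelin-spark-0.6.0-SNAPSHOT.jar" \
        "${PORT}"
    else
      # No SPARK_HOME: plain JVM launch using Zeppelin's embedded Spark
      exec java -cp "${ZEPPELIN_CLASSPATH}" \
        org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer "${PORT}"
    fi

Note that even when launched through spark-submit, the interpreter still calls new SparkContext() in-process (SparkInterpreter.createSparkContext in the trace above), and that constructor is exactly what Spark rejects under yarn-cluster; this is consistent with the error surviving the spark-submit launch.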
>>>>>>>>> On Mon, Oct 5, 2015 at 11:13 AM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>>>>>>>> Yes, Spark is installed on the machine where Zeppelin is running.
>>>>>>>>>
>>>>>>>>> The location of spark.yarn.jar is very similar to what you have. I'm using IOP as the distribution, and the directory naming convention specific to IOP is different from HDP.
>>>>>>>>>
>>>>>>>>> And yes, the setup works perfectly fine when I use master yarn-client and the same settings for SPARK_HOME, HADOOP_CONF_DIR and HADOOP_CLIENT.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Sourav
>>>>>>>>>
>>>>>>>>>> On Mon, Oct 5, 2015 at 10:25 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>>>>>>>> Is Spark installed on your Zeppelin machine?
>>>>>>>>>>
>>>>>>>>>> I would try these:
>>>>>>>>>>
>>>>>>>>>> master: yarn-client
>>>>>>>>>> spark.home: the Spark installation home directory on your Zeppelin server
>>>>>>>>>>
>>>>>>>>>> Looking at spark.yarn.jar, I see Spark is installed at /usr/iop/current/spark-thriftserver/. But why is it thriftserver (I do not know what that is)?
>>>>>>>>>>
>>>>>>>>>> I have Spark installed (unzipped) on the Zeppelin machine at /usr/hdp/2.3.1.0-2574/spark/spark/ (it can be any location) and have spark.yarn.jar set to /usr/hdp/2.3.1.0-2574/spark/spark/lib/spark-assembly-1.4.1-hadoop2.6.0.jar.
>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 5, 2015 at 10:20 AM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>>>>>>>>>> Hi Deepu,
>>>>>>>>>>>
>>>>>>>>>>> Here you go.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Sourav
>>>>>>>>>>>
>>>>>>>>>>> Properties
>>>>>>>>>>> name                           value
>>>>>>>>>>> args
>>>>>>>>>>> master                         yarn-cluster
>>>>>>>>>>> spark.app.name                 Zeppelin
>>>>>>>>>>> spark.cores.max
>>>>>>>>>>> spark.executor.memory          512m
>>>>>>>>>>> spark.home
>>>>>>>>>>> spark.yarn.jar                 /usr/iop/current/spark-thriftserver/lib/spark-assembly.jar
>>>>>>>>>>> zeppelin.dep.localrepo         local-repo
>>>>>>>>>>> zeppelin.pyspark.python        python
>>>>>>>>>>> zeppelin.spark.concurrentSQL   false
>>>>>>>>>>> zeppelin.spark.maxResult       1000
>>>>>>>>>>> zeppelin.spark.useHiveContext  true
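Given moon's reply near the top of the thread that yarn-cluster is not expected to work, the one property in this table that should change (a suggestion distilled from the thread, not an official fix) is:

    master                         yarn-client

In yarn-client mode the driver, and with it the SparkContext that Zeppelin creates in-process, stays on the Zeppelin machine while only the executors run in YARN containers, which matches Sourav's report above that the identical setup works under yarn-client.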
>>>>>>>>>>>> On Mon, Oct 5, 2015 at 10:05 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>>>>>>>>>> Can you share a screenshot of your Spark interpreter on the Zeppelin web interface?
>>>>>>>>>>>>
>>>>>>>>>>>> I have the exact same deployment structure and it runs fine with the right set of configurations.
>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Oct 5, 2015 at 7:56 AM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>>>>>>>>>>>> Hi Moon,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm using 0.6.0-SNAPSHOT, which I built from the latest GitHub source.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I tried setting SPARK_HOME in zeppelin-env.sh. Also, by putting in some debug statements, I could see that control goes to the appropriate if-else block in interpreter.sh.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But I get the same error as follows:
>>>>>>>>>>>>>
>>>>>>>>>>>>> org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.
>>>>>>>>>>>>>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:378)
>>>>>>>>>>>>>     at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:339)
>>>>>>>>>>>>>     at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
>>>>>>>>>>>>>     at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:465)
>>>>>>>>>>>>>     at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
>>>>>>>>>>>>>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
>>>>>>>>>>>>>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
>>>>>>>>>>>>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
>>>>>>>>>>>>>     at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
>>>>>>>>>>>>>     at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
>>>>>>>>>>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>>>>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>>>>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>>>>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>>>>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let me know if you need any other details to figure out what is going on.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Sourav
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Sep 30, 2015 at 1:53 AM, moon soo Lee <m...@apache.org> wrote:
>>>>>>>>>>>>>> Which version of Zeppelin are you using?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The master branch uses the spark-submit command when SPARK_HOME is defined in conf/zeppelin-env.sh.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you're not on the master branch, I recommend trying it with SPARK_HOME defined.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hope this helps,
>>>>>>>>>>>>>> moon
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Sep 23, 2015 at 10:21 PM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When I try to run the Spark interpreter in yarn-cluster mode from a remote machine, I always get the error saying to use spark-submit rather than a SparkContext.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My Zeppelin process runs on a separate machine, remote to the YARN cluster.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Any idea why this error occurs?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>> Sourav
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Deepak
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Deepak
>>>>>>>>
>>>>>>>> --
>>>>>>>> Deepak
>>>>>>
>>>>>> --
>>>>>> Deepak