Moon,

It is to support an architecture where Zeppelin does not need to run on the same machine/cluster where Spark/Hadoop is running.
Right now it is not possible to achieve this in yarn-client mode, as in that case the Spark driver needs to have access to all data nodes/slave nodes. One can achieve the same with a remote Spark standalone cluster, but in that case I cannot use YARN for workload management.

Regards,
Sourav

> On Oct 11, 2015, at 12:25 PM, moon soo Lee <m...@apache.org> wrote:
>
> My apologies, I missed the most important part of the question: yarn-cluster mode. Zeppelin is not expected to work with yarn-cluster mode at the moment.
>
> Is there any special reason you need to use yarn-cluster mode instead of yarn-client mode?
>
> Thanks,
> moon
>
>> On Sun, Oct 11, 2015 at 8:41 PM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>> Hi Moon,
>>
>> Yes, I have checked the same.
>>
>> I have put some debug statements in interpreter.sh to see what exactly is getting passed when I set SPARK_HOME in zeppelin-env.sh.
>>
>> The debug statements do show that it is using the spark-submit utility from the bin folder of the SPARK_HOME which I have set in zeppelin-env.sh.
>>
>> Regards,
>> Sourav
>>
>>> On Sun, Oct 11, 2015 at 2:55 AM, moon soo Lee <m...@apache.org> wrote:
>>> Could you make sure your zeppelin-env.sh has SPARK_HOME exported?
>>>
>>> Zeppelin (0.6.0-SNAPSHOT) uses the spark-submit command when SPARK_HOME is defined, but your error says "please use spark-submit".
>>>
>>> Thanks,
>>> moon
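For reference, the configuration moon is asking about amounts to a couple of exports in conf/zeppelin-env.sh. A minimal sketch, with illustrative placeholder paths rather than values taken from this thread:

    # conf/zeppelin-env.sh -- paths are illustrative placeholders
    export JAVA_HOME=/usr/java/default
    # A local Spark install; when set, interpreter.sh launches the Spark
    # interpreter via ${SPARK_HOME}/bin/spark-submit instead of plain java
    export SPARK_HOME=/usr/iop/current/spark-client
    # Lets spark-submit locate the YARN ResourceManager and HDFS
    export HADOOP_CONF_DIR=/etc/hadoop/conf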
>>>> On Thu, Oct 8, 2015 at 9:14 PM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>>> Hi Deepak/Moon,
>>>>
>>>> After seeing the stack trace of the error and the code in org.apache.zeppelin.spark.SparkInterpreter.java, I think this is surely a bug in the Spark interpreter code.
>>>>
>>>> The SparkInterpreter code always calls the constructor of org.apache.spark.SparkContext to create a new SparkContext whenever the SparkInterpreter class is loaded by org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer, hence this error.
>>>>
>>>> I'm not sure whether the check for yarn-cluster is newly added in SparkContext.
>>>>
>>>> Attaching here the complete stack trace for your ease of reference.
>>>>
>>>> Regards,
>>>> Sourav
>>>>
>>>> org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.
>>>>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:378)
>>>>     at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:339)
>>>>     at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
>>>>     at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:465)
>>>>     at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
>>>>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
>>>>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
>>>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
>>>>     at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
>>>>     at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>>>>     at java.util.concurrent.FutureTask.run(Unknown Source)
>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source)
>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>>>     at java.lang.Thread.run(Unknown Source)
>>>>
>>>>> On Mon, Oct 5, 2015 at 12:57 PM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>>>> I could execute the following without any issue:
>>>>>
>>>>> spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 1 --driver-memory 512m --executor-memory 512m --executor-cores 1 lib/spark-examples.jar 10
>>>>>
>>>>> Regards,
>>>>> Sourav
>>>>>
>>>>>> On Mon, Oct 5, 2015 at 12:04 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>>>> Did you try a test job with yarn-cluster (outside Zeppelin)?
>>>>>>
>>>>>>> On Mon, Oct 5, 2015 at 11:48 AM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>>>>>> Yes, I have them set up appropriately.
>>>>>>>
>>>>>>> Where I am lost is that I can see the interpreter running spark-submit, but at some point it switches to creating a SparkContext.
>>>>>>>
>>>>>>> Maybe, as you rightly mentioned, it is not able to run the driver on the YARN cluster because of some permission issue. But I'm not able to figure out what that issue/required configuration is.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Sourav
>>>>>>>
>>>>>>>> On Mon, Oct 5, 2015 at 11:38 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>>>>>> Do you have these settings configured in zeppelin-env.sh?
>>>>>>>>
>>>>>>>> export JAVA_HOME=/usr/src/jdk1.7.0_79/
>>>>>>>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>>>>>>>
>>>>>>>> Most likely you have this, as you're able to run with yarn-client.
>>>>>>>>
>>>>>>>> Looks like the issue is not being able to run the driver program on the cluster.
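The behavior being probed here is, roughly, that bin/interpreter.sh picks its launch path based on SPARK_HOME. A simplified sketch of that branch, not the verbatim script; the variable names and jar path are illustrative:

    # Sketch of the SPARK_HOME branch in bin/interpreter.sh (simplified)
    if [[ -n "${SPARK_HOME}" ]]; then
      # Launch the interpreter through spark-submit so Spark's own launcher
      # resolves the master, YARN config files, and classpath
      exec "${SPARK_HOME}/bin/spark-submit" \
        --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer \
        "${ZEPPELIN_HOME}/interpreter/spark/zeppelin-spark-0.6.0-SNAPSHOT.jar" \
        "${PORT}"
    else
      # No SPARK_HOME: plain JVM launch using Zeppelin's embedded Spark
      exec java -cp "${ZEPPELIN_CLASSPATH}" \
        org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer "${PORT}"
    fi

Note that even when launched through spark-submit, the interpreter still calls new SparkContext() in-process (SparkInterpreter.createSparkContext in the trace above), and that constructor is exactly what Spark rejects under yarn-cluster; this is consistent with the error surviving the spark-submit launch.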
>>>>>>>>> On Mon, Oct 5, 2015 at 11:13 AM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>>>>>>>> Yes, Spark is installed on the machine where Zeppelin is running.
>>>>>>>>>
>>>>>>>>> The location of spark.yarn.jar is very similar to what you have. I'm using IOP as the distribution, and the directory naming convention specific to IOP is different from HDP.
>>>>>>>>>
>>>>>>>>> And yes, the setup works perfectly fine when I use master yarn-client and the same settings for SPARK_HOME, HADOOP_CONF_DIR and HADOOP_CLIENT.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Sourav
>>>>>>>>>
>>>>>>>>>> On Mon, Oct 5, 2015 at 10:25 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>>>>>>>> Is Spark installed on your Zeppelin machine?
>>>>>>>>>>
>>>>>>>>>> I would try these:
>>>>>>>>>>
>>>>>>>>>> master: yarn-client
>>>>>>>>>> spark.home: the Spark installation home directory on your Zeppelin server
>>>>>>>>>>
>>>>>>>>>> Looking at spark.yarn.jar, I see Spark is installed at /usr/iop/current/spark-thriftserver/. But why is it thriftserver (I do not know what that is)?
>>>>>>>>>>
>>>>>>>>>> I have Spark installed (unzipped) on the Zeppelin machine at /usr/hdp/2.3.1.0-2574/spark/spark/ (it can be any location) and have spark.yarn.jar set to /usr/hdp/2.3.1.0-2574/spark/spark/lib/spark-assembly-1.4.1-hadoop2.6.0.jar.
>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 5, 2015 at 10:20 AM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>>>>>>>>>> Hi Deepu,
>>>>>>>>>>>
>>>>>>>>>>> Here you go.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Sourav
>>>>>>>>>>>
>>>>>>>>>>> Properties
>>>>>>>>>>> name                           value
>>>>>>>>>>> args
>>>>>>>>>>> master                         yarn-cluster
>>>>>>>>>>> spark.app.name                 Zeppelin
>>>>>>>>>>> spark.cores.max
>>>>>>>>>>> spark.executor.memory          512m
>>>>>>>>>>> spark.home
>>>>>>>>>>> spark.yarn.jar                 /usr/iop/current/spark-thriftserver/lib/spark-assembly.jar
>>>>>>>>>>> zeppelin.dep.localrepo         local-repo
>>>>>>>>>>> zeppelin.pyspark.python        python
>>>>>>>>>>> zeppelin.spark.concurrentSQL   false
>>>>>>>>>>> zeppelin.spark.maxResult       1000
>>>>>>>>>>> zeppelin.spark.useHiveContext  true
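Given moon's reply near the top of the thread that yarn-cluster is not expected to work, the one property in this table that should change (a suggestion distilled from the thread, not an official fix) is:

    master                         yarn-client

In yarn-client mode the driver, and with it the SparkContext that Zeppelin creates in-process, stays on the Zeppelin machine while only the executors run in YARN containers, which matches Sourav's report above that the identical setup works under yarn-client.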
>>>>>>>>>>>> On Mon, Oct 5, 2015 at 10:05 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>>>>>>>>>> Can you share a screenshot of your Spark interpreter on the Zeppelin web interface?
>>>>>>>>>>>>
>>>>>>>>>>>> I have the exact same deployment structure and it runs fine with the right set of configurations.
>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Oct 5, 2015 at 7:56 AM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>>>>>>>>>>>> Hi Moon,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm using 0.6.0-SNAPSHOT, which I built from the latest GitHub source.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I tried setting SPARK_HOME in zeppelin-env.sh. Also, by putting in some debug statements, I could see that control goes to the appropriate if-else block in interpreter.sh.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But I get the same error as follows:
>>>>>>>>>>>>>
>>>>>>>>>>>>> org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.
>>>>>>>>>>>>>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:378)
>>>>>>>>>>>>>     at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:339)
>>>>>>>>>>>>>     at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
>>>>>>>>>>>>>     at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:465)
>>>>>>>>>>>>>     at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
>>>>>>>>>>>>>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
>>>>>>>>>>>>>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
>>>>>>>>>>>>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
>>>>>>>>>>>>>     at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
>>>>>>>>>>>>>     at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
>>>>>>>>>>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>>>>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>>>>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>>>>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>>>>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let me know if you need any other details to figure out what is going on.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Sourav
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Sep 30, 2015 at 1:53 AM, moon soo Lee <m...@apache.org> wrote:
>>>>>>>>>>>>>> Which version of Zeppelin are you using?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The master branch uses the spark-submit command when SPARK_HOME is defined in conf/zeppelin-env.sh.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you're not on the master branch, I recommend trying it with SPARK_HOME defined.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hope this helps,
>>>>>>>>>>>>>> moon
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Sep 23, 2015 at 10:21 PM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When I try to run the Spark interpreter in yarn-cluster mode from a remote machine, I always get the error saying to use spark-submit rather than a SparkContext.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My Zeppelin process runs on a separate machine, remote to the YARN cluster.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Any idea why this error occurs?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>> Sourav
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Deepak
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Deepak
>>>>>>>>
>>>>>>>> --
>>>>>>>> Deepak
>>>>>>
>>>>>> --
>>>>>> Deepak