Hi Ram, Jongyoul,

Many thanks for your response. I was not able to test your suggestions
earlier.

I tried replacing interpreter/spark/spark*.jar with my custom Spark jar,
but that gives me an exception saying the Zeppelin Spark interpreter was not
found. It looks like the Zeppelin Spark jar is a fat jar that bundles the
Spark and Zeppelin code together? I can work around this for now, so it is
okay, but it would be good to know how a custom Spark jar is supposed to
work.
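
In case it helps anyone else, the workaround I'm using for now is leaving
interpreter/spark alone and pointing Zeppelin at my custom assembly instead,
via SPARK_YARN_JAR in conf/zeppelin-env.sh (the jar path below is just my
local one, so adjust it to your own build):

  export SPARK_YARN_JAR=/opt/spark-custom/lib/spark-assembly-1.2.0-hadoop2.5.0-cdh5.3.0.jar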

By the way, I already had hive-site.xml in the Spark conf directory, but
since it was not being picked up there, I copied it over to Zeppelin's conf
directory and now it is being used.
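
For anyone else hitting this, the copy itself was just the following
(assuming the usual CDH location for the Hive config; adjust the source
path to your setup, and $ZEPPELIN_HOME is wherever Zeppelin is installed):

  cp /etc/hive/conf/hive-site.xml $ZEPPELIN_HOME/conf/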

Charmee

On Wed, Apr 15, 2015 at 9:19 AM Ram Venkatesh <[email protected]>
wrote:

>  Hi Charmee,
>
>  I have successfully configured Spark in yarn-client mode on HDP by
> setting spark.home and spark.yarn.jar appropriately. I have not validated
> this against CDH.
>
>  For Hive metastore access, yes, you need to have a hive-site.xml in your
> zeppelin/conf directory or your SPARK_CONF directory.
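>
>  For reference, a rough sketch of how those two properties look in the
> Spark interpreter settings. The paths are examples from an HDP-style
> layout, not something to copy verbatim:
>
>  spark.home      /usr/hdp/current/spark-client
>  spark.yarn.jar  /usr/hdp/current/spark-client/lib/spark-assembly.jar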
>
>  HTH
> Ram
>
>   On Apr 14, 2015, at 8:33 PM, Charmee Patel <[email protected]> wrote:
>
>  One more question on yarn-mode.
>
>  After setting the Spark Yarn Jar path and the Spark Home path in the
> Spark interpreter (via the UI), I was expecting Zeppelin to use my own
> version of Spark. I don't think that is happening. From Zeppelin, sc.version
> gives 1.2.1 (which is the version included for CDH 5.3). When I run
> spark-shell from my custom Spark lib, the version shows as 1.2.0.
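>
>  One quick check I'm running in a notebook paragraph to see which assembly
> jar the SparkContext class was actually loaded from (just a debugging
> sketch; getCodeSource can be null for bootstrap classes):
>
>  println(classOf[org.apache.spark.SparkContext]
>    .getProtectionDomain.getCodeSource.getLocation)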
>
>  I have set Spark Yarn Jar and Spark Home (from the interpreter UI) and
> the Hadoop conf dir (in zeppelin-env.sh). Do I need to do anything else for
> Zeppelin to use my Spark jar?
>
>  Also, my HiveContext does not recognize the databases/tables on the
> cluster. Do I need to point to or copy hive-site.xml somewhere in Zeppelin's
> conf?
>
>  Thanks,
> Charmee
>
>
>
> On Tue, Apr 14, 2015 at 9:32 PM Jongyoul Lee <[email protected]> wrote:
>
>> Good!!
>>
>> On Tue, Apr 14, 2015 at 11:14 PM, Charmee Patel <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>>  Thanks.
>>>
>>>  I had pulled the code from the https://github.com/NFLabs/zeppelin
>>> repository before it moved to apache/incubator-zeppelin, and I kept
>>> pulling from the same repo. I synced my version with the latest on
>>> apache/incubator-zeppelin and it is working now.
>>>
>>>  -Charmee
>>>
>>> On Mon, Apr 13, 2015 at 9:12 PM Jongyoul Lee <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>>  I don't know exactly which version you are using, because
>>>> com.nflabs.zeppelin has moved to org.apache.zeppelin. You don't need to
>>>> set SPARK_YARN_JAR on the latest master. Could you please check that out?
>>>> I've tested yarn-client mode on 2.5.0-cdh5.3.0 with Spark 1.3, but I
>>>> didn't try it with Spark 1.2. My build script is
>>>>
>>>>  mvn clean package -Pspark-1.3 -Dhadoop.version=2.5.0-cdh5.3.0
>>>> -Phadoop-2.4 -DskipTests -Pyarn -Pbuild-distr
>>>>
>>>>  Regards,
>>>> Jongyoul Lee
>>>>
>>>> On Mon, Apr 13, 2015 at 11:40 PM, Charmee Patel <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>>  I am trying to get Zeppelin to work in yarn-client and yarn-cluster
>>>>> mode. There are some conflicting notes on the mailing list about how to
>>>>> get Zeppelin to work in yarn-client mode. I have tried a few different
>>>>> things, but none have worked for me so far.
>>>>>
>>>>>  I have CDH 5.3. Here is what I have done so far:
>>>>>
>>>>>    1. Built Zeppelin locally (on OSX) using:
>>>>>       mvn clean package -Pspark-1.2 -Dhadoop.version=2.5.0-cdh5.3.0
>>>>>       -Phadoop-2.4 -DskipTests -Pyarn
>>>>>    2. Generated a distribution package and deployed it to an edge node
>>>>>       of my cluster using:
>>>>>       mvn clean package -P build-distr -DskipTests
>>>>>    3. At this point local mode works fine.
>>>>>    4. To get YARN mode working:
>>>>>       a. I set the Hadoop conf dir in zeppelin-env.sh and set the
>>>>>          master to yarn-client. I get an exception in my logs:
>>>>>
>>>>>          ERROR ({pool-2-thread-5} Job.java[run]:165) - Job failed
>>>>>          com.nflabs.zeppelin.interpreter.InterpreterException:
>>>>>          org.apache.thrift.TApplicationException: Internal error
>>>>>          processing open
>>>>>
>>>>>       b. Set SPARK_YARN_JAR, but got the same exception as above.
>>>>>       c. Copied my Spark assembly jar into the interpreter/spark
>>>>>          directory, but that did not work either.
>>>>>       d. Also set Spark Yarn Jar and Spark Home in the interpreter UI,
>>>>>          but that did not work.
>>>>>       e. Took Spark Yarn Jar out and let Zeppelin use the Spark that
>>>>>          comes bundled with it. This seemed to work initially, but as
>>>>>          soon as I call a Spark action (collect/count etc.) I get this
>>>>>          exception:
>>>>>
>>>>>  java.lang.RuntimeException: Error in configuring object
>>>>>    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>>>>    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>>>>    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>>>>    at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:182)
>>>>>
>>>>>  Any pointers on what I might have missed in configuring Zeppelin to
>>>>> work in yarn-client mode?
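>>>>>
>>>>>  For reference, the YARN-related part of my zeppelin-env.sh currently
>>>>> looks roughly like this (the conf path is from my edge node and may
>>>>> differ on other setups):
>>>>>
>>>>>  export HADOOP_CONF_DIR=/etc/hadoop/conf
>>>>>  export MASTER=yarn-client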
>>>>>
>>>>>  Thanks,
>>>>>  Charmee
>>>>>
>>>>
>>>>
>>>>
>>>>  --
>>>>  이종열, Jongyoul Lee, 李宗烈
>>>>  http://madeng.net
>>>>
>>>
>>
>>
>>  --
>>  이종열, Jongyoul Lee, 李宗烈
>>  http://madeng.net
>>
>
>
