One more question on yarn mode. After setting the Spark Yarn Jar path and Spark Home path in the Spark interpreter (via the UI), I was expecting Zeppelin to use my own version of Spark, but I don't think that is happening. From Zeppelin, sc.version gives 1.2.1 (the version included with CDH 5.3). When I run spark-shell from my custom Spark lib, the version shows as 1.2.0.
I have set Spark Yarn Jar and Spark Home (from the interpreter UI) and the Hadoop conf dir (in zeppelin-env.sh). Do I need to do anything else for Zeppelin to use my Spark jar? Also, my Hive context does not recognize the databases/tables on the cluster. Do I need to point to or copy hive-site.xml anywhere in Zeppelin's conf?

Thanks,
Charmee

On Tue, Apr 14, 2015 at 9:32 PM Jongyoul Lee <[email protected]> wrote:

> Good!!
>
> On Tue, Apr 14, 2015 at 11:14 PM, Charmee Patel <[email protected]> wrote:
>
>> Hi,
>>
>> Thanks.
>>
>> I had pulled the code from the https://github.com/NFLabs/zeppelin repository before it moved to apache/incubator-zeppelin, and I kept pulling from the same repo. I synced my version with the latest on apache/incubator-zeppelin and it is working now.
>>
>> -Charmee
>>
>> On Mon, Apr 13, 2015 at 9:12 PM Jongyoul Lee <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I don't know exactly which version you are using, because com.nflabs.zeppelin moved to org.apache.zeppelin. You don't need to set SPARK_YARN_JAR on the latest master. Could you please check that? I've tested yarn-client mode on 2.5.0-cdh5.3.0 with Spark 1.3, but didn't test it with Spark 1.2. My build script is
>>>
>>> mvn clean package -Pspark-1.3 -Dhadoop.version=2.5.0-cdh5.3.0 -Phadoop-2.4 -DskipTests -Pyarn -Pbuild-distr
>>>
>>> Regards,
>>> Jongyoul Lee
>>>
>>> On Mon, Apr 13, 2015 at 11:40 PM, Charmee Patel <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to get Zeppelin working in yarn-client and yarn-submit mode. There are some conflicting notes on the mailing list about how to get Zeppelin to work in yarn-client mode. I have tried a few different things, but none have worked for me so far.
>>>>
>>>> I have CDH 5.3. Here is what I have done so far:
>>>>
>>>> 1. Built Zeppelin locally (OS X) using
>>>>
>>>>    mvn clean package -Pspark-1.2 -Dhadoop.version=2.5.0-cdh5.3.0 -Phadoop-2.4 -DskipTests -Pyarn
>>>>
>>>> 2. Generated a distribution package and deployed it to an edge node of my cluster using
>>>>
>>>>    mvn clean package -P build-distr -DskipTests
>>>>
>>>> 3. At this point local mode works fine.
>>>>
>>>> 4. To get YARN mode working:
>>>>
>>>>    1. I set the Hadoop conf dir in zeppelin-env.sh and set Yarn Master=yarn-client. I get an exception in my logs:
>>>>
>>>>       ERROR ({pool-2-thread-5} Job.java[run]:165) - Job failed
>>>>       com.nflabs.zeppelin.interpreter.InterpreterException:
>>>>       org.apache.thrift.TApplicationException: Internal error processing open
>>>>
>>>>    2. Set Spark_Yarn_Jar but got the same exception as above.
>>>>    3. Copied my Spark assembly jar into the interpreter/spark directory, but that did not work either.
>>>>    4. Also set Spark Yarn Jar and Spark Home in the interpreter UI, but that did not work.
>>>>    5. Took Spark Yarn Jar out and let Zeppelin use the Spark that comes bundled with it. This seemed to work initially, but as soon as I call a Spark action (collect/count etc.) I get this exception:
>>>>
>>>>       java.lang.RuntimeException: Error in configuring object
>>>>           at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>>>           at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>>>           at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>>>           at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:182)
>>>>
>>>> Any pointers on what I might have missed in configuring Zeppelin to work with yarn-client mode?
>>>>
>>>> Thanks,
>>>> Charmee
>>>
>>> --
>>> 이종열, Jongyoul Lee, 李宗烈
>>> http://madeng.net
>
> --
> 이종열, Jongyoul Lee, 李宗烈
> http://madeng.net
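[Editor's note: the settings discussed in the thread above can be sketched as follows. This is a minimal, hypothetical example; the SPARK_HOME, HADOOP_CONF_DIR, and hive-site.xml paths are placeholders for your own cluster layout, and the hive-site.xml copy is one common way to make a cluster metastore visible to Spark's HiveContext, not a Zeppelin-documented requirement.]

```shell
#!/bin/sh
# Sketch: write a conf/zeppelin-env.sh for yarn-client mode.
# Writes to a temp dir here; in practice this goes in ZEPPELIN_HOME/conf.
ZEPPELIN_CONF=$(mktemp -d)

cat > "$ZEPPELIN_CONF/zeppelin-env.sh" <<'EOF'
export SPARK_HOME=/opt/spark-1.2.0        # custom Spark build, not the CDH one
export HADOOP_CONF_DIR=/etc/hadoop/conf   # CDH client configs (yarn-site.xml etc.)
export MASTER=yarn-client
EOF

# Spark's HiveContext reads hive-site.xml from $SPARK_HOME/conf, so copying
# it there is one way to let Zeppelin see the cluster's databases/tables
# (path is a placeholder; commented out since it is cluster-specific):
# cp /etc/hive/conf/hive-site.xml "$SPARK_HOME/conf/"

grep -q 'yarn-client' "$ZEPPELIN_CONF/zeppelin-env.sh" && echo "config written"
```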
