Fixed it:

mvn clean package -Pspark-1.3 -Dspark.version=1.3.1 -Dhadoop.version=2.7.0 -Phadoop-2.6 -Pyarn -DskipTests

Earlier I had:

mvn clean install -DskipTests -Pspark-1.3 -Dspark.version=1.3.1 -Phadoop-2.7 -Pyarn
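
What changed is that the new command pins the Hadoop build explicitly (-Dhadoop.version=2.7.0 together with the -Phadoop-2.6 profile), whereas the earlier one used -Phadoop-2.7 with no explicit hadoop.version (and install instead of package). Before picking those flags it is worth confirming what the cluster actually runs; a quick check, assuming the Hadoop and Spark client binaries are on the PATH of the build machine:

hadoop version          # prints the Hadoop build; match it to -Dhadoop.version / -Phadoop-x.y
spark-submit --version  # prints the Spark build; match it to -Dspark.version / -Pspark-x.y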

On Mon, Aug 3, 2015 at 1:31 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:

> I have a Hadoop cluster up using Ambari. It also allowed me to install Spark
> 1.3.1, and I can run sample Spark and YARN applications, so the cluster is up
> and running fine.
>
> I got Zeppelin set up on a new box and was able to launch the UI.
>
> I modified the Spark interpreter to set:
>
> master                          yarn-client
> spark.app.name                  Zeppelin
> spark.cores.max
> spark.driver.extraJavaOptions   -Dhdp.version=2.3.1.0-2574
> spark.executor.memory           512m
> spark.home                      /usr/hdp/2.3.1.0-2574/spark
> spark.yarn.am.extraJavaOptions  -Dhdp.version=2.3.1.0-2574
> spark.yarn.jar                  /home/zeppelin/incubator-zeppelin/interpreter/spark/zeppelin-spark-0.6.0-incubating-SNAPSHOT.jar
> zeppelin.dep.localrepo          local-repo
>
> When I run a Spark notebook:
>
> %spark
> val ambariLogs = sc.textFile("file:///var/log/ambari-agent/ambari-agent.log")
> ambariLogs.take(10).mkString("\n")
>
> (The location exists.)
>
> I see two exceptions in the Zeppelin Spark interpreter logs:
>
> ERROR [2015-08-03 13:30:50,262] ({pool-1-thread-2} ProcessFunction.java[process]:41) - Internal error processing getProgress
> java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$
>     at org.apache.spark.deploy.yarn.ClientArguments.<init>(ClientArguments.scala:38)
>     at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:55)
>     at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:381)
>     at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:301)
>     at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:146)
>     at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:423)
>     at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getProgress(LazyOpenInterpreter.java:109)
>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.getProgress(RemoteInterpreterServer.java:298)
>     at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:1068)
>     at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:1053)
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> AND
>
> WARN [2015-08-03 13:30:50,085] ({pool-1-thread-2} Logging.scala[logWarning]:71) - Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
> INFO [2015-08-03 13:30:50,112] ({pool-1-thread-2} Server.java[doStart]:272) - jetty-8.y.z-SNAPSHOT
> WARN [2015-08-03 13:30:50,123] ({pool-1-thread-2} AbstractLifeCycle.java[setFailed]:204) - FAILED SelectChannelConnector@0.0.0.0:4042: java.net.BindException: Address already in use
> java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind0(Native Method)
>     at sun.nio.ch.Net.bind(Net.java:444)
>     at sun.nio.ch.Net.bind(Net.java:436)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
>
> Any suggestions?
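
(The NoClassDefFoundError above is presumably what went away after the rebuild with matching Hadoop flags; "Could not initialize class" can also mean the class was found but failed to initialize, for example because of a runtime Hadoop version mismatch. The 'SparkUI could not bind' warnings are separate and usually harmless: another process already holds 4040/4041 and Spark keeps trying the next port.) One rough way to check whether the Spark YARN classes made it into the interpreter build at all -- the directory below is taken from the spark.yarn.jar setting above -- is:

# list the classes in each interpreter jar and look for the YARN helper class
for j in /home/zeppelin/incubator-zeppelin/interpreter/spark/*.jar; do
  jar tf "$j" | grep -q 'org/apache/spark/deploy/yarn/YarnSparkHadoopUtil' && echo "found in $j"
done

If nothing is printed, the build did not pull in Spark's YARN support and yarn-client mode cannot start.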

> On Mon, Aug 3, 2015 at 11:00 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>
>> Thanks a lot for all these documents. Appreciate your effort & time.
>>
>> On Mon, Aug 3, 2015 at 10:15 AM, Christian Tzolov <ctzo...@pivotal.io> wrote:
>>
>>> ÐΞ€ρ@Ҝ (๏̯͡๏),
>>>
>>> I've successfully run Zeppelin with Spark on YARN. I'm using Ambari and
>>> PivotalHD30. PHD30 is ODP compliant, so you should be able to repeat the
>>> configuration for HDP (i.e. Hortonworks).
>>>
>>> 1. Before you start with Zeppelin, make sure that your Spark/YARN environment
>>> works from the command line (e.g. run the Pi test; a concrete spark-submit
>>> sketch follows below this message). If it doesn't work, make sure that
>>> hdp.version is set correctly, or hardcode the stack.name and stack.version
>>> properties as Ambari custom yarn-site properties (that is what I did).
>>>
>>> 2. Zeppelin should be built with the proper Spark and Hadoop versions and
>>> with YARN support enabled. In my case I used this build command:
>>>
>>> mvn clean package -Pspark-1.4 -Dspark.version=1.4.1 -Dhadoop.version=2.6.0 -Phadoop-2.6 -Pyarn -DskipTests -Pbuild-distr
>>>
>>> 3. Open the Spark interpreter configuration, set the 'master' property to
>>> 'yarn-client' (i.e. master=yarn-client), then press Save.
>>>
>>> 4. In conf/zeppelin-env.sh set HADOOP_CONF_DIR; for PHD and HDP it will
>>> look like this:
>>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>>
>>> 5. (Optional) I restarted the Zeppelin daemon, but I don't think this is required.
>>>
>>> 6. Make sure the /user/<zeppelin user> folder exists in HDFS and that the
>>> Zeppelin user has write permissions on it. Otherwise you can create it like this:
>>> sudo -u hdfs hdfs dfs -mkdir /user/<zeppelin user>
>>> sudo -u hdfs hdfs dfs -chown -R <zeppelin user>:hdfs /user/<zeppelin user>
>>>
>>> Good to go!
>>>
>>> Cheers,
>>> Christian
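
For step 1, the command-line Pi test on an HDP-style layout can look roughly like this; the Spark home path comes from the interpreter settings earlier in the thread, and the examples jar location is an assumption to adjust to whatever the stack actually ships:

export HADOOP_CONF_DIR=/etc/hadoop/conf
/usr/hdp/2.3.1.0-2574/spark/bin/spark-submit \
  --master yarn-client \
  --class org.apache.spark.examples.SparkPi \
  /usr/hdp/2.3.1.0-2574/spark/lib/spark-examples*.jar 10

If this fails with hdp.version substitution errors, fix the YARN side (hdp.version, or the hardcoded stack.name/stack.version properties) before touching Zeppelin.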

>>> On 3 August 2015 at 17:50, Vadla, Karthik <karthik.va...@intel.com> wrote:
>>>
>>>> Hi Deepak,
>>>>
>>>> I have documented everything here. Please check the published document:
>>>> https://software.intel.com/sites/default/files/managed/bb/bf/Apache-Zeppelin.pdf
>>>>
>>>> Thanks
>>>> Karthik Vadla
>>>>
>>>> *From:* ÐΞ€ρ@Ҝ (๏̯͡๏) [mailto:deepuj...@gmail.com]
>>>> *Sent:* Sunday, August 2, 2015 9:25 PM
>>>> *To:* users@zeppelin.incubator.apache.org
>>>> *Subject:* Yarn + Spark + Zepplin ?
>>>>
>>>> Hello,
>>>>
>>>> I would like to try out Zeppelin, and hence I got a 7-node Hadoop cluster
>>>> with the Spark history server set up. I am able to run sample Spark
>>>> applications on my YARN cluster.
>>>>
>>>> I have no clue how to get Zeppelin to connect to this YARN cluster. Under
>>>> https://zeppelin.incubator.apache.org/docs/install/install.html I see
>>>> MASTER pointing to the Spark master. I do not have a Spark master running.
>>>>
>>>> How do I get Zeppelin to read data from the YARN cluster? Please share
>>>> documentation.
>>>>
>>>> Regards,
>>>> Deepak
>>>
>>> --
>>> Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect, EMEA Practice Team | Pivotal <http://pivotal.io/>
>>> ctzo...@pivotal.io | +31610285517
>>
>> --
>> Deepak
>
> --
> Deepak

--
Deepak
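
To the original question about MASTER: with Spark on YARN there is no standalone Spark master to point at; the interpreter's master property is set to yarn-client and Zeppelin finds the cluster through HADOOP_CONF_DIR. Pulling the thread together, a minimal sketch of the working setup -- the build flags and paths are the ones quoted above, and the 'zeppelin' user name is just an example:

# build Zeppelin with YARN support and versions that match the cluster
mvn clean package -Pspark-1.3 -Dspark.version=1.3.1 -Dhadoop.version=2.7.0 -Phadoop-2.6 -Pyarn -DskipTests

# conf/zeppelin-env.sh: point Zeppelin at the cluster's Hadoop configuration
export HADOOP_CONF_DIR=/etc/hadoop/conf

# give the Zeppelin user a writable home directory in HDFS
sudo -u hdfs hdfs dfs -mkdir /user/zeppelin
sudo -u hdfs hdfs dfs -chown -R zeppelin:hdfs /user/zeppelin

# restart the daemon (possibly optional), then set master=yarn-client in the Spark interpreter settings
bin/zeppelin-daemon.sh restart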