Hi Kevin,

I was mistaken. Spark 1.2 needs snappyjava because Snappy is the default compression codec in Spark 1.2. That's all.
My workaround is to start Zeppelin with JAVA_OPTS="-Dspark.io.compression.codec=lzf" ./zeppelin-daemon.sh start. I've also found that we don't have to set SPARK_CLASSPATH: it is the same as ZEPPELIN_CLASSPATH, which in turn has the same value as CLASSPATH.

I have a new question. Is there any way to set "spark.*" values? Normally spark-submit reads spark-defaults.conf, but because Zeppelin doesn't use spark-submit, we cannot set spark.* values except through Java system properties.

Regards,
JL

On Thu, Jan 29, 2015 at 5:18 PM, Kevin Kim (Sangwoo) <[email protected]> wrote:

> Cool, please send pull request, I'll look into it!
>
> On Thu Jan 29 2015 at 2:41:41 PM Jongyoul Lee <[email protected]> wrote:
>
> > Hi Kevin,
> >
> > ADD_JARS is a good way to solve it. But, snappyjava depends on zeppelin.
> > Zeppelin should add its jars in an appropriate way. My PR is about adding
> > jars and might be a very small change.
> >
> > Regards,
> > JL
> >
> > On Thu, Jan 29, 2015 at 2:34 PM, Kevin (Sangwoo) Kim <[email protected]> wrote:
> >
> > > Well, for me,
> > > When I need to supply external libraries,
> > > I'm using
> > > export ADD_JARS="~~~.jar"
> > > export ZEPPELIN_CLASSPATH="~~~.jar"
> > > in zeppelin-env.sh
> > >
> > > and using ADD_JARS="~~~.jar"
> > > in spark-env.sh for spark clusters. (the library jar is deployed across
> > > all clusters)
> > >
> > > I want to note that the config I'm using is quite old and deprecated.
> > > So I'm testing #308 to replace this.
> > >
> > > Of course a contribution is always welcome. It would be cool to supply it
> > > via PR if the code is simple; if the code is large, it would be good to
> > > discuss it before writing code.
> > >
> > > Regards,
> > > Kevin
> > >
> > >
> > > On Thu Jan 29 2015 at 2:21:34 PM Jongyoul Lee <[email protected]> wrote:
> > >
> > > > I'll resend the email because my attachment's size is larger than
> > > > 1000000 bytes
> > > >
> > > >
> > > > ---------- Forwarded message ----------
> > > > From: Jongyoul Lee <[email protected]>
> > > > Date: Thu, Jan 29, 2015 at 2:14 PM
> > > > Subject: Re: Zeppelin with external cluster
> > > > To: [email protected]
> > > >
> > > >
> > > > Hi Kevin,
> > > >
> > > > I also changed master to spark://dicc-m002:7077. Actually, I think
> > > > interpreter.json affects which cluster is used when running code.
> > > > Anyway, my interpreter screenshot is below, and my error is like this.
> > > >
> > > > org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 6, DICc-r1n029): java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path
> > > >     at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886)
> > > >     at java.lang.Runtime.loadLibrary0(Runtime.java:849)
> > > >     at java.lang.System.loadLibrary(System.java:1088)
> > > >     at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:170)
> > > >     at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:145)
> > > >     at org.xerial.snappy.Snappy.<clinit>(Snappy.java:47)
> > > >     at org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:358)
> > > >     at org.xerial.snappy.SnappyInputStream.rawRead(SnappyInputStream.java:167)
> > > >     at org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:150)
> > > >     at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2310)
> > > >     at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2323)
> > > >     at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2794)
> > > >     at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:801)
> > > >     at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
> > > >     at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.<init>(JavaSerializer.scala:57)
> > > >     at org.apache.spark.serializer.JavaDeserializationStream.<init>(JavaSerializer.scala:57)
> > > >     at org.apache.spark.serializer.JavaSerializerInstance.deserializeStream(JavaSerializer.scala:95)
> > > >     at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:215)
> > > >     at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:177)
> > > >     at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1000)
> > > >     at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
> > > >     at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
> > > >     at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
> > > >     at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
> > > >     at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
> > > >     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
> > > >     at org.apache.spark.scheduler.Task.run(Task.scala:56)
> > > >     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
> > > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > >     at java.lang.Thread.run(Thread.java:744)
> > > > Driver stacktrace:
> > > >     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214)
> > > >     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203)
> > > >     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1202)
> > > >     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> > > >     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> > > >     at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1202)
> > > >     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
> > > >     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
> > > >     at scala.Option.foreach(Option.scala:236)
> > > >     at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:696)
> > > >     at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1420)
> > > >     at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> > > >     at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundReceive(DAGScheduler.scala:1375)
> > > >     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> > > >     at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> > > >     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
> > > >     at akka.dispatch.Mailbox.run(Mailbox.scala:220)
> > > >     at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
> > > >     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> > > >     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> > > >     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> > > >     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> > > >
> > > > I think that this error is about the class path. I'm running Zeppelin
> > > > under /home/1001079/apache-zeppelin, which means all classes are located
> > > > under this directory. Because Zeppelin adds classes to SPARK_CLASSPATH,
> > > > if a slave doesn't have those libraries on the same path, a missing-class
> > > > error might occur.
> > > >
> > > > I want to contribute by fixing this issue. Could you please tell me the
> > > > regular steps for dealing with an issue? Or is it OK to make a PR without
> > > > a JIRA issue?
> > > >
> > > > Regards,
> > > > JL
> > > >
> > > > On Thu, Jan 29, 2015 at 1:55 PM, Kevin (Sangwoo) Kim <[email protected]> wrote:
> > > >
> > > >> Hi Jongyoul,
> > > >> I'm using Zeppelin with an external cluster.
> > > >> (standalone mode)
> > > >>
> > > >> All I needed to do was write the master setting like
> > > >> export MASTER="spark://IP-ADDRESS:7077"
> > > >> in $ZEPPELIN/conf/zeppelin-env.sh
> > > >>
> > > >> If your error persists, please post the error message in reply!
> > > >> I'm going to look at it.
> > > >>
> > > >> Regards,
> > > >> Kevin
> > > >>
> > > >>
> > > >> On Thu Jan 29 2015 at 12:58:41 PM Jongyoul Lee <[email protected]> wrote:
> > > >>
> > > >> > Hi dev,
> > > >> >
> > > >> > I've succeeded in running Zeppelin with Spark 1.2. Thanks, Moon. Now,
> > > >> > I'm trying to use Zeppelin with an external cluster. I tested
> > > >> > yesterday with standalone and mesos, but the results are not good.
> > > >> > In the case of standalone, a "no snappyjava" error occurs, and in the
> > > >> > case of mesos, nothing happens. Do you have any reference for running
> > > >> > Zeppelin with an external cluster? If you don't have any, I can write
> > > >> > a reference for running with an external cluster.
> > > >> >
> > > >> > Regards,
> > > >> > JL
> > > >> >
> > > >> > --
> > > >> > 이종열, Jongyoul Lee, 李宗烈
> > > >> > http://madeng.net
> > > >> >
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > 이종열, Jongyoul Lee, 李宗烈
> > > > http://madeng.net
> > > >
> > > >
> > > >
> > > > --
> > > > 이종열, Jongyoul Lee, 李宗烈
> > > > http://madeng.net
> > > >
> > >
> >
> >
> >
> > --
> > 이종열, Jongyoul Lee, 李宗烈
> > http://madeng.net
> >
>

--
이종열, Jongyoul Lee, 李宗烈
http://madeng.net
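
On the question at the top of this thread about setting spark.* values without spark-submit: one possible workaround, in the spirit of the JAVA_OPTS line already used there, is to convert a spark-defaults.conf style file into -D system properties. This is only a minimal sketch, not something Zeppelin provides. It assumes whitespace-separated "key value" lines and values that contain no spaces; the function name spark_props_to_java_opts and the conf/spark-defaults.conf path are hypothetical.

```shell
# Sketch: turn "key value" lines from a spark-defaults.conf style file into
# -Dkey=value JVM flags. The function name is hypothetical, not a Zeppelin API.
spark_props_to_java_opts() {
  local conf_file="$1"
  local opts=""
  local key value
  # read splits each line on the first run of whitespace: the first word goes
  # to $key and the rest of the line to $value. The trailing test keeps a
  # final line that has no terminating newline.
  while read -r key value || [ -n "$key" ]; do
    case "$key" in
      ''|'#'*) continue ;;                      # skip blank lines and comments
      spark.*) opts="$opts -D$key=$value" ;;    # keep only spark.* properties
    esac
  done < "$conf_file"
  # Drop the leading space left by the first append.
  printf '%s\n' "${opts# }"
}

# Possible usage (the conf path is an assumption):
#   JAVA_OPTS="$(spark_props_to_java_opts conf/spark-defaults.conf)" ./zeppelin-daemon.sh start
```

Because the result is passed through an unquoted JAVA_OPTS string, a property value containing spaces would be split into separate arguments, so this sketch only suits simple values such as spark.io.compression.codec lzf.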
