Sure! You can always update either the GitHub docs or the homepage by making a pull request (the homepage is maintained in the gh-pages branch). Thanks for considering contributing. Let me know if you need any help.
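For example, the usual GitHub fork-and-pull-request flow would look roughly like this (a sketch only; the fork URL, branch names, and commit message are illustrative):

    # fork apache/incubator-zeppelin on GitHub first, then:
    git clone https://github.com/<your-username>/incubator-zeppelin.git
    cd incubator-zeppelin
    git checkout gh-pages                  # the homepage is maintained on this branch
    git checkout -b document-nginx-setup   # work on a topic branch
    # ... edit or add pages ...
    git commit -am "Document secured nginx reverse proxy setup"
    git push origin document-nginx-setup
    # then open a pull request on GitHub against the upstream gh-pages branch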
Best,
moon

On Thu, Jul 30, 2015 at 4:42 PM Albert Yoon <yoon...@kanizsalab.com> wrote:

> Thanks moon,
>
> Now it works like a charm. Thank you for building such great software!
>
> And one more question: can I share my setup and configuration experience in the GitHub documentation or on the Zeppelin homepage?
>
> The working solution you gave me is not mentioned anywhere on GitHub, so newbies like me can easily get confused.
>
> I can also provide some documentation on a secured nginx reverse proxy setup for Zeppelin, for anyone who finds it a headache to configure Jetty for HTTPS/Basic auth.
>
> Can you give me a hint on how to contribute documentation?
>
> -----Original Message-----
> From: "Albert Yoon" <yoon...@kanizsalab.com>
> To: "moon soo Lee" <m...@apache.org>
> Cc: "users@zeppelin.incubator.apache.org" <users@zeppelin.incubator.apache.org>
> Sent: 2015-07-25 (Sat) 11:03:12
> Subject: Re: Local class incompatible?
>
> -----Original Message-----
> From: "moon soo Lee" <m...@apache.org>
> To: <users@zeppelin.incubator.apache.org>; "Albert Yoon" <yoon...@kanizsalab.com>
> Cc:
> Sent: 2015-07-25 (Sat) 06:48:18
> Subject: Re: Local class incompatible?
>
> Hi,
>
> Could you try building Zeppelin with -Dspark.version=1.4.1?
>
> mvn clean install -Pspark-1.4 -Dspark.version=1.4.1 -Ppyspark -Dhadoop.version=2.6.0 -Phadoop-2.6 -DskipTests
>
> Thanks,
> moon
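For reference, the "local class incompatible" traces below come from a serialVersionUID mismatch on org.apache.spark.api.python.PythonRDD: the Spark classes Zeppelin was built against differ from the ones the cluster executors run, which is exactly what pinning -Dspark.version to the cluster's version fixes. A quick sanity check of both sides might look like this (a sketch only; it assumes SPARK_HOME points at the cluster's Spark distribution and that the second command is run from the Zeppelin source root):

    # Spark version the executors actually run (on a cluster node)
    $SPARK_HOME/bin/spark-submit --version    # should report 1.4.1

    # Spark version the Zeppelin build compiled against
    mvn help:evaluate -Dexpression=spark.version -Pspark-1.4 | grep -v '^\['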
> On Fri, Jul 24, 2015 at 1:45 PM Albert Yoon <yoon...@kanizsalab.com> wrote:
>
> Hi,
>
> Although SPARK_HOME is set correctly and I built Z with -Ppyspark, I still get the "local class incompatible" error whenever I run any pyspark code in Z. The job is sent to and processed on the cluster correctly, but the error seems to be thrown after the final stage (Python RDD deserialization?). I followed the instructions from Jongyoul Lee, but they had no effect. Any workarounds?
>
> -----Original Message-----
> From: "Albert Yoon" <yoon...@kanizsalab.com>
> To: "Jongyoul Lee" <jongy...@gmail.com>
> Date: 2015.07.23 2:52:13 PM
> Subject: Re: Local class incompatible?
>
> Hi,
>
> I built Z with pyspark as below, after a fresh git clone:
>
> mvn clean install -Pspark-1.4 -Ppyspark -Dhadoop.version=2.6.0 -Phadoop-2.6 -DskipTests
> mvn clean install -Pspark-1.4 -Ppyspark -Dhadoop.version=2.7.0 -Phadoop-2.6 -DskipTests
>
> and then set the interpreter properties:
>
> master      spark://analytics-master:7077
> spark.home  /home/kanizsa.lab/spark/spark-1.4.1-bin-hadoop2.6
>
> but every attempt gives me the same error when using pyspark:
>
> Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
> : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 9, 10.32.10.97): java.io.InvalidClassException: org.apache.spark.api.python.PythonRDD; local class incompatible: stream classdesc serialVersionUID = 1521627685947625661, local class serialVersionUID = -2629548733386970437
>     at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:621)
>     at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
>     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
>     at org.apache.spark.scheduler.Task.run(Task.scala:70)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
>
> The exception in the Z interpreter log file looks like this:
>
> INFO [2015-07-23 14:43:15,127] ({sparkDriver-akka.actor.default-dispatcher-19} Logging.scala[logInfo]:59) - Starting task 0.0 in stage 6.0 (TID 606, 10.32.10.97, ANY, 1567 bytes)
> INFO [2015-07-23 14:43:15,128] ({sparkDriver-akka.actor.default-dispatcher-19} Logging.scala[logInfo]:59) - Starting task 1.0 in stage 6.0 (TID 607, 10.32.10.97, ANY, 1567 bytes)
> INFO [2015-07-23 14:43:15,154] ({sparkDriver-akka.actor.default-dispatcher-19} Logging.scala[logInfo]:59) - Added broadcast_7_piece0 in memory on 10.32.10.97:39254 (size: 3.5 KB, free: 265.1 MB)
> INFO [2015-07-23 14:43:15,155] ({sparkDriver-akka.actor.default-dispatcher-19} Logging.scala[logInfo]:59) - Added broadcast_7_piece0 in memory on 10.32.10.97:46622 (size: 3.5 KB, free: 265.1 MB)
> INFO [2015-07-23 14:43:15,173] ({sparkDriver-akka.actor.default-dispatcher-19} Logging.scala[logInfo]:59) - Starting task 2.0 in stage 6.0 (TID 608, 10.32.10.97, ANY, 1567 bytes)
> WARN [2015-07-23 14:43:15,178] ({task-result-getter-2} Logging.scala[logWarning]:71) - Lost task 1.0 in stage 6.0 (TID 607, 10.32.10.97): java.io.InvalidClassException: org.apache.spark.api.python.PythonRDD; local class incompatible: stream classdesc serialVersionUID = 1521627685947625661, local class serialVersionUID = -2629548733386970437
>     at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:621)
>     at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
>     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
>     at org.apache.spark.scheduler.Task.run(Task.scala:70)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
>
> INFO [2015-07-23 14:43:15,179] ({sparkDriver-akka.actor.default-dispatcher-19} Logging.scala[logInfo]:59) - Starting task 1.1 in stage 6.0 (TID 609, 10.32.10.97, ANY, 1567 bytes)
> INFO [2015-07-23 14:43:15,180] ({task-result-getter-1} Logging.scala[logInfo]:59) - Lost task 0.0 in stage 6.0 (TID 606) on executor 10.32.10.97: java.io.InvalidClassException (org.apache.spark.api.python.PythonRDD; local class incompatible: stream classdesc serialVersionUID = 1521627685947625661, local class serialVersionUID = -2629548733386970437) [duplicate 1]
> INFO [2015-07-23 14:43:15,181] ({sparkDriver-akka.actor.default-dispatcher-19} Logging.scala[logInfo]:59) - Starting task 0.1 in stage 6.0 (TID 610, 10.32.10.97, ANY, 1567 bytes)
> INFO [2015-07-23 14:43:15,181] ({task-result-getter-3} Logging.scala[logInfo]:59) - Lost task 2.0 in stage 6.0 (TID 608) on executor 10.32.10.97: java.io.InvalidClassException (org.apache.spark.api.python.PythonRDD; local class incompatible: stream classdesc serialVersionUID = 1521627685947625661, local class serialVersionUID = -2629548733386970437) [duplicate 2]
> INFO [2015-07-23 14:43:15,182] ({sparkDriver-akka.actor.default-dispatcher-19} Logging.scala[logInfo]:59) - Starting task 2.1 in stage 6.0 (TID 611, 10.32.10.97, ANY, 1567 bytes)
> INFO [2015-07-23 14:43:15,183] ({task-result-getter-0} Logging.scala[logInfo]:59) - Lost task 1.1 in stage 6.0 (TID 609) on executor 10.32.10.97: java.io.InvalidClassException (org.apache.spark.api.python.PythonRDD; local class incompatible: stream classdesc serialVersionUID = 1521627685947625661, local class serialVersionUID = -2629548733386970437) [duplicate 3]
> INFO [2015-07-23 14:43:15,185] ({sparkDriver-akka.actor.default-dispatcher-19} Logging.scala[logInfo]:59) - Starting task 1.2 in stage 6.0 (TID 612, 10.32.10.97, ANY, 1567 bytes)
> INFO [2015-07-23 14:43:15,185] ({task-result-getter-2} Logging.scala[logInfo]:59) - Lost task 0.1 in stage 6.0 (TID 610) on executor 10.32.10.97: java.io.InvalidClassException (org.apache.spark.api.python.PythonRDD; local class incompatible: stream classdesc serialVersionUID = 1521627685947625661, local class serialVersionUID = -2629548733386970437) [duplicate 4]
> INFO [2015-07-23 14:43:15,186] ({sparkDriver-akka.actor.default-dispatcher-19} Logging.scala[logInfo]:59) - Starting task 0.2 in stage 6.0 (TID 613, 10.32.10.97, ANY, 1567 bytes)
> INFO [2015-07-23 14:43:15,187] ({task-result-getter-1} Logging.scala[logInfo]:59) - Lost task 2.1 in stage 6.0 (TID 611) on executor 10.32.10.97: java.io.InvalidClassException (org.apache.spark.api.python.PythonRDD; local class incompatible: stream classdesc serialVersionUID = 1521627685947625661, local class serialVersionUID = -2629548733386970437) [duplicate 5]
> INFO [2015-07-23 14:43:15,189] ({sparkDriver-akka.actor.default-dispatcher-19} Logging.scala[logInfo]:59) - Starting task 2.2 in stage 6.0 (TID 614, 10.32.10.97, ANY, 1567 bytes)
> INFO [2015-07-23 14:43:15,189] ({task-result-getter-3} Logging.scala[logInfo]:59) - Lost task 1.2 in stage 6.0 (TID 612) on executor 10.32.10.97: java.io.InvalidClassException (org.apache.spark.api.python.PythonRDD; local class incompatible: stream classdesc serialVersionUID = 1521627685947625661, local class serialVersionUID = -2629548733386970437) [duplicate 6]
> INFO [2015-07-23 14:43:15,190] ({sparkDriver-akka.actor.default-dispatcher-19} Logging.scala[logInfo]:59) - Starting task 1.3 in stage 6.0 (TID 615, 10.32.10.97, ANY, 1567 bytes)
> INFO [2015-07-23 14:43:15,190] ({task-result-getter-0} Logging.scala[logInfo]:59) - Lost task 0.2 in stage 6.0 (TID 613) on executor 10.32.10.97: java.io.InvalidClassException (org.apache.spark.api.python.PythonRDD; local class incompatible: stream classdesc serialVersionUID = 1521627685947625661, local class serialVersionUID = -2629548733386970437) [duplicate 7]
> INFO [2015-07-23 14:43:15,192] ({sparkDriver-akka.actor.default-dispatcher-19} Logging.scala[logInfo]:59) - Starting task 0.3 in stage 6.0 (TID 616, 10.32.10.97, ANY, 1567 bytes)
> INFO [2015-07-23 14:43:15,192] ({task-result-getter-2} Logging.scala[logInfo]:59) - Lost task 2.2 in stage 6.0 (TID 614) on executor 10.32.10.97: java.io.InvalidClassException (org.apache.spark.api.python.PythonRDD; local class incompatible: stream classdesc serialVersionUID = 1521627685947625661, local class serialVersionUID = -2629548733386970437) [duplicate 8]
> INFO [2015-07-23 14:43:15,193] ({sparkDriver-akka.actor.default-dispatcher-19} Logging.scala[logInfo]:59) - Starting task 2.3 in stage 6.0 (TID 617, 10.32.10.97, ANY, 1567 bytes)
> INFO [2015-07-23 14:43:15,193] ({task-result-getter-1} Logging.scala[logInfo]:59) - Lost task 1.3 in stage 6.0 (TID 615) on executor 10.32.10.97: java.io.InvalidClassException (org.apache.spark.api.python.PythonRDD; local class incompatible: stream classdesc serialVersionUID = 1521627685947625661, local class serialVersionUID = -2629548733386970437) [duplicate 9]
> ERROR [2015-07-23 14:43:15,194] ({task-result-getter-1} Logging.scala[logError]:75) - Task 1 in stage 6.0 failed 4 times; aborting job
> INFO [2015-07-23 14:43:15,197] ({task-result-getter-3} Logging.scala[logInfo]:59) - Lost task 0.3 in stage 6.0 (TID 616) on executor 10.32.10.97: java.io.InvalidClassException (org.apache.spark.api.python.PythonRDD; local class incompatible: stream classdesc serialVersionUID = 1521627685947625661, local class serialVersionUID = -2629548733386970437) [duplicate 10]
> INFO [2015-07-23 14:43:15,202] ({task-result-getter-0} Logging.scala[logInfo]:59) - Lost task 2.3 in stage 6.0 (TID 617) on executor 10.32.10.97: java.io.InvalidClassException (org.apache.spark.api.python.PythonRDD; local class incompatible: stream classdesc serialVersionUID = 1521627685947625661, local class serialVersionUID = -2629548733386970437) [duplicate 11]
> INFO [2015-07-23 14:43:15,202] ({task-result-getter-0} Logging.scala[logInfo]:59) - Removed TaskSet 6.0, whose tasks have all completed, from pool default
> INFO [2015-07-23 14:43:15,203] ({dag-scheduler-event-loop} Logging.scala[logInfo]:59) - Cancelling stage 6
> INFO [2015-07-23 14:43:15,205] ({dag-scheduler-event-loop} Logging.scala[logInfo]:59) - ResultStage 6 (count at <string>:2) failed in 0.078 s
> INFO [2015-07-23 14:43:15,206] ({Thread-58} Logging.scala[logInfo]:59) - Job 3 failed: count at <string>:2, took 0.093380 s
> INFO [2015-07-23 14:43:15,216] ({pool-2-thread-4} SchedulerFactory.java[jobFinished]:138) - Job remoteInterpretJob_1437630194249 finished by scheduler interpreter_1373012994
>
> -----Original Message-----
> From: "Jongyoul Lee" <jongy...@gmail.com>
> To: "users@zeppelin.incubator.apache.org" <users@zeppelin.incubator.apache.org>; "Albert Yoon" <yoon...@kanizsalab.com>
> Cc:
> Sent: 2015-07-23 (Thu) 10:39:03
> Subject: Re: Local class incompatible?
>
> Hi,
>
> You should add -Ppyspark when you want to use Z with pyspark.
>
> Regards,
> JL
>
> On Thursday, July 23, 2015, Albert Yoon <yoon...@kanizsalab.com> wrote:
>
> Hi, I have a test Zeppelin instance connected to my Spark cluster, and the cluster is set up like this:
>
> Spark: spark-1.4.1-bin-hadoop2.6
> Hadoop: hadoop-2.7.1
>
> Zeppelin installed with: mvn clean install -Pspark-1.4 -Dhadoop.version=2.6.0 -Phadoop-2.6 -DskipTests
>
> and I ran test code like this:
>
> val textFile = sc.textFile("hdfs://analytics-master:9000/user/kanizsa.lab/gutenberg")
> println(textFile.count())
>
> val textFile = sc.textFile("README.md")
> println(textFile.count())
>
> This code works well in the scala interpreter, but I hit an error when executing similar code under %pyspark.
> I also ran the code in the pyspark command-line shell, and there it works like a charm, with no errors.
>
> How do I resolve this error? It seems something inside Zeppelin is not compatible with my Spark cluster...
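The failing %pyspark paragraph is not shown, but it was presumably the Python equivalent of the scala test above, something like this hypothetical reconstruction (the HDFS path is taken from the scala snippet):

    %pyspark
    # hypothetical reconstruction of the failing paragraph; the original was not shown
    textFile = sc.textFile("hdfs://analytics-master:9000/user/kanizsa.lab/gutenberg")
    print textFile.count()

which then fails with the trace below.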
> Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
> : org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 5.0 failed 4 times, most recent failure: Lost task 1.3 in stage 5.0 (TID 52, 10.32.10.97): java.io.InvalidClassException: org.apache.spark.api.python.PythonRDD; local class incompatible: stream classdesc serialVersionUID = 1521627685947625661, local class serialVersionUID = -2629548733386970437
>     at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:621)
>     at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
>     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
>     at org.apache.spark.scheduler.Task.run(Task.scala:70)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
>
> Driver stacktrace:
>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1266)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1257)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1256)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>     at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1256)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
>     at scala.Option.foreach(Option.scala:236)
>     at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1450)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1411)
>     at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>
> (<class 'py4j.protocol.Py4JJavaError'>, Py4JJavaError(u'An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.\n', JavaObject id=o93), <traceback object at 0x1bde290>)

--
이종열, Jongyoul Lee, 李宗烈
http://madeng.net
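As for the secured nginx reverse proxy setup Albert offers to document above, a minimal sketch might look like the following (illustrative only: the server name, certificate paths, and upstream port are assumptions, with Zeppelin assumed on its default port 8080; the Upgrade/Connection headers are included because the notebook UI communicates over WebSockets):

    server {
        listen 443 ssl;
        server_name zeppelin.example.com;                  # assumed host name

        ssl_certificate     /etc/nginx/ssl/zeppelin.crt;   # assumed cert paths
        ssl_certificate_key /etc/nginx/ssl/zeppelin.key;

        auth_basic           "Zeppelin";
        auth_basic_user_file /etc/nginx/.htpasswd;         # created with htpasswd

        location / {
            proxy_pass http://127.0.0.1:8080;              # zeppelin.server.port
            proxy_http_version 1.1;
            proxy_set_header Host $host;
            proxy_set_header Upgrade $http_upgrade;        # WebSocket upgrade
            proxy_set_header Connection "upgrade";
        }
    }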