>>> It appears that at execution time on the YARN hosts, the native CDH
>>> Spark 1.5 jars are loaded before the new Spark 2 jars. I've tried using
>>> spark.yarn.archive to point at the Spark 2 jars in HDFS, as well as other
>>> Spark options, none of which seems to make a difference.

Where do you see “Spark 1.5 jars are loaded before the new Spark 2 jars”?

Best Regards,
Jeff Zhang


From: Rob Anderson <rockclimbings...@gmail.com>
Reply-To: "users@zeppelin.apache.org" <users@zeppelin.apache.org>
Date: Wednesday, March 8, 2017 at 2:29 AM
To: "users@zeppelin.apache.org" <users@zeppelin.apache.org>
Subject: Re: java.lang.ClassNotFoundException: $anonfun$1

Thanks.  I can reach out to Cloudera, although the same commands seem to work 
via spark-shell (see below), so the issue seems unique to Zeppelin.


Spark context available as 'sc' (master = yarn, app id = application_1472496315722_481416).
Spark session available as 'spark'.

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.0.0.cloudera1
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_60)
Type in expressions to have them evaluated.
Type :help for more information.


scala> val taxonomy = sc.textFile("/user/user1/data/")
taxonomy: org.apache.spark.rdd.RDD[String] = /user/user1/data/ MapPartitionsRDD[1] at textFile at <console>:24

scala> .map(l => l.split("\t"))
res0: org.apache.spark.rdd.RDD[Array[String]] = MapPartitionsRDD[2] at map at <console>:27

scala> taxonomy.first
res1: String = 43 B&B 459Sheets & Pillow 45 Sheets1 Sheets

On Mon, Mar 6, 2017 at 6:48 PM, moon soo Lee <m...@apache.org> wrote:
Hi Rob,

Thanks for sharing the problem.
FYI, https://issues.apache.org/jira/browse/ZEPPELIN-1735 is tracking this problem.

If we can get help from the Cloudera forum, that would be great.

Thanks,
moon

On Tue, Mar 7, 2017 at 10:08 AM Jeff Zhang <zjf...@gmail.com> wrote:

It seems like a CDH-specific issue; you might be better off asking on the Cloudera forum.


Rob Anderson <rockclimbings...@gmail.com> wrote on Tue, Mar 7, 2017 at 9:02 AM:
Hey Everyone,

We're running Zeppelin 0.7.0.  We've just cut over to Spark 2, using Scala 2.11, 
via the CDH parcel (SPARK2-2.0.0.cloudera1-1.cdh5.7.0.p0.113931).
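In case it's relevant, a minimal sketch of how conf/zeppelin-env.sh would point 
Zeppelin at the Spark 2 parcel (the paths below are the CDH parcel defaults, 
not necessarily what we use):

# Sketch of conf/zeppelin-env.sh: launch the Spark interpreter against the
# Spark 2 parcel's classpath instead of the cluster-default Spark 1.5.
export SPARK_HOME=/opt/cloudera/parcels/SPARK2/lib/spark2
export HADOOP_CONF_DIR=/etc/hadoop/conf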

Running a simple job throws a "Caused by: java.lang.ClassNotFoundException: 
$anonfun$1".  It appears that at execution time on the YARN hosts, the native 
CDH Spark 1.5 jars are loaded before the new Spark 2 jars.  I've tried using 
spark.yarn.archive to point at the Spark 2 jars in HDFS, as well as other 
Spark options, none of which seems to make a difference.
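The spark.yarn.archive attempt looked roughly like the following (the HDFS 
path here is illustrative, not our actual one), set in the Zeppelin Spark 
interpreter properties / spark-defaults.conf:

# An archive in HDFS containing the jars from the Spark 2 parcel, so that
# YARN containers localize those jars instead of the cluster default.
spark.yarn.archive hdfs:///user/zeppelin/spark2-jars.zip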


Any suggestions you can offer are appreciated.

Thanks,

Rob

------------------------


%spark
val taxonomy = sc.textFile("/user/user1/data/")
                 .map(l => l.split("\t"))

%spark
taxonomy.first
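
As a sanity check (a sketch, run from the working spark-shell session above 
rather than from Zeppelin, since shipping Zeppelin closures is exactly what 
fails), each executor can report where it loaded the Spark classes from:

// Prints one line per distinct location of the SparkContext class across
// executors; a Spark 1.5 jar path here would confirm the classpath suspicion.
sc.parallelize(1 to 100, 10).map { _ =>
  classOf[org.apache.spark.SparkContext]
    .getProtectionDomain.getCodeSource.getLocation.toString
}.distinct().collect().foreach(println)

The failure from the notebook run is below: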


org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 7, data08.hadoop.prod.ostk.com, executor 2): java.lang.ClassNotFoundException: $anonfun$1
at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:82)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:86)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: $anonfun$1
at java.lang.ClassLoader.findClass(ClassLoader.java:530)
at org.apache.spark.util.ParentClassLoader.findClass(ParentClassLoader.scala:26)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:34)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:30)
at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:77)
... 30 more
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1454)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1442)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1441)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1441)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:811)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1669)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1624)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1613)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1893)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1906)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1919)
at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1318)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
at org.apache.spark.rdd.RDD.take(RDD.scala:1292)
at org.apache.spark.rdd.RDD$$anonfun$first$1.apply(RDD.scala:1332)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
at org.apache.spark.rdd.RDD.first(RDD.scala:1331)
... 37 elided
Caused by: java.lang.ClassNotFoundException: $anonfun$1
at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:82)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:86)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
... 1 more
Caused by: java.lang.ClassNotFoundException: $anonfun$1
at java.lang.ClassLoader.findClass(ClassLoader.java:530)
at org.apache.spark.util.ParentClassLoader.findClass(ParentClassLoader.scala:26)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:34)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:30)
at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:77)
... 30 more
