[
https://issues.apache.org/jira/browse/SPARK-11195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14969271#comment-14969271
]
Hao Ren commented on SPARK-11195:
---------------------------------
Thank you for the quick reply.
In fact, we do ship the Kafka classes with our app via "--jars
/opt/spark/lib/kafka_2.10-0.8.2.2.jar,/opt/spark/lib/kafka-clients-0.8.2.2.jar".
It seems the Kafka exception I mentioned was created on the slaves; however,
when it was sent back to the driver, the driver could not deserialize the
exception object because the Kafka dependencies are not on the driver's
classpath.
Normally, "--jars" should add the listed local jars to both the driver and
executor classpaths.
But it doesn't, which is why we think there may be a bug here.
The workaround is simply to put the Kafka dependencies on the driver's
classpath by adding "--conf
spark.driver.extraClassPath=/opt/spark/lib/kafka_2.10-0.8.2.2.jar:/opt/spark/lib/kafka-clients-0.8.2.2.jar".
And it works.
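For reference, here is a sketch of the full submit command we use; the master
URL, main class, and application jar are placeholders, not from our actual
setup:
{code}
# Ship the Kafka jars to executors with --jars, and also put them on the
# driver's classpath so it can deserialize Kafka exceptions sent back by tasks.
./bin/spark-submit \
  --master yarn \
  --deploy-mode client \
  --jars /opt/spark/lib/kafka_2.10-0.8.2.2.jar,/opt/spark/lib/kafka-clients-0.8.2.2.jar \
  --conf spark.driver.extraClassPath=/opt/spark/lib/kafka_2.10-0.8.2.2.jar:/opt/spark/lib/kafka-clients-0.8.2.2.jar \
  --class com.example.MyKafkaApp \
  /path/to/my-app-assembly.jar
{code}
With spark.driver.extraClassPath set this way, the driver can load the Kafka
exception classes when it deserializes the task failure reason.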
(Maybe I should create another issue on this)
> Exception thrown on executor throws ClassNotFound on driver
> -----------------------------------------------------------
>
> Key: SPARK-11195
> URL: https://issues.apache.org/jira/browse/SPARK-11195
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.5.1
> Reporter: Hurshal Patel
>
> I have a minimal repro job
> {code:title=Repro.scala}
> package repro
>
> import org.apache.spark.SparkContext
> import org.apache.spark.SparkConf
> import org.apache.spark.SparkException
>
> class MyException(message: String) extends Exception(message)
>
> object Repro {
>   def main(args: Array[String]) {
>     val conf = new SparkConf().setAppName("MyException ClassNotFound Repro")
>     val sc = new SparkContext(conf)
>     sc.parallelize(List(1)).map { x =>
>       throw new repro.MyException("this is a failure")
>       true
>     }.collect()
>   }
> }
> {code}
> On Spark 1.4.1, I get a task failure with the reason correctly set to
> MyException.
> On Spark 1.5.1, I _expect_ the same behavior, but instead the task failure is
> reported as UnknownReason because the driver hits a ClassNotFoundException
> while deserializing the exception.
>
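> To make the difference concrete, here is a rough sketch (not part of the
> original job) of how I inspect the driver-side failure, e.g. from a
> spark-shell started with the repro jar on its classpath via --jars:
> {code:title=CheckFailure.scala}
> import scala.util.{Failure, Try}
> import org.apache.spark.SparkException
>
> // Run the failing job and look at the reason the driver reports.
> Try {
>   sc.parallelize(List(1)).map { x =>
>     throw new repro.MyException("this is a failure"); true
>   }.collect()
> } match {
>   case Failure(e: SparkException) =>
>     // On 1.4.1 the message names repro.MyException; on 1.5.1 it only says
>     // "UnknownReason" because the exception could not be deserialized.
>     println(e.getMessage)
>   case other =>
>     println(s"unexpected result: $other")
> }
> {code}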
> Here is the job on vanilla Spark 1.4.1:
> {code:title=spark_1.4.1_log}
> $ ./bin/spark-submit --master local --deploy-mode client --class repro.Repro
> /home/nix/repro/target/scala-2.10/repro-assembly-0.0.1.jar
> Using Spark's default log4j profile:
> org/apache/spark/log4j-defaults.properties
> 15/10/19 11:55:20 INFO SparkContext: Running Spark version 1.4.1
> 15/10/19 11:55:21 WARN NativeCodeLoader: Unable to load native-hadoop library
> for your platform... using builtin-java classes where applicable
> 15/10/19 11:55:22 WARN Utils: Your hostname, choochootrain resolves to a
> loopback address: 127.0.1.1; using 10.0.1.97 instead (on interface wlan0)
> 15/10/19 11:55:22 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to
> another address
> 15/10/19 11:55:22 INFO SecurityManager: Changing view acls to: root
> 15/10/19 11:55:22 INFO SecurityManager: Changing modify acls to: root
> 15/10/19 11:55:22 INFO SecurityManager: SecurityManager: authentication
> disabled; ui acls disabled; users with view permissions: Set(root); users
> with modify permissions: Set(root)
> 15/10/19 11:55:24 INFO Slf4jLogger: Slf4jLogger started
> 15/10/19 11:55:24 INFO Remoting: Starting remoting
> 15/10/19 11:55:24 INFO Remoting: Remoting started; listening on addresses
> :[akka.tcp://[email protected]:46683]
> 15/10/19 11:55:24 INFO Utils: Successfully started service 'sparkDriver' on
> port 46683.
> 15/10/19 11:55:24 INFO SparkEnv: Registering MapOutputTracker
> 15/10/19 11:55:24 INFO SparkEnv: Registering BlockManagerMaster
> 15/10/19 11:55:24 INFO DiskBlockManager: Created local directory at
> /tmp/spark-0348a320-0ca3-4528-9ab5-9ba37d3c2e07/blockmgr-08496143-1d9d-41c8-a581-b6220edf00d5
> 15/10/19 11:55:24 INFO MemoryStore: MemoryStore started with capacity 265.4 MB
> 15/10/19 11:55:25 INFO HttpFileServer: HTTP File server directory is
> /tmp/spark-0348a320-0ca3-4528-9ab5-9ba37d3c2e07/httpd-52c396d2-b47f-45a5-bb76-d10aa864e6d5
> 15/10/19 11:55:25 INFO HttpServer: Starting HTTP Server
> 15/10/19 11:55:25 INFO Utils: Successfully started service 'HTTP file server'
> on port 47915.
> 15/10/19 11:55:25 INFO SparkEnv: Registering OutputCommitCoordinator
> 15/10/19 11:55:25 INFO Utils: Successfully started service 'SparkUI' on port
> 4040.
> 15/10/19 11:55:25 INFO SparkUI: Started SparkUI at http://10.0.1.97:4040
> 15/10/19 11:55:25 INFO SparkContext: Added JAR
> file:/home/nix/repro/target/scala-2.10/repro-assembly-0.0.1.jar at
> http://10.0.1.97:47915/jars/repro-assembly-0.0.1.jar with timestamp
> 1445280925969
> 15/10/19 11:55:26 INFO Executor: Starting executor ID driver on host localhost
> 15/10/19 11:55:26 INFO Utils: Successfully started service
> 'org.apache.spark.network.netty.NettyBlockTransferService' on port 46569.
> 15/10/19 11:55:26 INFO NettyBlockTransferService: Server created on 46569
> 15/10/19 11:55:26 INFO BlockManagerMaster: Trying to register BlockManager
> 15/10/19 11:55:26 INFO BlockManagerMasterEndpoint: Registering block manager
> localhost:46569 with 265.4 MB RAM, BlockManagerId(driver, localhost, 46569)
> 15/10/19 11:55:26 INFO BlockManagerMaster: Registered BlockManager
> 15/10/19 11:55:27 INFO SparkContext: Starting job: collect at repro.scala:18
> 15/10/19 11:55:27 INFO DAGScheduler: Got job 0 (collect at repro.scala:18)
> with 1 output partitions (allowLocal=false)
> 15/10/19 11:55:27 INFO DAGScheduler: Final stage: ResultStage 0(collect at
> repro.scala:18)
> 15/10/19 11:55:27 INFO DAGScheduler: Parents of final stage: List()
> 15/10/19 11:55:27 INFO DAGScheduler: Missing parents: List()
> 15/10/19 11:55:27 INFO DAGScheduler: Submitting ResultStage 0
> (MapPartitionsRDD[1] at map at repro.scala:15), which has no missing parents
> 15/10/19 11:55:28 INFO MemoryStore: ensureFreeSpace(1984) called with
> curMem=0, maxMem=278302556
> 15/10/19 11:55:28 INFO MemoryStore: Block broadcast_0 stored as values in
> memory (estimated size 1984.0 B, free 265.4 MB)
> 15/10/19 11:55:28 INFO MemoryStore: ensureFreeSpace(1248) called with
> curMem=1984, maxMem=278302556
> 15/10/19 11:55:28 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes
> in memory (estimated size 1248.0 B, free 265.4 MB)
> 15/10/19 11:55:28 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory
> on localhost:46569 (size: 1248.0 B, free: 265.4 MB)
> 15/10/19 11:55:28 INFO SparkContext: Created broadcast 0 from broadcast at
> DAGScheduler.scala:874
> 15/10/19 11:55:28 INFO DAGScheduler: Submitting 1 missing tasks from
> ResultStage 0 (MapPartitionsRDD[1] at map at repro.scala:15)
> 15/10/19 11:55:28 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
> 15/10/19 11:55:28 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0,
> localhost, PROCESS_LOCAL, 1375 bytes)
> 15/10/19 11:55:28 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
> 15/10/19 11:55:28 INFO Executor: Fetching
> http://10.0.1.97:47915/jars/repro-assembly-0.0.1.jar with timestamp
> 1445280925969
> 15/10/19 11:55:28 INFO Utils: Fetching
> http://10.0.1.97:47915/jars/repro-assembly-0.0.1.jar to
> /tmp/spark-0348a320-0ca3-4528-9ab5-9ba37d3c2e07/userFiles-08fd567b-f708-49c8-a41b-83994436ef4f/fetchFileTemp4791648304973175221.tmp
> 15/10/19 11:55:28 INFO Executor: Adding
> file:/tmp/spark-0348a320-0ca3-4528-9ab5-9ba37d3c2e07/userFiles-08fd567b-f708-49c8-a41b-83994436ef4f/repro-assembly-0.0.1.jar
> to class loader
> 15/10/19 11:55:28 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
> repro.MyException: this is a failure
> at repro.Repro$$anonfun$main$1.apply$mcZI$sp(repro.scala:16)
> at repro.Repro$$anonfun$main$1.apply(repro.scala:15)
> at repro.Repro$$anonfun$main$1.apply(repro.scala:15)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at
> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
> at
> scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
> at
> org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:885)
> at
> org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:885)
> at
> org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
> at
> org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
> at org.apache.spark.scheduler.Task.run(Task.scala:70)
> at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 15/10/19 11:55:28 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,
> localhost): repro.MyException: this is a failure
> at repro.Repro$$anonfun$main$1.apply$mcZI$sp(repro.scala:16)
> at repro.Repro$$anonfun$main$1.apply(repro.scala:15)
> at repro.Repro$$anonfun$main$1.apply(repro.scala:15)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at
> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
> at
> scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
> at
> org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:885)
> at
> org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:885)
> at
> org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
> at
> org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
> at org.apache.spark.scheduler.Task.run(Task.scala:70)
> at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
> 15/10/19 11:55:28 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times;
> aborting job
> 15/10/19 11:55:28 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks
> have all completed, from pool
> 15/10/19 11:55:28 INFO TaskSchedulerImpl: Cancelling stage 0
> 15/10/19 11:55:28 INFO DAGScheduler: ResultStage 0 (collect at
> repro.scala:18) failed in 0.542 s
> 15/10/19 11:55:28 INFO DAGScheduler: Job 0 failed: collect at repro.scala:18,
> took 0.972468 s
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due
> to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure:
> Lost task 0.0 in stage 0.0 (TID 0, localhost): repro.MyException: this is a
> failure
> at repro.Repro$$anonfun$main$1.apply$mcZI$sp(repro.scala:16)
> at repro.Repro$$anonfun$main$1.apply(repro.scala:15)
> at repro.Repro$$anonfun$main$1.apply(repro.scala:15)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at
> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
> at
> scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
> at
> org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:885)
> at
> org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:885)
> at
> org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
> at
> org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
> at org.apache.spark.scheduler.Task.run(Task.scala:70)
> at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
> Driver stacktrace:
> at
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1273)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1264)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1263)
> at
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1263)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
> at scala.Option.foreach(Option.scala:236)
> at
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
> at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1457)
> at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1418)
> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> 15/10/19 11:55:28 INFO SparkContext: Invoking stop() from shutdown hook
> 15/10/19 11:55:28 INFO SparkUI: Stopped Spark web UI at http://10.0.1.97:4040
> 15/10/19 11:55:28 INFO DAGScheduler: Stopping DAGScheduler
> 15/10/19 11:55:28 INFO MapOutputTrackerMasterEndpoint:
> MapOutputTrackerMasterEndpoint stopped!
> 15/10/19 11:55:29 INFO Utils: path =
> /tmp/spark-0348a320-0ca3-4528-9ab5-9ba37d3c2e07/blockmgr-08496143-1d9d-41c8-a581-b6220edf00d5,
> already present as root for deletion.
> 15/10/19 11:55:29 INFO MemoryStore: MemoryStore cleared
> 15/10/19 11:55:29 INFO BlockManager: BlockManager stopped
> 15/10/19 11:55:29 INFO BlockManagerMaster: BlockManagerMaster stopped
> 15/10/19 11:55:29 INFO
> OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:
> OutputCommitCoordinator stopped!
> 15/10/19 11:55:29 INFO SparkContext: Successfully stopped SparkContext
> 15/10/19 11:55:29 INFO Utils: Shutdown hook called
> 15/10/19 11:55:29 INFO Utils: Deleting directory
> /tmp/spark-0348a320-0ca3-4528-9ab5-9ba37d3c2e07
> {code}
> And here is the job on vanilla Spark 1.5.1:
> {code:title=spark_1.5.1_log}
> $ ./bin/spark-submit --master local --deploy-mode client --class repro.Repro
> /home/nix/repro/target/scala-2.10/repro-assembly-0.0.1.jar
> Using Spark's default log4j profile:
> org/apache/spark/log4j-defaults.properties
> 15/10/19 11:53:30 INFO SparkContext: Running Spark version 1.5.1
> 15/10/19 11:53:31 WARN NativeCodeLoader: Unable to load native-hadoop library
> for your platform... using builtin-java classes where applicable
> 15/10/19 11:53:32 WARN Utils: Your hostname, choochootrain resolves to a
> loopback address: 127.0.1.1; using 10.0.1.97 instead (on interface wlan0)
> 15/10/19 11:53:32 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to
> another address
> 15/10/19 11:53:32 INFO SecurityManager: Changing view acls to: root
> 15/10/19 11:53:32 INFO SecurityManager: Changing modify acls to: root
> 15/10/19 11:53:32 INFO SecurityManager: SecurityManager: authentication
> disabled; ui acls disabled; users with view permissions: Set(root); users
> with modify permissions: Set(root)
> 15/10/19 11:53:34 INFO Slf4jLogger: Slf4jLogger started
> 15/10/19 11:53:34 INFO Remoting: Starting remoting
> 15/10/19 11:53:34 INFO Remoting: Remoting started; listening on addresses
> :[akka.tcp://[email protected]:47096]
> 15/10/19 11:53:34 INFO Utils: Successfully started service 'sparkDriver' on
> port 47096.
> 15/10/19 11:53:34 INFO SparkEnv: Registering MapOutputTracker
> 15/10/19 11:53:34 INFO SparkEnv: Registering BlockManagerMaster
> 15/10/19 11:53:34 INFO DiskBlockManager: Created local directory at
> /tmp/blockmgr-be9b3111-6640-4bc8-bbd0-054cccfa474f
> 15/10/19 11:53:34 INFO MemoryStore: MemoryStore started with capacity 530.3 MB
> 15/10/19 11:53:35 INFO HttpFileServer: HTTP File server directory is
> /tmp/spark-e2aeb6af-3b15-4d36-8f1c-abd1ae494be2/httpd-e1fcfd42-4521-4cee-96d2-eac83d0a89ea
> 15/10/19 11:53:35 INFO HttpServer: Starting HTTP Server
> 15/10/19 11:53:35 INFO Utils: Successfully started service 'HTTP file server'
> on port 59017.
> 15/10/19 11:53:35 INFO SparkEnv: Registering OutputCommitCoordinator
> 15/10/19 11:53:35 INFO Utils: Successfully started service 'SparkUI' on port
> 4040.
> 15/10/19 11:53:35 INFO SparkUI: Started SparkUI at http://10.0.1.97:4040
> 15/10/19 11:53:35 INFO SparkContext: Added JAR
> file:/home/nix/repro/target/scala-2.10/repro-assembly-0.0.1.jar at
> http://10.0.1.97:59017/jars/repro-assembly-0.0.1.jar with timestamp
> 1445280815913
> 15/10/19 11:53:36 WARN MetricsSystem: Using default name DAGScheduler for
> source because spark.app.id is not set.
> 15/10/19 11:53:36 INFO Executor: Starting executor ID driver on host localhost
> 15/10/19 11:53:36 INFO Utils: Successfully started service
> 'org.apache.spark.network.netty.NettyBlockTransferService' on port 40701.
> 15/10/19 11:53:36 INFO NettyBlockTransferService: Server created on 40701
> 15/10/19 11:53:36 INFO BlockManagerMaster: Trying to register BlockManager
> 15/10/19 11:53:36 INFO BlockManagerMasterEndpoint: Registering block manager
> localhost:40701 with 530.3 MB RAM, BlockManagerId(driver, localhost, 40701)
> 15/10/19 11:53:36 INFO BlockManagerMaster: Registered BlockManager
> 15/10/19 11:53:38 INFO SparkContext: Starting job: collect at repro.scala:18
> 15/10/19 11:53:38 INFO DAGScheduler: Got job 0 (collect at repro.scala:18)
> with 1 output partitions
> 15/10/19 11:53:38 INFO DAGScheduler: Final stage: ResultStage 0(collect at
> repro.scala:18)
> 15/10/19 11:53:38 INFO DAGScheduler: Parents of final stage: List()
> 15/10/19 11:53:38 INFO DAGScheduler: Missing parents: List()
> 15/10/19 11:53:38 INFO DAGScheduler: Submitting ResultStage 0
> (MapPartitionsRDD[1] at map at repro.scala:15), which has no missing parents
> 15/10/19 11:53:38 INFO MemoryStore: ensureFreeSpace(1984) called with
> curMem=0, maxMem=556038881
> 15/10/19 11:53:38 INFO MemoryStore: Block broadcast_0 stored as values in
> memory (estimated size 1984.0 B, free 530.3 MB)
> 15/10/19 11:53:38 INFO MemoryStore: ensureFreeSpace(1248) called with
> curMem=1984, maxMem=556038881
> 15/10/19 11:53:38 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes
> in memory (estimated size 1248.0 B, free 530.3 MB)
> 15/10/19 11:53:38 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory
> on localhost:40701 (size: 1248.0 B, free: 530.3 MB)
> 15/10/19 11:53:38 INFO SparkContext: Created broadcast 0 from broadcast at
> DAGScheduler.scala:861
> 15/10/19 11:53:38 INFO DAGScheduler: Submitting 1 missing tasks from
> ResultStage 0 (MapPartitionsRDD[1] at map at repro.scala:15)
> 15/10/19 11:53:38 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
> 15/10/19 11:53:38 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0,
> localhost, PROCESS_LOCAL, 2091 bytes)
> 15/10/19 11:53:38 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
> 15/10/19 11:53:38 INFO Executor: Fetching
> http://10.0.1.97:59017/jars/repro-assembly-0.0.1.jar with timestamp
> 1445280815913
> 15/10/19 11:53:38 INFO Utils: Fetching
> http://10.0.1.97:59017/jars/repro-assembly-0.0.1.jar to
> /tmp/spark-e2aeb6af-3b15-4d36-8f1c-abd1ae494be2/userFiles-94c6cfc6-59a3-4a6f-ab06-b8b41956c9c0/fetchFileTemp5258469143029308872.tmp
> 15/10/19 11:53:38 INFO Executor: Adding
> file:/tmp/spark-e2aeb6af-3b15-4d36-8f1c-abd1ae494be2/userFiles-94c6cfc6-59a3-4a6f-ab06-b8b41956c9c0/repro-assembly-0.0.1.jar
> to class loader
> 15/10/19 11:53:39 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
> repro.MyException: this is a failure
> at repro.Repro$$anonfun$main$1.apply$mcZI$sp(repro.scala:16)
> at repro.Repro$$anonfun$main$1.apply(repro.scala:15)
> at repro.Repro$$anonfun$main$1.apply(repro.scala:15)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at
> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
> at
> scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
> at
> org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:905)
> at
> org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:905)
> at
> org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1848)
> at
> org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1848)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 15/10/19 11:53:39 WARN ThrowableSerializationWrapper: Task exception could
> not be deserialized
> java.lang.ClassNotFoundException: repro.MyException
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:274)
> at
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
> at
> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
> at
> java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
> at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
> at
> org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:167)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
> at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1897)
> at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
> at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
> at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
> at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
> at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
> at
> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
> at
> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
> at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108)
> at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
> at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
> at
> org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
> at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 15/10/19 11:53:39 ERROR TaskResultGetter: Could not deserialize
> TaskEndReason: ClassNotFound with classloader
> org.apache.spark.util.MutableURLClassLoader@7f08a6b1
> 15/10/19 11:53:39 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,
> localhost): UnknownReason
> 15/10/19 11:53:39 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times;
> aborting job
> 15/10/19 11:53:39 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks
> have all completed, from pool
> 15/10/19 11:53:39 INFO TaskSchedulerImpl: Cancelling stage 0
> 15/10/19 11:53:39 INFO DAGScheduler: ResultStage 0 (collect at
> repro.scala:18) failed in 0.567 s
> 15/10/19 11:53:39 INFO DAGScheduler: Job 0 failed: collect at repro.scala:18,
> took 1.049437 s
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due
> to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure:
> Lost task 0.0 in stage 0.0 (TID 0, localhost): UnknownReason
> Driver stacktrace:
> at
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
> at
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
> at scala.Option.foreach(Option.scala:236)
> at
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
> at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496)
> at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
> at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> at
> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1822)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1835)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1848)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1919)
> at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:905)
> at
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
> at
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
> at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
> at org.apache.spark.rdd.RDD.collect(RDD.scala:904)
> at repro.Repro$.main(repro.scala:18)
> at repro.Repro.main(repro.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
> at
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 15/10/19 11:53:39 INFO SparkContext: Invoking stop() from shutdown hook
> 15/10/19 11:53:39 INFO SparkUI: Stopped Spark web UI at http://10.0.1.97:4040
> 15/10/19 11:53:39 INFO DAGScheduler: Stopping DAGScheduler
> 15/10/19 11:53:39 INFO MapOutputTrackerMasterEndpoint:
> MapOutputTrackerMasterEndpoint stopped!
> 15/10/19 11:53:39 INFO MemoryStore: MemoryStore cleared
> 15/10/19 11:53:39 INFO BlockManager: BlockManager stopped
> 15/10/19 11:53:39 INFO BlockManagerMaster: BlockManagerMaster stopped
> 15/10/19 11:53:39 INFO
> OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:
> OutputCommitCoordinator stopped!
> 15/10/19 11:53:39 INFO SparkContext: Successfully stopped SparkContext
> 15/10/19 11:53:39 INFO ShutdownHookManager: Shutdown hook called
> 15/10/19 11:53:39 INFO ShutdownHookManager: Deleting directory
> /tmp/spark-e2aeb6af-3b15-4d36-8f1c-abd1ae494be2
> {code}
> See
> http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201510.mbox/browser
> for the full context.