Solved. It looks like it was an incompatibility in the build when using -Phadoop-2.4; I made the distribution with -Phadoop-provided instead and that fixed the issue.
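For anyone hitting the same thing, this is roughly the build invocation that worked for me. It is a sketch, not the one true recipe: the --name value is arbitrary, and you should adjust the Hadoop version and profiles to your own cluster.

```shell
# Build a Spark 1.3.1 distribution that does NOT bundle its own Hadoop
# classes, so the cluster's Hadoop 2.6.0 jars are used at runtime instead.
# Run from the root of the Spark source tree.
./make-distribution.sh --name hadoop-provided --tgz \
  -Pyarn -Phadoop-provided -Dhadoop.version=2.6.0
```

With -Phadoop-provided the resulting assembly expects the Hadoop jars to already be on the classpath of every node, which is what you want when the cluster ships its own Hadoop.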
On Tue, Apr 21, 2015 at 2:03 PM, Fernando O. <fot...@gmail.com> wrote:
> Hi all,
>
> I'm wondering whether SparkPi works with Hadoop HA (I assume it should).
>
> Hadoop's pi example works great on my cluster, so after getting that done I
> installed Spark, and in the worker log I'm seeing two problems that might be
> related.
>
> Versions: Hadoop 2.6.0, Spark 1.3.1
>
> I'm running:
>
> ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
>   --master yarn-cluster --num-executors 2 --driver-memory 2g \
>   --executor-memory 1g --executor-cores 1 lib/spark-examples*.jar 10
>
> It seems like it can't see the ResourceManager, but I can ping and telnet
> to it from the node, and Hadoop's own pi example works, so I'm not sure why
> Spark is not seeing the RM:
>
> 15/04/21 15:56:21 INFO yarn.YarnRMClient: Registering the ApplicationMaster
> 15/04/21 15:56:22 INFO ipc.Client: Retrying connect to server:
>   0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is
>   RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> ...
> 15/04/21 15:56:31 INFO ipc.Client: Retrying connect to server:
>   0.0.0.0/0.0.0.0:8030. Already tried 9 time(s); retry policy is
>   RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 15/04/21 15:56:51 INFO cluster.YarnClusterSchedulerBackend: SchedulerBackend
>   is ready for scheduling beginning after waiting
>   maxRegisteredResourcesWaitingTime: 30000(ms)
> 15/04/21 15:56:51 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
> 15/04/21 15:56:51 INFO spark.SparkContext: Starting job: reduce at SparkPi.scala:35
> 15/04/21 15:56:51 INFO scheduler.DAGScheduler: Got job 0 (reduce at
>   SparkPi.scala:35) with 10 output partitions (allowLocal=false)
> 15/04/21 15:56:51 INFO scheduler.DAGScheduler: Final stage: Stage 0(reduce at SparkPi.scala:35)
> 15/04/21 15:56:51 INFO scheduler.DAGScheduler: Parents of final stage: List()
> 15/04/21 15:56:51 INFO scheduler.DAGScheduler: Missing parents: List()
> 15/04/21 15:56:51 INFO scheduler.DAGScheduler: Submitting Stage 0
>   (MapPartitionsRDD[1] at map at SparkPi.scala:31), which has no missing parents
> 15/04/21 15:56:51 INFO cluster.YarnClusterScheduler: Cancelling stage 0
> 15/04/21 15:56:51 INFO scheduler.DAGScheduler: Stage 0 (reduce at SparkPi.scala:35) failed in Unknown s
> 15/04/21 15:56:51 INFO scheduler.DAGScheduler: Job 0 failed: reduce at SparkPi.scala:35, took 0.121974 s
> 15/04/21 15:56:51 ERROR yarn.ApplicationMaster: User class threw exception:
>   Job aborted due to stage failure: Task serialization failed:
>   java.lang.reflect.InvocationTargetException
>     sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:68)
>     org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:60)
>     org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$setConf(TorrentBroadcast.scala:73)
>     org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:79)
>     org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
>     org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:29)
>     org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
>     org.apache.spark.SparkContext.broadcast(SparkContext.scala:1051)
>     org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:839)
>     org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:778)
>     org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:762)
>     org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1362)
>     org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
>     org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>
> org.apache.spark.SparkException: Job aborted due to stage failure: Task
>   serialization failed: java.lang.reflect.InvocationTargetException
>     ... same frames as above ...
>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1204)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1193)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>     at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1192)
>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:847)
>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:778)
>     at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:762)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1362)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
>     at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>
> 15/04/21 15:56:51 INFO yarn.ApplicationMaster: Final app status: FAILED,
>   exitCode: 15, (reason: User class threw exception: Job aborted due to stage
>   failure: Task serialization failed: java.lang.reflect.InvocationTargetException
>   ... same stack trace as above ...)
> 15/04/21 15:57:02 INFO ipc.Client: Retrying connect to server:
>   0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is
>   RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 15/04/21 15:57:03 INFO ipc.Client: Retrying connect to server:
>   0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is
>   RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
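A side note on the 0.0.0.0/0.0.0.0:8030 retries in the quoted log: 8030 is the default port of yarn.resourcemanager.scheduler.address, and connecting to 0.0.0.0 usually means the ApplicationMaster fell back to the default because it could not resolve the scheduler address from the yarn-site.xml visible to its container. With RM HA the per-RM addresses are keyed by rm-id, so the node running the AM needs entries along these lines in its yarn-site.xml (rm1/rm2 and the hostnames below are placeholders for your actual rm-ids and machines):

```xml
<!-- Sketch of the HA-related entries the AM host must be able to read. -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>rm1.example.com</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>rm2.example.com</value>
</property>
```

When these keys are present, the scheduler address for each RM is derived from its hostname; retries against 0.0.0.0 are the typical symptom when they are missing from the config the AM actually sees.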