[ 
https://issues.apache.org/jira/browse/SPARK-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198556#comment-15198556
 ] 

Hiten Patel edited comment on SPARK-5594 at 3/17/16 1:31 AM:
-------------------------------------------------------------

Yes, this was indeed the problem in my case too. 

I had a custom function all in the same Scala object (Spark Streaming job) with 
input (Kafka direct streaming) and output (to Cassandra). 

After moving the function to a seperate object that extends Serializable, it 
did work finally !!. A BIG thanks to Sal Uryasev for reporting the solution. 

It does make sense after realizing the proper solution on how indeed it should 
work in Spark. However, it's really vexing to nail down such issues when you 
hit this the first time. I think this is more of a coding practice and needs to 
be documented in the official documentation as  one of the coding practices. 
It''s very easy to get trapped into such issues for anyone.

Let me know the right place where this needs to go in documentation and I can 
send a PR. 




was (Author: echelon_apache):
Yes, this was indeed the problem in my case too. 

I had a custom function all in the same Scala object (Spark Streaming job) with 
input (Kafka direct streaming) and output (to Cassandra). 

After moving the function to a seperate object that extends Serializable, it 
did work finally !!. A BIG thanks to Sal Uryasev for reporting the solution. 

It does make sense after realizing the proper solution on how indeed it should 
work in Spark. However, it's really vexing to nail down such issues since the 
stacktraces are too generic and could mean lot of things. I think this is more 
of a coding practice and needs to be documented in the official documentation 
as  one of the coding practices. It''s very easy to get trapped into such 
issues for anyone.

Let me know the right place where this needs to go in documentation and I can 
send a PR. 



> SparkException: Failed to get broadcast (TorrentBroadcast)
> ----------------------------------------------------------
>
>                 Key: SPARK-5594
>                 URL: https://issues.apache.org/jira/browse/SPARK-5594
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.2.0, 1.3.0
>            Reporter: John Sandiford
>            Priority: Critical
>
> I am uncertain whether this is a bug, however I am getting the error below 
> when running on a cluster (works locally), and have no idea what is causing 
> it, or where to look for more information.
> Any help is appreciated.  Others appear to experience the same issue, but I 
> have not found any solutions online.
> Please note that this only happens with certain code and is repeatable, all 
> my other spark jobs work fine.
> {noformat}
> ERROR TaskSetManager: Task 3 in stage 6.0 failed 4 times; aborting job
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due 
> to stage failure: Task 3 in stage 6.0 failed 4 times, most recent failure: 
> Lost task 3.3 in stage 6.0 (TID 24, <removed>): java.io.IOException: 
> org.apache.spark.SparkException: Failed to get broadcast_6_piece0 of 
> broadcast_6
>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1011)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
>         at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
>         at org.apache.spark.scheduler.Task.run(Task.scala:56)
>         at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: org.apache.spark.SparkException: Failed to get broadcast_6_piece0 
> of broadcast_6
>         at 
> org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137)
>         at scala.Option.getOrElse(Option.scala:120)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:136)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119)
>         at scala.collection.immutable.List.foreach(List.scala:318)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:119)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:174)
>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1008)
>         ... 11 more
> {noformat}
> Driver stacktrace:
> {noformat}
>         at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214)
>         at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203)
>         at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1202)
>         at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>         at 
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1202)
>         at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
>         at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
>         at scala.Option.foreach(Option.scala:236)
>         at 
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:696)
>         at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1420)
>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>         at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>         at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>         at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>         at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>         at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>         at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>         at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to