Hi,

Just an update on my findings, for future reference.

The problem was that after refactoring my code I ended up with a Scala object that held a SparkContext as a member, e.g.:

    object A {
      val sc: SparkContext = new SparkContext()
      def mapFunction {}
    }

When I called rdd.map(A.mapFunction), it failed because A.sc is not serializable.

Thanks,
Daniel

On Tue, Jun 7, 2016 at 10:13 AM, Takeshi Yamamuro <linguin....@gmail.com> wrote:

> Hi,
>
> Since `HttpBroadcastFactory` has already been removed in master, you
> cannot use that broadcast mechanism in future releases.
>
> Anyway, I couldn't find the root cause from the stack traces alone...
>
> // maropu
>
> On Mon, Jun 6, 2016 at 2:14 AM, Daniel Haviv <daniel.ha...@veracity-group.com> wrote:
>
>> Hi,
>> I've set spark.broadcast.factory to
>> org.apache.spark.broadcast.HttpBroadcastFactory and it did indeed resolve
>> my issue.
>>
>> I'm creating a DataFrame, which creates a broadcast variable internally
>> and then fails in the torrent broadcast with the following stack trace:
>>
>> Caused by: org.apache.spark.SparkException: Failed to get broadcast_3_piece0 of broadcast_3
>>   at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:138)
>>   at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:138)
>>   at scala.Option.getOrElse(Option.scala:120)
>>   at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:137)
>>   at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:120)
>>   at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:120)
>>   at scala.collection.immutable.List.foreach(List.scala:318)
>>   at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:120)
>>   at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:175)
>>   at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1220)
>>
>> I'm using Spark 1.6.0 on CDH 5.7.
>>
>> Thanks,
>> Daniel
>>
>> On Wed, Jun 1, 2016 at 5:52 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>>> I found spark.broadcast.blockSize, but no parameter to switch the
>>> broadcast method.
>>>
>>> Can you describe the issues with torrent broadcast in more detail?
>>>
>>> Which version of Spark are you using?
>>>
>>> Thanks
>>>
>>> On Wed, Jun 1, 2016 at 7:48 AM, Daniel Haviv <daniel.ha...@veracity-group.com> wrote:
>>>
>>>> Hi,
>>>> Our application is failing due to issues with the torrent broadcast.
>>>> Is there a way to switch to another broadcast method?
>>>>
>>>> Thank you.
>>>> Daniel
>
> --
> ---
> Takeshi Yamamuro
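
For anyone finding this thread later: one way to avoid the situation described in the top message is to keep the SparkContext in driver-only code and put the functions passed to rdd.map in a separate object that holds no context. The sketch below is only an illustration under assumptions. The names `Transformations` and `Driver`, the doubling body of `mapFunction`, the app name, and the `local[*]` master are all hypothetical and not taken from the original application.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical holder for functions used inside RDD transformations.
// It deliberately has no SparkContext member, so nothing
// context-related is reachable from the closure passed to map.
object Transformations {
  // Placeholder body; the real mapFunction was not shown in the thread.
  def mapFunction(x: Int): Int = x * 2
}

// Hypothetical driver entry point; the SparkContext lives only here.
object Driver {
  def main(args: Array[String]): Unit = {
    // local[*] is assumed so the sketch can run without a cluster.
    val conf = new SparkConf().setAppName("closure-example").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val doubled = sc.parallelize(1 to 5)
        .map(Transformations.mapFunction)
        .collect()
      println(doubled.mkString(", "))
    } finally {
      sc.stop() // always release the context, even on failure
    }
  }
}
```

The design choice here is simply separation: code that runs on executors never refers to an object that also owns the SparkContext.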