Hi,

This is a known issue; see the related JIRA ticket:
https://issues.apache.org/jira/browse/SPARK-4105
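Since the trace fails inside SnappyInputStream while TorrentBroadcast is reading a compressed broadcast block, one experiment that may be worth trying while the ticket is open is to take Snappy out of that path. The fragment below is only a sketch, not a confirmed fix for SPARK-4105. It is written in the same key=value form as the parameter list in the original message, and it assumes that either alternative is acceptable for the job; each setting can also be passed to spark-submit with --conf.

```
# Option A (assumption): keep compression enabled, but switch away from snappy
spark.io.compression.codec=lz4

# Option B (assumption): keep the codec, but stop compressing broadcast variables
# spark.broadcast.compress=false
```

Trying one change at a time makes it easier to tell whether the failure is tied to Snappy itself. Option B is likely to increase the amount of broadcast data sent to the executors.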
// maropu

On Mon, May 30, 2016 at 7:51 PM, Prashant Singh Thakur <prashant.tha...@impetus.co.in> wrote:

> Hi,
>
> We are trying to use Spark Data Frames for our use case where we are getting this exception.
> The parameters used are listed below. Kindly suggest if we are missing something.
> Version used is Spark 1.3.1
> Jira is still showing this issue as Open
> https://issues.apache.org/jira/browse/SPARK-4105
> Kindly suggest if there is workaround.
>
> Exception :
>
> Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 88 in stage 40.0 failed 4 times, most recent failure: Lost task 88.3 in stage 40.0 : java.io.IOException: FAILED_TO_UNCOMPRESS(5)
>         at org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:78)
>         at org.xerial.snappy.SnappyNative.rawUncompress(Native Method)
>         at org.xerial.snappy.Snappy.rawUncompress(Snappy.java:391)
>         at org.xerial.snappy.Snappy.uncompress(Snappy.java:427)
>         at org.xerial.snappy.SnappyInputStream.readFully(SnappyInputStream.java:127)
>         at org.xerial.snappy.SnappyInputStream.readHeader(SnappyInputStream.java:88)
>         at org.xerial.snappy.SnappyInputStream.<init>(SnappyInputStream.java:58)
>         at org.apache.spark.io.SnappyCompressionCodec.compressedInputStream(CompressionCodec.scala:160)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$7.apply(TorrentBroadcast.scala:213)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$7.apply(TorrentBroadcast.scala:213)
>         at scala.Option.map(Option.scala:145)
>         at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:213)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:177)
>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1153)
>         at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
>         at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
>         at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
>         at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
>         at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:61)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:64)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
>
> Parameters Changed :
>
> spark.akka.frameSize=50
> spark.shuffle.memoryFraction=0.4
> spark.storage.memoryFraction=0.5
> spark.worker.timeout=120000
> spark.storage.blockManagerSlaveTimeoutMs=120000
> spark.akka.heartbeat.pauses=6000
> spark.akka.heartbeat.interval=1000
> spark.ui.port=21000
> spark.port.maxRetries=50
> spark.executor.memory=10G
> spark.executor.instances=100
> spark.driver.memory=8G
> spark.executor.cores=2
> spark.shuffle.compress=true
> spark.io.compression.codec=snappy
> spark.broadcast.compress=true
> spark.rdd.compress=true
> spark.worker.cleanup.enabled=true
> spark.worker.cleanup.interval=600
> spark.worker.cleanup.appDataTtl=600
> spark.shuffle.consolidateFiles=true
> spark.yarn.preserve.staging.files=false
> spark.yarn.driver.memoryOverhead=1024
> spark.yarn.executor.memoryOverhead=1024
>
> Best Regards,
> Prashant Singh Thakur

--
---
Takeshi Yamamuro