Re: FAILED_TO_UNCOMPRESS Error - Spark 1.3.1

2016-05-30 Thread Takeshi Yamamuro
Hi,

This is a known issue; see the related JIRA ticket:
https://issues.apache.org/jira/browse/SPARK-4105
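Workarounds discussed on that ticket (I have not re-verified them against 1.3.1, so treat the exact settings as suggestions to test) generally involve moving block compression off snappy, or disabling compression on the affected paths, e.g. in spark-defaults.conf:

```
# Possible workarounds from the SPARK-4105 discussion; try one at a time.
spark.io.compression.codec   lzf     # use lzf (or lz4) instead of snappy
# or disable compression for the paths that hit the error:
spark.broadcast.compress     false
spark.shuffle.compress       false
```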

// maropu

On Mon, May 30, 2016 at 7:51 PM, Prashant Singh Thakur <prashant.tha...@impetus.co.in> wrote:



-- 
---
Takeshi Yamamuro


FAILED_TO_UNCOMPRESS Error - Spark 1.3.1

2016-05-30 Thread Prashant Singh Thakur
Hi,

We are trying to use Spark DataFrames for our use case and are getting
this exception.
The parameters we changed are listed below. Kindly suggest if we are missing
something.
The version used is Spark 1.3.1.
The JIRA is still showing this issue as Open:
https://issues.apache.org/jira/browse/SPARK-4105
Kindly suggest if there is a workaround.

Exception:
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: 
Task 88 in stage 40.0 failed 4 times, most recent failure: Lost task 88.3 in 
stage 40.0 : java.io.IOException: FAILED_TO_UNCOMPRESS(5)
  at 
org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:78)
  at org.xerial.snappy.SnappyNative.rawUncompress(Native Method)
  at org.xerial.snappy.Snappy.rawUncompress(Snappy.java:391)
  at org.xerial.snappy.Snappy.uncompress(Snappy.java:427)
  at 
org.xerial.snappy.SnappyInputStream.readFully(SnappyInputStream.java:127)
  at 
org.xerial.snappy.SnappyInputStream.readHeader(SnappyInputStream.java:88)
  at 
org.xerial.snappy.SnappyInputStream.<init>(SnappyInputStream.java:58)
  at 
org.apache.spark.io.SnappyCompressionCodec.compressedInputStream(CompressionCodec.scala:160)
  at 
org.apache.spark.broadcast.TorrentBroadcast$$anonfun$7.apply(TorrentBroadcast.scala:213)
  at 
org.apache.spark.broadcast.TorrentBroadcast$$anonfun$7.apply(TorrentBroadcast.scala:213)
  at scala.Option.map(Option.scala:145)
  at 
org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:213)
  at 
org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:177)
  at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1153)
  at 
org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
  at 
org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
  at 
org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
  at 
org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
  at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
  at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:61)
  at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
  at org.apache.spark.scheduler.Task.run(Task.scala:64)
  at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)

Parameters changed:
spark.akka.frameSize=50
spark.shuffle.memoryFraction=0.4
spark.storage.memoryFraction=0.5
spark.worker.timeout=12
spark.storage.blockManagerSlaveTimeoutMs=12
spark.akka.heartbeat.pauses=6000
spark.akka.heartbeat.interval=1000
spark.ui.port=21000
spark.port.maxRetries=50
spark.executor.memory=10G
spark.executor.instances=100
spark.driver.memory=8G
spark.executor.cores=2
spark.shuffle.compress=true
spark.io.compression.codec=snappy
spark.broadcast.compress=true
spark.rdd.compress=true
spark.worker.cleanup.enabled=true
spark.worker.cleanup.interval=600
spark.worker.cleanup.appDataTtl=600
spark.shuffle.consolidateFiles=true
spark.yarn.preserve.staging.files=false
spark.yarn.driver.memoryOverhead=1024
spark.yarn.executor.memoryOverhead=1024
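
For context, settings like the above are typically passed at submit time; a hypothetical sketch (the master, jar name, and memory values here are placeholders, not taken from this thread):

```shell
# Illustrative only -- adjust master, resources, and jar to your deployment.
spark-submit \
  --master yarn-cluster \
  --driver-memory 8G \
  --executor-memory 10G \
  --executor-cores 2 \
  --num-executors 100 \
  --conf spark.io.compression.codec=snappy \
  --conf spark.broadcast.compress=true \
  --conf spark.shuffle.compress=true \
  your-app.jar
```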

Best Regards,
Prashant Singh Thakur
Mobile: +91-9740266522









NOTE: This message may contain information that is confidential, proprietary, 
privileged or otherwise protected by law. The message is intended solely for 
the named addressee. If received in error, please destroy and notify the 
sender. Any use of this email is prohibited when received in error. Impetus 
does not represent, warrant and/or guarantee, that the integrity of this 
communication has been maintained nor that the communication is free of errors, 
virus, interception or interference.