Spark 2.0 has been released. Mind giving it a try? :-)
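If you need to stay on 1.6 for now, two workarounds are commonly suggested
for this kind of shuffle memory starvation: give each executor fewer
concurrent tasks, so fewer tasks compete for the same memory pool, and use
more shuffle partitions, so each task has to acquire less memory at once.
A minimal sketch, assuming a plain RDD job; the app name, input path, core
count and partition count below are illustrative, not tuned for your
workload:

    import org.apache.spark.{SparkConf, SparkContext}

    // Fewer concurrent tasks per executor -> less competition for the
    // executor's memory pool (the value here is illustrative).
    val conf = new SparkConf()
      .setAppName("shuffle-oom-mitigation")
      .set("spark.executor.cores", "2")

    val sc = new SparkContext(conf)

    // More shuffle partitions -> smaller per-task shuffle state, so the
    // ShuffleExternalSorter has to acquire fewer memory pages at a time.
    val counts = sc.textFile("hdfs:///input/big-data")  // hypothetical path
      .map(line => (line.split("\t")(0), 1L))
      .reduceByKey(_ + _, 2000)                         // illustrative count

Neither of these removes the underlying SPARK-11293 starvation, so they may
only shrink the failure window; upgrading is still the real fix.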
On Wed, Aug 3, 2016 at 9:11 AM, Rychnovsky, Dusan
<dusan.rychnov...@firma.seznam.cz> wrote:

> OK, thank you. What do you suggest I do to get rid of the error?
>
> ------------------------------
> *From:* Ted Yu <yuzhih...@gmail.com>
> *Sent:* Wednesday, August 3, 2016 6:10 PM
> *To:* Rychnovsky, Dusan
> *Cc:* user@spark.apache.org
> *Subject:* Re: Managed memory leak detected + OutOfMemoryError: Unable to
> acquire X bytes of memory, got 0
>
> The latest QA run is no longer accessible (error 404):
>
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59141/consoleFull
>
> Looking at the comments on the PR, there is not enough confidence to pull
> the fix into 1.6.
>
> On Wed, Aug 3, 2016 at 9:05 AM, Rychnovsky, Dusan
> <dusan.rychnov...@firma.seznam.cz> wrote:
>
>> I am confused.
>>
>> I tried to find a Spark release that has this issue fixed, i.e. one with
>> https://github.com/apache/spark/pull/13027/ merged in, but it looks like
>> the patch has not been merged for 1.6.
>>
>> How do I get a fixed 1.6 version?
>>
>> Thanks,
>> Dusan
>>
>> [SPARK-4452][SPARK-11293][Core][BRANCH-1.6] Shuffle data structures can
>> starve others on the same thread for memory by lianhuiwang · Pull Request
>> #13027 · apache/spark · GitHub
>> <https://github.com/apache/spark/pull/13027/>
>>
>> ------------------------------
>> *From:* Rychnovsky, Dusan
>> *Sent:* Wednesday, August 3, 2016 3:58 PM
>> *To:* Ted Yu
>> *Cc:* user@spark.apache.org
>> *Subject:* Re: Managed memory leak detected + OutOfMemoryError: Unable
>> to acquire X bytes of memory, got 0
>>
>> Yes, I believe I'm using Spark 1.6.0.
>>
>> > spark-submit --version
>> Welcome to
>>       ____              __
>>      / __/__  ___ _____/ /__
>>     _\ \/ _ \/ _ `/ __/ '_/
>>    /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
>>       /_/
>>
>> I don't understand the ticket. It says "Fixed in 1.6.0". I have 1.6.0 and
>> therefore should have the fix, right? Or what do I do to fix it?
>>
>> Thanks,
>> Dusan
>>
>> ------------------------------
>> *From:* Ted Yu <yuzhih...@gmail.com>
>> *Sent:* Wednesday, August 3, 2016 3:52 PM
>> *To:* Rychnovsky, Dusan
>> *Cc:* user@spark.apache.org
>> *Subject:* Re: Managed memory leak detected + OutOfMemoryError: Unable
>> to acquire X bytes of memory, got 0
>>
>> Are you using Spark 1.6+?
>>
>> See SPARK-11293.
>>
>> On Wed, Aug 3, 2016 at 5:03 AM, Rychnovsky, Dusan
>> <dusan.rychnov...@firma.seznam.cz> wrote:
>>
>>> Hi,
>>>
>>> I have a Spark workflow that works fine when run on a relatively small
>>> portion of the data, but fails with strange errors when run on big data.
>>> In the log files of the failed executors I found the following errors.
>>>
>>> First:
>>>
>>> > Managed memory leak detected; size = 263403077 bytes, TID = 6524
>>>
>>> And then a series of:
>>>
>>> > java.lang.OutOfMemoryError: Unable to acquire 241 bytes of memory, got 0
>>> >   at org.apache.spark.memory.MemoryConsumer.allocatePage(MemoryConsumer.java:120)
>>> >   at org.apache.spark.shuffle.sort.ShuffleExternalSorter.acquireNewPageIfNecessary(ShuffleExternalSorter.java:346)
>>> >   at org.apache.spark.shuffle.sort.ShuffleExternalSorter.insertRecord(ShuffleExternalSorter.java:367)
>>> >   at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.insertRecordIntoSorter(UnsafeShuffleWriter.java:237)
>>> >   at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:164)
>>> >   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>>> >   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>>> >   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>> >   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>>> >   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> >   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> >   at java.lang.Thread.run(Thread.java:745)
>>>
>>> The job keeps failing in the same way (I tried a few times).
>>>
>>> What could be causing such an error?
>>>
>>> I have a feeling that I'm not providing enough context to understand the
>>> issue. Please ask for any other information needed.
>>>
>>> Thank you,
>>> Dusan