Hi Spark devs,
I am running Spark 1.6.0 with dynamic allocation on YARN. I am trying to run a
relatively big application with tens of jobs and 100K+ tasks, and the app
fails with the exception below. Right before the OOM the executor reports
~1.5 GB of execution memory held by the failing task but not attributed to
any consumer, which looks consistent with SPARK-11293
<https://issues.apache.org/jira/browse/SPARK-11293>, the closest JIRA issue
I could find; it is a critical bug that has been open for a long time. There
are other similar JIRA issues (all fixed): SPARK-10474
<https://issues.apache.org/jira/browse/SPARK-10474>, SPARK-10733
<https://issues.apache.org/jira/browse/SPARK-10733>, SPARK-10309
<https://issues.apache.org/jira/browse/SPARK-10309>, SPARK-10379
<https://issues.apache.org/jira/browse/SPARK-10379>.

Are there any workarounds for this issue, or any plans to fix it?
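
For what it's worth, below are the configuration knobs I am planning to try,
based on the discussion in those JIRAs. This is only a sketch: the fraction
and threshold values are guesses for my workload, and
spark.shuffle.spill.numElementsForceSpillThreshold is an undocumented 1.6
setting, so please correct me if any of these are off.

    import org.apache.spark.{SparkConf, SparkContext}

    // Candidate workarounds, to be tried one at a time (Spark 1.6 config names;
    // the commented-out lines are alternatives to option 1, not additions to it).
    val conf = new SparkConf()
      .setAppName("shuffle-oom-workarounds")
      // 1) Fall back to the pre-1.6 static memory manager.
      .set("spark.memory.useLegacyMode", "true")
      .set("spark.shuffle.memoryFraction", "0.4")  // legacy shuffle share; 0.4 is a guess
      // 2) Or stay on unified memory and give execution+storage a bigger slice.
      // .set("spark.memory.fraction", "0.8")      // 1.6 default is 0.75
      // 3) Force the unsafe shuffle sorter to spill early (undocumented; a guess).
      // .set("spark.shuffle.spill.numElementsForceSpillThreshold", "10000000")
      // 4) Fewer concurrent tasks per executor, so each task gets more memory.
      .set("spark.executor.cores", "2")
    val sc = new SparkContext(conf)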

Thanks a lot,
Nezih

16/03/19 05:12:09 INFO memory.TaskMemoryManager: Memory used in task 46870
16/03/19 05:12:09 INFO memory.TaskMemoryManager: Acquired by org.apache.spark.shuffle.sort.ShuffleExternalSorter@1c36f801: 32.0 KB
16/03/19 05:12:09 INFO memory.TaskMemoryManager: 1512915599 bytes of memory were used by task 46870 but are not associated with specific consumers
16/03/19 05:12:09 INFO memory.TaskMemoryManager: 1512948367 bytes of memory are used for execution and 156978343 bytes of memory are used for storage
16/03/19 05:12:09 ERROR executor.Executor: Managed memory leak detected; size = 1512915599 bytes, TID = 46870
16/03/19 05:12:09 ERROR executor.Executor: Exception in task 77.0 in stage 273.0 (TID 46870)
java.lang.OutOfMemoryError: Unable to acquire 128 bytes of memory, got 0
    at org.apache.spark.memory.MemoryConsumer.allocatePage(MemoryConsumer.java:120)
    at org.apache.spark.shuffle.sort.ShuffleExternalSorter.acquireNewPageIfNecessary(ShuffleExternalSorter.java:354)
    at org.apache.spark.shuffle.sort.ShuffleExternalSorter.insertRecord(ShuffleExternalSorter.java:375)
    at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.insertRecordIntoSorter(UnsafeShuffleWriter.java:237)
    at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:164)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/03/19 05:12:09 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-8,5,main]
java.lang.OutOfMemoryError: Unable to acquire 128 bytes of memory, got 0
    at org.apache.spark.memory.MemoryConsumer.allocatePage(MemoryConsumer.java:120)
    at org.apache.spark.shuffle.sort.ShuffleExternalSorter.acquireNewPageIfNecessary(ShuffleExternalSorter.java:354)
    at org.apache.spark.shuffle.sort.ShuffleExternalSorter.insertRecord(ShuffleExternalSorter.java:375)
    at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.insertRecordIntoSorter(UnsafeShuffleWriter.java:237)
    at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:164)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/03/19 05:12:10 INFO storage.DiskBlockManager: Shutdown hook called
16/03/19 05:12:10 INFO util.ShutdownHookManager: Shutdown hook called
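
P.S. As another stopgap I am also trying to shrink per-task shuffle pressure
by splitting the shuffle-heavy stage into more, smaller tasks. A rough sketch
of what I mean (the input path and the partition count are placeholders from
my job, not recommendations):

    // Hypothetical shuffle-heavy stage; more output partitions means each
    // task's shuffle buffers need less execution memory before spilling.
    val counts = sc.textFile("hdfs:///logs/2016/03/*")  // placeholder input
      .map(line => (line.split('\t')(0), 1L))
      .reduceByKey(_ + _, numPartitions = 20000)        // 20000 is a guess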