cxzl25 commented on a change in pull request #22062:
URL: https://github.com/apache/spark/pull/22062#discussion_r525807522



##########
File path: 
core/src/main/java/org/apache/spark/shuffle/sort/ShuffleInMemorySorter.java
##########
@@ -94,12 +94,20 @@ public int numRecords() {
   }
 
   public void reset() {
+    // Reset `pos` here so that `spill` triggered by the below `allocateArray` 
will be no-op.
+    pos = 0;

Review comment:
       The process of `inMemSorter.reset()` may throw oom in 
`consumer.allocateArray()`, at this time array=null.
   `ShuffleExternalSorter#cleanupResources` called 
`inMemSorter.getMemoryUsage()`, throwing npe.
   `UnsafeExternalSorter#inMemSorter` also has this problem.
   
   Do I need to submit a pr to solve this problem? @zsxwing 
   
   executor log
   ```
   20/11/17 19:48:00,675 [Executor task launch worker for task 1721] ERROR 
UnsafeShuffleWriter: In addition to a failure during writing, we failed during 
cleanup.
   java.lang.NullPointerException
        at 
org.apache.spark.shuffle.sort.ShuffleInMemorySorter.getMemoryUsage(ShuffleInMemorySorter.java:133)
        at 
org.apache.spark.shuffle.sort.ShuffleExternalSorter.getMemoryUsage(ShuffleExternalSorter.java:269)
        at 
org.apache.spark.shuffle.sort.ShuffleExternalSorter.updatePeakMemoryUsed(ShuffleExternalSorter.java:273)
        at 
org.apache.spark.shuffle.sort.ShuffleExternalSorter.freeMemory(ShuffleExternalSorter.java:288)
        at 
org.apache.spark.shuffle.sort.ShuffleExternalSorter.cleanupResources(ShuffleExternalSorter.java:304)
        at 
org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:174)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
        at org.apache.spark.scheduler.Task.run(Task.scala:116)
        at 
org.apache.spark.executor.Executor$TaskRunner.org$apache$spark$executor$Executor$TaskRunner$$runInternal(Executor.scala:353)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:296)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
   20/11/17 19:48:00,682 [Executor task launch worker for task 1721] ERROR 
Executor: Exception in task 0.0 in stage 1917.0 (TID 1721)
   java.lang.OutOfMemoryError: Unable to acquire 32768 bytes of memory, got 0
        at 
org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:98)
        at 
org.apache.spark.shuffle.sort.ShuffleInMemorySorter.reset(ShuffleInMemorySorter.java:109)
        at 
org.apache.spark.shuffle.sort.ShuffleExternalSorter.spill(ShuffleExternalSorter.java:256)
        at 
org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:203)
        at 
org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:281)
        at 
org.apache.spark.memory.MemoryConsumer.allocatePage(MemoryConsumer.java:119)
        at 
org.apache.spark.shuffle.sort.ShuffleExternalSorter.acquireNewPageIfNecessary(ShuffleExternalSorter.java:359)
        at 
org.apache.spark.shuffle.sort.ShuffleExternalSorter.insertRecord(ShuffleExternalSorter.java:382)
        at 
org.apache.spark.shuffle.sort.UnsafeShuffleWriter.insertRecordIntoSorter(UnsafeShuffleWriter.java:246)
        at 
org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:167)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
        at org.apache.spark.scheduler.Task.run(Task.scala:116)
        at 
org.apache.spark.executor.Executor$TaskRunner.org$apache$spark$executor$Executor$TaskRunner$$runInternal(Executor.scala:353)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:296)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
   
   ```
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to