GitHub user sharkdtu opened a pull request:

    https://github.com/apache/spark/pull/14479

    [SPARK-16873] [Core] Fix SpillReader NPE when spillFile has no data

    ## What changes were proposed in this pull request?
    
    SpillReader throws an NPE when the spill file has no data. See the following logs:
    
    16/07/31 20:54:04 INFO collection.ExternalSorter: spill memory to file:/data4/yarnenv/local/usercache/tesla/appcache/application_1465785263942_56138/blockmgr-db5f46c3-d7a4-4f93-8b77-565e469696fb/09/temp_shuffle_ec3ece08-4569-4197-893a-4a5dfcbbf9fa, fileSize:0.0 B
    16/07/31 20:54:04 WARN memory.TaskMemoryManager: leak 164.3 MB memory from org.apache.spark.util.collection.ExternalSorter@3db4b52d
    16/07/31 20:54:04 ERROR executor.Executor: Managed memory leak detected; size = 190458101 bytes, TID = 23585
    16/07/31 20:54:04 ERROR executor.Executor: Exception in task 1013.0 in stage 18.0 (TID 23585)
    java.lang.NullPointerException
        at org.apache.spark.util.collection.ExternalSorter$SpillReader.cleanup(ExternalSorter.scala:624)
        at org.apache.spark.util.collection.ExternalSorter$SpillReader.nextBatchStream(ExternalSorter.scala:539)
        at org.apache.spark.util.collection.ExternalSorter$SpillReader.<init>(ExternalSorter.scala:507)
        at org.apache.spark.util.collection.ExternalSorter$SpillableIterator.spill(ExternalSorter.scala:816)
        at org.apache.spark.util.collection.ExternalSorter.forceSpill(ExternalSorter.scala:251)
        at org.apache.spark.util.collection.Spillable.spill(Spillable.scala:109)
        at org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:154)
        at org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:249)
        at org.apache.spark.memory.MemoryConsumer.allocatePage(MemoryConsumer.java:112)
        at org.apache.spark.shuffle.sort.ShuffleExternalSorter.acquireNewPageIfNecessary(ShuffleExternalSorter.java:346)
        at org.apache.spark.shuffle.sort.ShuffleExternalSorter.insertRecord(ShuffleExternalSorter.java:367)
        at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.insertRecordIntoSorter(UnsafeShuffleWriter.java:237)
        at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:164)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
    16/07/31 20:54:30 INFO executor.Executor: Executor is trying to kill task 1090.1 in stage 18.0 (TID 23793)
    16/07/31 20:54:30 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown
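
The trace above shows the reader's constructor calling nextBatchStream, which in turn calls cleanup before any stream has been opened. The following is a minimal sketch of that failure mode and of one possible null guard. It is not Spark's code and not necessarily the change made in this pull request: the class EmptySpillReaderSketch, its fields, and the nullSafe switch are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a spill reader; NOT Spark's ExternalSorter.SpillReader.
class EmptySpillReaderSketch {
    private final byte[][] batches;  // assumption: one byte array per serialized batch
    private final boolean nullSafe;  // assumption: toggles the guarded cleanup path
    private int batchId = 0;
    private InputStream stream;      // still null while the constructor is running

    EmptySpillReaderSketch(byte[][] batches, boolean nullSafe) throws IOException {
        this.batches = batches;
        this.nullSafe = nullSafe;
        // As in the trace: the constructor eagerly asks for the first batch.
        this.stream = nextBatchStream();
    }

    private InputStream nextBatchStream() throws IOException {
        if (batchId < batches.length) {
            return new ByteArrayInputStream(batches[batchId++]);
        }
        // No batches at all (empty spill file): go straight to cleanup.
        cleanup();
        return null;
    }

    void cleanup() throws IOException {
        batchId = batches.length;    // prevent reading any further batch
        InputStream s = stream;
        stream = null;
        if (nullSafe) {
            if (s != null) {
                s.close();           // guarded: nothing to close when no stream was opened
            }
        } else {
            s.close();               // unguarded: throws NullPointerException when s is null
        }
    }

    // Returns true if construction succeeds, false if it fails with an NPE.
    static boolean constructs(byte[][] batches, boolean nullSafe) {
        try {
            new EmptySpillReaderSketch(batches, nullSafe);
            return true;
        } catch (NullPointerException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[][] empty = new byte[0][];
        System.out.println("unguarded cleanup, empty spill: " + constructs(empty, false));
        System.out.println("guarded cleanup, empty spill: " + constructs(empty, true));
    }
}
```

In this sketch the unguarded path fails only when there are no batches, which matches the fileSize:0.0 B line in the log: a reader over at least one batch opens a stream first and never reaches cleanup with a null reference during construction.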
    
    
    ## How was this patch tested?
    
    Manual test.
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sharkdtu/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14479.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14479
    
----
commit d8cf2b493a589b745d54b3b903848d4d0827e642
Author: sharkd <sharkd...@gmail.com>
Date:   2016-07-12T23:59:26Z

    rebase apache/master

commit 8b0c40ab555336899b684fc2a1d6cc1c0886cd11
Author: sharkd <sharkd...@gmail.com>
Date:   2016-07-11T16:49:56Z

    fix style

commit 888cf1fa2187e4f92286c74ba6a05196348eff79
Author: sharkd <sharkd...@gmail.com>
Date:   2016-07-12T23:59:26Z

    rebase apache/master

commit c470ab74b1bfc4814f0ca683102ed55b6c2a1410
Author: sharkd <sharkd...@gmail.com>
Date:   2016-07-11T16:49:56Z

    fix style

commit 8ae5ec71c9e12b4004d0563c9b581b590890369f
Author: sharkdtu <shark...@tencent.com>
Date:   2016-08-03T11:51:45Z

    SpillReader NPE when spillFile has no data

----

