GitHub user andrewor14 opened a pull request:

    https://github.com/apache/spark/pull/10240

    [SPARK-12155] Fix executor OOM in unified memory management

    **Problem.** In unified memory management, acquiring execution memory may 
lead to eviction of storage memory. However, the space freed from evicting 
cached blocks is distributed among all active tasks. Thus, an incorrect upper 
bound on the execution memory per task can cause the acquisition to fail, 
leading to OOMs and premature spills.
    
    **Example.** Suppose total memory is 1000B, cached blocks occupy 900B, 
`spark.memory.storageFraction` is 0.4, and there are two active tasks. In this 
case, only 100B is currently available to execution, so the cap on task 
execution memory is 100B / 2 = 50B. If task A tries to acquire 200B, it will 
evict 100B of storage but can only acquire 50B because of 
the incorrect cap. For another example, see this [regression 
test](https://github.com/andrewor14/spark/blob/fix-oom/core/src/test/scala/org/apache/spark/memory/UnifiedMemoryManagerSuite.scala#L233).
    
    **Solution.** Fix the cap on task execution memory. It should take into 
account the space that could have been freed by storage in addition to the 
current amount of memory available to execution. In the example above, storage 
can be evicted down to 0.4 * 1000B = 400B, freeing up to 500B, so the correct 
cap would have been (100B + 500B) / 2 = 600B / 2 = 300B.
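    
    **Sketch.** The arithmetic above can be reproduced with a small Python 
model. This is an illustration only, not the code in this patch: the function 
names are hypothetical, and it assumes that no execution memory is in use yet 
and that storage can be evicted down to `total * spark.memory.storageFraction`.

```python
def evictable_storage(total, storage_used, storage_fraction):
    """Storage memory above the protected storage region (hypothetical helper)."""
    protected = round(total * storage_fraction)
    return max(0, storage_used - protected)


def naive_cap(total, storage_used, num_tasks):
    """Per-task cap that counts only memory currently free for execution."""
    return (total - storage_used) // num_tasks


def corrected_cap(total, storage_used, storage_fraction, num_tasks):
    """Per-task cap that also counts storage memory that eviction could free."""
    free = total - storage_used
    evictable = evictable_storage(total, storage_used, storage_fraction)
    return (free + evictable) // num_tasks


# Numbers from the example: 1000B total, 900B cached, storageFraction 0.4, two tasks.
print(naive_cap(1000, 900, 2))           # 50
print(corrected_cap(1000, 900, 0.4, 2))  # 300
```

With the naive cap, task A's 200B request is limited to 50B; with the corrected 
cap of 300B, the full 200B request fits.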

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andrewor14/spark fix-oom

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10240.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10240
    
----
commit 68337754b8541d8ac497c19ae531a79a91904708
Author: Andrew Or <[email protected]>
Date:   2015-12-09T23:45:48Z

    Pass in callbacks (gross)

commit 35392f5e5e8c152e8cf516ffb4f7c8def0df8361
Author: Andrew Or <[email protected]>
Date:   2015-12-10T00:02:18Z

    Notify all on task completion

commit cd0c680161e9c6f8044401ec1c2c3a4e83d4b6a1
Author: Andrew Or <[email protected]>
Date:   2015-12-10T02:10:13Z

    Rename silly method names + add detailed comments

----

