GitHub user andrewor14 opened a pull request:
https://github.com/apache/spark/pull/10240
[SPARK-12155] Fix executor OOM in unified memory management
**Problem.** In unified memory management, acquiring execution memory may
lead to eviction of storage memory. However, the space freed from evicting
cached blocks is distributed among all active tasks. Thus, an incorrect upper
bound on the execution memory per task can cause the acquisition to fail,
leading to OOMs and premature spills.
**Example.** Suppose total memory is 1000B, cached blocks occupy 900B,
`spark.memory.storageFraction` is 0.4, and there are two active tasks. In this
case, the cap on task execution memory is 100B / 2 = 50B. If task A tries to
acquire 200B, it will evict 100B of storage but can only acquire 50B because of
the incorrect cap. For another example, see this [regression
test](https://github.com/andrewor14/spark/blob/fix-oom/core/src/test/scala/org/apache/spark/memory/UnifiedMemoryManagerSuite.scala#L233).
**Solution.** Fix the cap on task execution memory. It should take into
account the space that could have been freed by storage in addition to the
current amount of memory available to execution. In the example above, the
correct cap would have been 600B / 2 = 300B.
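The arithmetic in the example can be reproduced with a minimal sketch. This is not the code in the patch: the object and method names below are hypothetical, and the sketch assumes the storage pool is fully occupied by cached blocks (as in the example) and that execution can reclaim only the part of the storage pool that exceeds the storage region (`spark.memory.storageFraction` times total memory, here 0.4 * 1000B = 400B).

```scala
// Illustrative sketch only; names are hypothetical, not taken from the patch.
object TaskMemoryCapSketch {

  /** Old behavior: the cap only considers memory execution currently owns. */
  def naiveCap(executionPoolSize: Long, numActiveTasks: Int): Long =
    executionPoolSize / numActiveTasks

  /**
   * Fixed behavior: the cap also counts storage memory that eviction could
   * free, i.e. whatever storage holds beyond its protected region.
   * Assumes the storage pool is fully used by cached blocks.
   */
  def correctedCap(
      executionPoolSize: Long,
      storagePoolSize: Long,
      storageRegionSize: Long,
      numActiveTasks: Int): Long = {
    val reclaimable = math.max(0L, storagePoolSize - storageRegionSize)
    (executionPoolSize + reclaimable) / numActiveTasks
  }

  def main(args: Array[String]): Unit = {
    // Numbers from the example: 1000B total, 900B cached, 400B storage region.
    val executionPool = 1000L - 900L
    println(naiveCap(executionPool, 2))                   // 100B / 2
    println(correctedCap(executionPool, 900L, 400L, 2))   // (100B + 500B) / 2
  }
}
```

With the example's numbers, the old cap is 50B while the corrected cap is 300B, so task A's 200B request fits under the corrected cap.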
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/andrewor14/spark fix-oom
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/10240.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #10240
----
commit 68337754b8541d8ac497c19ae531a79a91904708
Author: Andrew Or <[email protected]>
Date: 2015-12-09T23:45:48Z
Pass in callbacks (gross)
commit 35392f5e5e8c152e8cf516ffb4f7c8def0df8361
Author: Andrew Or <[email protected]>
Date: 2015-12-10T00:02:18Z
Notify all on task completion
commit cd0c680161e9c6f8044401ec1c2c3a4e83d4b6a1
Author: Andrew Or <[email protected]>
Date: 2015-12-10T02:10:13Z
Rename silly method names + add detailed comments
----