GitHub user andrewor14 opened a pull request:
https://github.com/apache/spark/pull/9084
[SPARK-10983] Unified memory manager
This patch unifies the memory management of the storage and execution
regions such that either side can borrow memory from the other. When memory
pressure arises, storage will be evicted in favor of execution. To avoid
regressions in cases where storage is crucial, we dynamically allocate a
fraction of space for storage that execution cannot evict. Several
configurations are introduced:
- **spark.memory.fraction (default 0.75)**: fraction of the heap space
used for execution and storage. The lower this is, the more frequently spills
and cached data eviction occur. The purpose of this config is to set aside
memory for internal metadata, user data structures, and imprecise size
estimation in the case of sparse, unusually large records.
- **spark.memory.storageFraction (default 0.5)**: size of the storage
region within the space set aside by `spark.memory.fraction`. Cached data
may only be evicted if total storage exceeds this region.
- **spark.memory.useLegacyMode (default false)**: whether to use the memory
management that existed in Spark 1.5 and before. This is mainly for backward
compatibility.
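As a rough illustration of how the two fractions compose, the sketch below computes region sizes from a heap size. It assumes the unified region is simply the heap size times `spark.memory.fraction`, and ignores any reserved memory or other adjustments the actual implementation may make. `region_sizes` is a hypothetical helper, not a Spark API.

```python
def region_sizes(heap_bytes, memory_fraction=0.75, storage_fraction=0.5):
    """Return (unified, protected_storage, other) sizes in bytes.

    Simplified model of the configuration description, not Spark's code.
    """
    # Space shared by execution and storage (spark.memory.fraction).
    unified = int(heap_bytes * memory_fraction)
    # Storage region that execution cannot evict (spark.memory.storageFraction).
    protected_storage = int(unified * storage_fraction)
    # Remainder left for internal metadata and user data structures.
    other = heap_bytes - unified
    return unified, protected_storage, other


unified, protected_storage, other = region_sizes(4 * 1024**3)
```

Under these assumptions and the default values, a 4 GiB heap would give a 3 GiB unified region, of which 1.5 GiB is protected storage, with 1 GiB left outside the unified region.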
For a detailed description of the design, see
[SPARK-10000](https://issues.apache.org/jira/browse/SPARK-10000).
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/andrewor14/spark unified-memory-manager
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/9084.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #9084
----
commit 6481bc17d66b6045195138de9ac490d479019182
Author: Andrew Or <[email protected]>
Date: 2015-10-09T02:12:13Z
Add UnifiedMemoryManager skeleton
commit e08e1fe14cc2f137a70a4b73e4f1950b8ef4cd63
Author: Andrew Or <[email protected]>
Date: 2015-10-09T20:29:03Z
First implementation of UnifiedMemoryManager
As of this commit, acquiring execution memory may cause cached
blocks to be evicted. There are still a couple of TODOs, notably
figuring out what the `maxExecutionMemory` and `maxStorageMemory`
should actually be. They may not be fixed since either side can
now borrow from the other.
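The borrowing and eviction behavior described in this patch can be pictured with a toy model. This is only a sketch: `ToyUnifiedPool` and its methods are hypothetical names, sizes are plain integers, and it assumes storage may borrow free space but never evicts execution memory, while execution may evict only the storage that exceeds the protected region.

```python
class ToyUnifiedPool:
    """Toy model: execution and storage share `total` bytes.

    Storage at or below `protected` bytes cannot be evicted by execution.
    """

    def __init__(self, total, protected):
        self.total = total
        self.protected = protected
        self.storage_used = 0
        self.execution_used = 0

    def free(self):
        return self.total - self.storage_used - self.execution_used

    def acquire_storage(self, n):
        # Assumption: storage borrows free space but never evicts execution.
        if n <= self.free():
            self.storage_used += n
            return True
        return False

    def acquire_execution(self, n):
        # Evict only as much storage as needed, and only the part
        # that lies above the protected region.
        shortfall = max(0, n - self.free())
        evictable = max(0, self.storage_used - self.protected)
        self.storage_used -= min(shortfall, evictable)
        # Grant whatever fits after eviction (possibly less than requested).
        granted = min(n, self.free())
        self.execution_used += granted
        return granted


pool = ToyUnifiedPool(total=100, protected=30)
pool.acquire_storage(80)             # storage borrows beyond its region
first = pool.acquire_execution(50)   # evicts 30 bytes of cached data
second = pool.acquire_execution(40)  # can only evict down to the protected 30
```

In this model neither side has a fixed maximum: the limit each side sees depends on how much the other currently holds.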
commit 812c9744e9d9047518f9cb596386753d04b5d046
Author: Andrew Or <[email protected]>
Date: 2015-10-09T21:34:39Z
Use a global lock for all memory manager interactions
Without this, we cannot avoid both deadlocks and race conditions,
because the individual components (ShuffleMemoryManager,
MemoryStore, and MemoryManager) each have their own lock.
This commit allows us to simplify several unintuitive control
flows that were introduced to avoid acquiring locks in different
orders. Since we have only one lock now, these code blocks can
be significantly simplified.
A foreseeable downside is reduced parallelism, but memory
acquisitions and releases don't actually occur that frequently,
so performance should not suffer noticeably. This should be
investigated further.
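The single-lock idea can be sketched as follows. This is an illustrative analogy in Python, not the patch's Scala code: `SharedLockComponent` and `move` are hypothetical, and a reentrant lock is assumed so that a component method can be called while the lock is already held.

```python
import threading


class SharedLockComponent:
    """Hypothetical component that guards its state with a lock passed in."""

    def __init__(self, lock):
        self._lock = lock
        self.used = 0

    def acquire(self, n):
        with self._lock:
            self.used += n

    def release(self, n):
        with self._lock:
            self.used -= n


# One reentrant lock shared by every component.
global_lock = threading.RLock()
storage = SharedLockComponent(global_lock)
execution = SharedLockComponent(global_lock)


def move(src, dst, n):
    # Both components use the same lock, so there is no lock ordering
    # to get wrong when one operation touches both of them.
    with global_lock:
        src.release(n)
        dst.acquire(n)


storage.acquire(10)
move(storage, execution, 4)
```

With separate per-component locks, `move(storage, execution, ...)` and `move(execution, storage, ...)` running concurrently would need a consistent acquisition order to stay deadlock-free. Sharing one lock removes that concern at the cost of serializing all callers.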
commit cc5f64c50ad7cadda81c550b90ac6894b88fa92a
Author: Andrew Or <[email protected]>
Date: 2015-10-09T22:12:53Z
Use a dynamic max for execution and storage
Previously, ShuffleMemoryManager allowed each task to acquire
up to 1/N of the entire storage + execution region. What we want
is more like 1/N of the space not occupied by storage, since the
"max" now varies over time.
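The difference between the two per-task limits is simple arithmetic. The sketch below uses hypothetical function names and integer byte counts, and models only the 1/N cap described in this commit message; any other bounds the real ShuffleMemoryManager applies are left out.

```python
def old_max_per_task(total_bytes, num_active_tasks):
    # Previous behavior: 1/N of the entire storage + execution region.
    return total_bytes // num_active_tasks


def dynamic_max_per_task(total_bytes, storage_used_bytes, num_active_tasks):
    # Described behavior: 1/N of the space not currently occupied by
    # storage, so the cap moves as cached data grows or is evicted.
    return (total_bytes - storage_used_bytes) // num_active_tasks


old_cap = old_max_per_task(1000, 4)
new_cap = dynamic_max_per_task(1000, 400, 4)
```

With 1000 bytes in total, 400 held by storage, and 4 active tasks, the old cap is 250 per task while the dynamic cap is 150.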
commit ad8a6c47ff1338724e42cc1ffee229a1e5611e12
Author: Andrew Or <[email protected]>
Date: 2015-10-09T22:35:26Z
Fix notifyAll() IllegalMonitorStateException
This happened because we were still calling `this.notifyAll()`
in ShuffleMemoryManager when we were holding the `memoryManager`
lock. Easy fix.
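The same class of mistake can be reproduced outside the JVM. The sketch below is a Python analogy rather than the patch's code: `threading.Condition.notify_all()` raises `RuntimeError` when the caller does not hold that condition's lock, much as Java's `notifyAll()` throws `IllegalMonitorStateException` when the caller does not own the object's monitor. The two condition objects are hypothetical stand-ins for `this` and `memoryManager`.

```python
import threading

own_cond = threading.Condition()      # stands in for `this`
manager_cond = threading.Condition()  # stands in for `memoryManager`


def notify_wrong_monitor():
    # Holding manager_cond's lock while notifying on own_cond fails,
    # because own_cond's lock was never acquired.
    with manager_cond:
        try:
            own_cond.notify_all()
        except RuntimeError:
            return "error"
    return "ok"


def notify_right_monitor():
    # Notify on the same object whose lock is held.
    with manager_cond:
        manager_cond.notify_all()
    return "ok"
```

Here `notify_wrong_monitor()` returns `"error"` and `notify_right_monitor()` returns `"ok"`: the notification has to go through the object whose lock the caller actually holds.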
commit 0dc9a9597bbd0c26a68dfdd0ed0a9fb3e9d13a79
Author: Andrew Or <[email protected]>
Date: 2015-10-09T22:51:45Z
Minor: update a few comments
commit 5a4ffb907614105f311d9c76d907a803487c5207
Author: Andrew Or <[email protected]>
Date: 2015-10-09T22:56:05Z
Register blocks evicted by execution
commit b519540804902ec3acd21814b94f5d9ec06cb088
Author: Andrew Or <[email protected]>
Date: 2015-10-09T23:02:42Z
Minor: more comment updates
commit a65799e930e9cd4de6a54ff87d27632080def773
Author: Andrew Or <[email protected]>
Date: 2015-10-12T21:43:08Z
Add tests for UnifiedMemoryManager + TODOs
Tests are passing in this commit, but there are follow-ups that
need to be done (see TODOs added in this commit). More tests will
be added in the future.
commit 6e913a5b9ef0a1758c66756a9e2b2f03ec8699a6
Author: Andrew Or <[email protected]>
Date: 2015-10-12T23:46:56Z
Clean up test code, resolve TODOs
As of this commit, all *MemoryManagerSuite test suites are
documented and passing.
commit 3eef5a4c69c41ab6502461128d3b6181ab38f56b
Author: Andrew Or <[email protected]>
Date: 2015-10-12T23:51:07Z
Fix tests
TaskContext was not stubbed correctly in ShuffleMemoryManagerSuite.
In particular, this patch added some code that does some things
to the active TaskMetrics, but this was not part of the mocked
TaskContext object.
commit c56600b216f7d04dc3837cd1489fa5c661953de1
Author: Andrew Or <[email protected]>
Date: 2015-10-13T00:59:15Z
Minor: Add TODOs, fix a few comments
commit 01ff5335b2b6554ed9388b015145b8dfd9bc1977
Author: Andrew Or <[email protected]>
Date: 2015-10-13T00:59:36Z
Merge branch 'master' of github.com:apache/spark into unified-memory-manager
----