GitHub user andrewor14 opened a pull request:
https://github.com/apache/spark/pull/8805
[SPARK-10677] [SQL] [WIP] UnsafeExternalSorter atomic acquire release
Currently `UnsafeExternalSorter` acquires memory like this:
(1) Request memory for a task
(2) If (1) fails, spill and release all of our memory
(3) Try to request memory for the task again
(4) If (3) fails, throw exception
Because we release all of our memory, however, other tasks get a chance to
jump in and take the memory we just freed before step (3) runs, which causes
issues like SPARK-10474. Instead, we should release only a fraction of the
memory and keep the portion that we *know* we're going to use immediately
afterwards.
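To make the race concrete, here is a minimal single-threaded sketch of the two strategies. It is not the code in this pull request: `MemoryPool`, `SorterSketch`, `SpillSketch`, the `afterSpill` hook, and the byte counts are all hypothetical names and values chosen for illustration, and the pool grants requests all-or-nothing for simplicity. The hook stands in for another task that grabs whatever is free between the spill and the retry.

```java
// Hypothetical sketch: none of these classes are Spark's real API.

/** A toy shared memory pool with all-or-nothing grants (an assumption). */
final class MemoryPool {
    private long free;

    MemoryPool(long capacityBytes) { this.free = capacityBytes; }

    synchronized boolean tryAcquire(long bytes) {
        if (bytes > free) return false;
        free -= bytes;
        return true;
    }

    synchronized void release(long bytes) { free += bytes; }

    synchronized long free() { return free; }
}

/** A toy sorter that tracks only how many bytes it currently holds. */
final class SorterSketch {
    private final MemoryPool pool;
    private long held;

    SorterSketch(MemoryPool pool, long initialBytes) {
        if (!pool.tryAcquire(initialBytes)) {
            throw new IllegalStateException("pool too small for initial allocation");
        }
        this.pool = pool;
        this.held = initialBytes;
    }

    /**
     * Follows steps (1)-(4) from the description. With keepReserve = false the
     * spill releases everything; with keepReserve = true it keeps up to
     * `required` bytes so that no other task can take them.
     */
    boolean acquire(long required, boolean keepReserve, Runnable afterSpill) {
        if (pool.tryAcquire(required)) {            // step (1)
            held += required;
            return true;
        }
        // Step (2): spill. The data is on disk now, so held bytes can be returned.
        long reserved = keepReserve ? Math.min(required, held) : 0;
        pool.release(held - reserved);
        held = reserved;
        afterSpill.run();                           // window in which other tasks may acquire
        long shortfall = required - reserved;       // step (3): ask only for what is missing
        if (pool.tryAcquire(shortfall)) {
            held += shortfall;
            return true;
        }
        return false;                               // step (4): the real sorter would throw
    }
}

final class SpillSketch {
    /** Returns whether a 30-byte request succeeds when a greedy task races us. */
    static boolean demo(boolean keepReserve) {
        MemoryPool pool = new MemoryPool(100);
        SorterSketch sorter = new SorterSketch(pool, 80);   // leaves 20 bytes free
        Runnable greedyTask = () -> pool.tryAcquire(pool.free());
        return sorter.acquire(30, keepReserve, greedyTask);
    }

    public static void main(String[] args) {
        System.out.println("release all:  " + demo(false)); // false
        System.out.println("keep reserve: " + demo(true));  // true
    }
}
```

In this toy scenario the release-all path fails because the greedy task takes all 100 freed bytes before the retry, while the keep-reserve path never gives up the 30 bytes it is about to use.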
WIP because tests are coming.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/andrewor14/spark unsafe-atomic-release-acquire
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/8805.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #8805
----
commit 6ee2025f26883ca652565dea82562aa1acd4aaee
Author: Andrew Or <[email protected]>
Date: 2015-09-17T23:12:34Z
Atomic release and acquire
This commit adds the ability to reserve bytes when a sorter
spills, so that we release only a fraction of the memory we
occupy. Note that we need to do this at the byte level because
we have things like overflow pages.
----