[GitHub] spark pull request #20480: [Spark-23306] Fix the oom caused by contention

2018-02-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20480


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20480: [Spark-23306] Fix the oom caused by contention

2018-02-01 Thread zhzhan
GitHub user zhzhan opened a pull request:

https://github.com/apache/spark/pull/20480

[Spark-23306] Fix the oom caused by contention

## What changes were proposed in this pull request?

There is a race condition in TaskMemoryManager, which may cause an OOM.

Memory released by a spill may be taken by another task, because there is a 
gap between releaseMemory and acquireMemory (e.g., in UnifiedMemoryManager). 
This causes an OOM if the current consumer is the only one that can perform 
a spill. It can happen to BytesToBytesMap, as it only spills the required bytes.

The fix is to loop on the current consumer while it still has memory to release.
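
To make the gap concrete, here is a minimal, single-threaded sketch of the scenario. It is not the code in this patch and does not use Spark's classes: `Pool`, `Consumer`, `stealOnNextRelease`, `acquireSpillOnce`, and `acquireWithRetry` are hypothetical names, and the competing task is simulated deterministically rather than with real threads. It only illustrates why retrying the spill on the current consumer can succeed where a single spill attempt does not.

```java
public class SpillRetrySketch {

    /** Toy shared pool; a stand-in for a memory manager, not Spark's API. */
    public static final class Pool {
        public long free;
        /** Bytes a simulated competing task grabs right after a release. */
        public long stealOnNextRelease;

        public Pool(long free) { this.free = free; }

        /** Grants as many of the requested bytes as are currently free. */
        public long acquire(long n) {
            long got = Math.min(n, free);
            free -= got;
            return got;
        }

        /** Returns bytes to the pool; the competitor may take them at once. */
        public void release(long n) {
            free += n;
            long stolen = Math.min(stealOnNextRelease, free);
            free -= stolen;
            stealOnNextRelease -= stolen;
        }
    }

    /** Toy consumer that spills only the bytes it is asked for. */
    public static final class Consumer {
        public long used;
        private final Pool pool;

        public Consumer(long used, Pool pool) {
            this.used = used;
            this.pool = pool;
        }

        public long spill(long required) {
            long released = Math.min(required, used);
            used -= released;
            pool.release(released);
            return released;
        }
    }

    /** Spills once, then tries to acquire once; the freed bytes may be gone. */
    public static long acquireSpillOnce(Pool pool, Consumer self, long required) {
        long got = pool.acquire(required);
        if (got < required) {
            self.spill(required - got);
            got += pool.acquire(required - got);
        }
        return got;
    }

    /** Keeps spilling the same consumer while it still holds memory. */
    public static long acquireWithRetry(Pool pool, Consumer self, long required) {
        long got = pool.acquire(required);
        // Terminates because every spill strictly reduces self.used.
        while (got < required && self.used > 0) {
            self.spill(required - got);
            got += pool.acquire(required - got);
        }
        return got;
    }

    public static void main(String[] args) {
        Pool p1 = new Pool(0);
        p1.stealOnNextRelease = 40;
        Consumer c1 = new Consumer(100, p1);
        System.out.println("spill once: " + acquireSpillOnce(p1, c1, 40));

        Pool p2 = new Pool(0);
        p2.stealOnNextRelease = 40;
        Consumer c2 = new Consumer(100, p2);
        System.out.println("with retry: " + acquireWithRetry(p2, c2, 40));
    }
}
```

In this toy setup the single-attempt path gets nothing back even though the consumer still holds 60 bytes it could spill, which is the situation that would surface as an OOM. The retry path spills a second time and obtains the full 40 bytes.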

## How was this patch tested?

The race condition is hard to reproduce, but the current logic seems to be 
causing the issue.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhzhan/spark oom

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20480.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20480


commit df96f0c126833b0e812cd715ae1538dbd38afac4
Author: Zhan Zhang 
Date:   2018-01-12T19:51:19Z

fix the oom caused by contention



