GitHub user JoshRosen opened a pull request:

    https://github.com/apache/spark/pull/7734

    [SPARK-9419] ShuffleMemoryManager and MemoryStore should track memory on a
    per-task, not per-thread, basis

    Spark's ShuffleMemoryManager and MemoryStore track memory on a per-thread
    basis, which causes problems in the handful of cases where a task uses
    multiple threads. In PythonRDD, RRDD, ScriptTransformation, and PipedRDD, we
    consume the input iterator in a separate thread in order to write it to an
    external process. As a result, these RDDs' input iterators are consumed in a
    different thread than the one that created them, which can cause problems in
    our memory allocation tracking. For example, if allocations are performed in
    one thread but deallocations are performed in a separate thread, then memory
    may be leaked or we may get errors complaining that more memory was
    allocated than was freed.
    
    I think that the right way to fix this is to perform our accounting on a
    per-task instead of per-thread basis. Note that the current per-thread
    tracking has caused problems in the past: SPARK-3731 (#2668) fixed a memory
    leak in PythonRDD that was caused by this issue (that fix is no longer
    necessary as of this patch).
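
The mismatch described above can be reproduced outside Spark with a toy sketch. This is not Spark's code: `MemoryTracker`, `run_task`, and the `_task` thread-local are hypothetical names invented for illustration, and the sketch assumes the helper thread is handed the task's ID explicitly. It shows one accounting table keyed by thread ID and one keyed by task ID, with the allocation made on the task's main thread and the release made on a helper thread.

```python
import threading
from collections import defaultdict

# Hypothetical stand-in for a per-thread "current task" holder.
_task = threading.local()


class MemoryTracker:
    """Toy accounting table; key_fn decides which owner an allocation is charged to."""

    def __init__(self, key_fn):
        self._key_fn = key_fn
        self._used = defaultdict(int)
        self._lock = threading.Lock()

    def acquire(self, num_bytes):
        with self._lock:
            self._used[self._key_fn()] += num_bytes

    def release(self, num_bytes):
        with self._lock:
            self._used[self._key_fn()] -= num_bytes

    def balances(self):
        # Only owners whose books do not balance are reported.
        with self._lock:
            return {k: v for k, v in self._used.items() if v != 0}


def run_task(tracker, task_id):
    _task.id = task_id
    tracker.acquire(1024)  # allocation on the task's main thread

    def writer():
        # The helper thread is told which task it belongs to (an assumption of
        # this sketch), then frees the memory the main thread allocated.
        _task.id = task_id
        tracker.release(1024)

    helper = threading.Thread(target=writer)
    helper.start()
    helper.join()


per_thread = MemoryTracker(key_fn=threading.get_ident)
per_task = MemoryTracker(key_fn=lambda: _task.id)

run_task(per_thread, task_id=7)
run_task(per_task, task_id=7)

print(per_thread.balances())  # two unbalanced entries, one per thread
print(per_task.balances())    # empty: the task's books balance
```

With thread-keyed accounting, the main thread appears to leak 1024 bytes while the helper thread appears to free memory it never acquired; with task-keyed accounting the two operations cancel.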

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark memory-tracking-fixes

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/7734.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #7734
    
----
commit c9e8e54df30350085770e6247dd21673c9ecd516
Author: Josh Rosen <[email protected]>
Date:   2015-07-28T17:58:34Z

    Use TaskAttemptIds to track unroll memory

commit 2e1e0f8870f68ab4611a4faebc7a779ba9bc174a
Author: Josh Rosen <[email protected]>
Date:   2015-07-28T18:09:11Z

    Use TaskAttemptIds to track shuffle memory

commit 1b0083b078772ed9eedf12f35bd7f6f7187aa50d
Author: Josh Rosen <[email protected]>
Date:   2015-07-28T18:09:45Z

    Roll back fix in PySpark, which is no longer necessary

commit 5e2f01e1a9eeb013443fe98c06c9c85ef368b99e
Author: Josh Rosen <[email protected]>
Date:   2015-07-28T18:13:06Z

    Fix capitalization

commit fa78ee80637580acc7d847f1858f3060fe9e8988
Author: Josh Rosen <[email protected]>
Date:   2015-07-28T18:17:57Z

    Move Executor's cleanup into Task so that TaskContext is defined when
    cleanup is performed

commit f57f3f2248050dbad4516b773cf225b10bc40d80
Author: Josh Rosen <[email protected]>
Date:   2015-07-28T18:22:56Z

    More thread -> task changes

commit 7b0f04b13b6816598f691b3de31820959e93550c
Author: Josh Rosen <[email protected]>
Date:   2015-07-28T18:40:10Z

    Fix ShuffleMemoryManagerSuite

----

