GitHub user ericl opened a pull request:

    https://github.com/apache/spark/pull/15016

    [SPARK-16525] RowBasedKeyValueBatch should use default page size to prevent OOMs

    ## What changes were proposed in this pull request?
    
    Before this change, we always allocated 64MB per aggregation task, even
    in low-memory situations such as local mode. This change makes
    RowBasedKeyValueBatch use the memory manager's default page size, which is
    automatically reduced below 64MB in these situations.
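
To illustrate why a memory-derived default helps, here is a minimal sketch of how a page size could be scaled down from 64MB when memory per core is scarce. It is not Spark's actual code: the object and method names, the 1MB lower bound, and the safety factor of 16 are assumptions made for illustration. Only the 64MB upper bound comes from the description above.

```scala
// Hypothetical sketch; names and constants other than the 64MB cap are assumptions.
object PageSizeSketch {
  val MinPageSize: Long = 1L * 1024 * 1024   // assumed lower bound: 1MB
  val MaxPageSize: Long = 64L * 1024 * 1024  // upper bound from the description: 64MB
  val SafetyFactor: Int = 16                 // assumed headroom divisor

  /** Smallest power of two that is >= n, for n >= 1. */
  def nextPowerOf2(n: Long): Long = {
    require(n >= 1, "n must be positive")
    val highBit = java.lang.Long.highestOneBit(n)
    if (highBit == n) n else highBit << 1
  }

  /** Page size derived from available execution memory and core count,
    * clamped to the range [MinPageSize, MaxPageSize]. */
  def defaultPageSize(executionMemoryBytes: Long, cores: Int): Long = {
    require(cores > 0, "cores must be positive")
    val perCore = math.max(1L, executionMemoryBytes / cores / SafetyFactor)
    math.min(MaxPageSize, math.max(MinPageSize, nextPowerOf2(perCore)))
  }
}
```

Under these assumptions, 512MB of execution memory shared by 32 cores gives a 1MB page, whereas a fixed 64MB page per task would ask for 2GB across 32 concurrent tasks.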
    
    cc @ooq @JoshRosen 
    
    ## How was this patch tested?
    
    Tested manually by running `bin/spark-shell --master=local[32]` and
    verifying that the following query does not crash:
    `(1 to math.pow(10, 3).toInt).toDF("n").withColumn("m", 'n % 2).groupBy('m).agg(sum('n)).show`

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ericl/spark sc-4483

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15016.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #15016
    
----
commit 5e642aeb60e41fbc8e09789c3693ebd76ba78324
Author: Eric Liang <[email protected]>
Date:   2016-09-08T20:53:22Z

    use default page size

commit 1d6a8456a23e3f582855997d3f0b0a9dbbd3018a
Author: Eric Liang <[email protected]>
Date:   2016-09-08T21:08:43Z

    Thu Sep  8 14:08:43 PDT 2016

----


