GitHub user guowei2 opened a pull request:

    https://github.com/apache/spark/pull/1822

    [SPARK-2873] using ExternalAppendOnlyMap to resolve OOM when aggregating

    Use ExternalAppendOnlyMap to avoid OOM errors when aggregating.
    The "spark.shuffle.spill" setting turns this behavior on or off.
    Hive UDAFs are not supported yet, because the UDAF would need to be
    Serializable.

    Join has the same problem, but using ExternalAppendOnlyMap as
    CoGroupedRDD does seems to reduce performance. I am trying another
    approach that also uses ExternalAppendOnlyMap, but it needs testing.
    I will commit it in a separate batch.
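
For readers unfamiliar with the idea, here is a minimal conceptual sketch in Python of spill-to-disk aggregation. It is not the code in this pull request and not Spark's ExternalAppendOnlyMap. The class name `SpillingAggregator` and the parameters `max_keys_in_memory` and `spill_enabled` are hypothetical; `spill_enabled` only stands in for the role of "spark.shuffle.spill". The sketch spills after a fixed number of keys and sorts runs by key, which is a simplification: Spark's class decides when to spill from an estimate of memory use.

```python
import heapq
import pickle
import tempfile
from itertools import groupby
from operator import itemgetter


class SpillingAggregator:
    """Toy external aggregation map: combine in memory, spill sorted runs to disk."""

    def __init__(self, create_combiner, merge_value, merge_combiners,
                 max_keys_in_memory=1000, spill_enabled=True):
        self.create_combiner = create_combiner
        self.merge_value = merge_value
        self.merge_combiners = merge_combiners
        self.max_keys = max_keys_in_memory   # hypothetical threshold (key count)
        self.spill_enabled = spill_enabled   # stands in for spark.shuffle.spill
        self.current = {}                    # in-memory key -> combiner
        self.spills = []                     # temp files, one key-sorted run each

    def insert(self, key, value):
        if key in self.current:
            # Key already buffered: fold the value into its combiner.
            self.current[key] = self.merge_value(self.current[key], value)
            return
        # New key: make room first if the buffer is full and spilling is on.
        if self.spill_enabled and len(self.current) >= self.max_keys:
            self._spill()
        self.current[key] = self.create_combiner(value)

    def _spill(self):
        # Write the buffer as one key-sorted run, then start a fresh buffer.
        run = tempfile.TemporaryFile()
        for item in sorted(self.current.items(), key=itemgetter(0)):
            pickle.dump(item, run)
        run.seek(0)
        self.spills.append(run)
        self.current = {}

    @staticmethod
    def _read_run(run):
        # Stream (key, combiner) pairs back from a spilled run.
        while True:
            try:
                yield pickle.load(run)
            except EOFError:
                return

    def items(self):
        """Merge all runs and the in-memory buffer. Intended to be called once."""
        runs = [self._read_run(run) for run in self.spills]
        runs.append(iter(sorted(self.current.items(), key=itemgetter(0))))
        merged = heapq.merge(*runs, key=itemgetter(0))
        # Equal keys are adjacent in the merged stream, so combine per group.
        for key, group in groupby(merged, key=itemgetter(0)):
            combiners = (combiner for _, combiner in group)
            result = next(combiners)
            for combiner in combiners:
                result = self.merge_combiners(result, combiner)
            yield key, result
```

With `spill_enabled=False` the sketch keeps everything in one dictionary, which is the situation where a large aggregation can run out of memory. With it enabled, at most `max_keys_in_memory` keys are held at a time and partial results for the same key are combined during the final merge.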


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/guowei2/spark sql-memory-patch

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1822.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1822
    
----
commit 87627e700e3205499726aa1ab5c1ee6e56433b5e
Author: guowei <[email protected]>
Date:   2014-08-06T04:02:38Z

    [SPARK-2873] use ExternalAppendOnlyMap to resolve aggregate's OOM

commit f889700ec522aab688c8c3be8bb4a9402776f35f
Author: guowei <[email protected]>
Date:   2014-08-06T04:11:43Z

    [SPARK-2873] use ExternalAppendOnlyMap to resolve aggregate's OOM

commit 21b573548f742d8e6066364642bc70eece512bd5
Author: guowei <[email protected]>
Date:   2014-08-06T07:53:18Z

    [SPARK-2873] use ExternalAppendOnlyMap to resolve aggregate's OOM

commit d2be8323535c106f336ebb8148acf54cd351cdae
Author: guowei <[email protected]>
Date:   2014-08-06T08:50:48Z

    [SPARK-2873] use ExternalAppendOnlyMap to resolve aggregate's OOM

commit e3a88b115c608edf12806d2698f74e9289508e7d
Author: guowei <[email protected]>
Date:   2014-08-06T09:13:36Z

    [SPARK-2873] use ExternalAppendOnlyMap to resolve aggregate's OOM

commit 2a4786a92e00c671bb422f2de38547dde7721a9c
Author: guowei <[email protected]>
Date:   2014-08-06T09:14:14Z

    Merge branch 'sql-memory-patch' of https://github.com/guowei2/spark into sql-memory-patch

commit 475da9d3b6304892af3d41471aceb3f81b0cc490
Author: guowei <[email protected]>
Date:   2014-08-06T09:15:39Z

    [SPARK-2873] use ExternalAppendOnlyMap to resolve aggregate's OOM

----


