GitHub user guowei2 opened a pull request:
https://github.com/apache/spark/pull/1822
[SPARK-2873] using ExternalAppendOnlyMap to resolve OOM when aggregating
Use ExternalAppendOnlyMap to avoid OOM errors when aggregating.
The "spark.shuffle.spill" setting controls whether it is enabled.
Hive UDAFs are not supported yet, because the UDAF would need to be Serializable.
Join has the same problem, but using ExternalAppendOnlyMap the way CoGroupedRDD
does seems to reduce performance. I am trying another approach that also uses
ExternalAppendOnlyMap, but it needs testing. I will commit it in a separate batch.
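
For readers unfamiliar with the idea, here is a rough, self-contained sketch of spill-capable aggregation. It is not the code in this pull request and not Spark's ExternalAppendOnlyMap. The class name SpillingSumMap, the maxInMemoryKeys threshold, and the text spill format are all hypothetical, chosen only for illustration. The spillEnabled flag stands in for the role "spark.shuffle.spill" plays in the description above.

```scala
import java.io.{File, PrintWriter}
import scala.collection.mutable
import scala.io.Source

// Toy spill-to-disk sum aggregator (illustration only, NOT Spark's implementation).
// maxInMemoryKeys is a hypothetical threshold; keys are assumed to contain no newlines.
class SpillingSumMap(maxInMemoryKeys: Int, spillEnabled: Boolean) {
  private val current = mutable.HashMap.empty[String, Long]
  private val spills = mutable.ArrayBuffer.empty[File]

  // Merge the value into the in-memory partial sum; spill once the map is full.
  def insert(key: String, value: Long): Unit = {
    current(key) = current.getOrElse(key, 0L) + value
    if (spillEnabled && current.size >= maxInMemoryKeys) spill()
  }

  // Write the in-memory partial sums to a temp file as "value<TAB>key" and clear the map.
  private def spill(): Unit = {
    val file = File.createTempFile("agg-spill", ".txt")
    file.deleteOnExit()
    val out = new PrintWriter(file)
    try current.foreach { case (k, v) => out.println(s"$v\t$k") }
    finally out.close()
    spills += file
    current.clear()
  }

  def spillCount: Int = spills.size

  // Combine the partial sums held in memory with those in every spill file.
  // For brevity this rebuilds one map in memory; a real implementation would
  // merge sorted spill files as a stream so the full result never has to fit.
  def result(): Map[String, Long] = {
    val merged = mutable.HashMap.empty[String, Long]
    merged ++= current
    for (file <- spills) {
      val src = Source.fromFile(file)
      try {
        for (line <- src.getLines()) {
          val Array(v, k) = line.split("\t", 2)
          merged(k) = merged.getOrElse(k, 0L) + v.toLong
        }
      } finally src.close()
    }
    merged.toMap
  }
}

object SpillingSumMapDemo extends App {
  val agg = new SpillingSumMap(maxInMemoryKeys = 2, spillEnabled = true)
  Seq("a" -> 1L, "b" -> 2L, "a" -> 3L, "c" -> 4L).foreach { case (k, v) => agg.insert(k, v) }
  println(agg.result())
}
```

With spillEnabled = false the sketch degrades to a plain in-memory hash aggregation, which is the case that can run out of memory when there are many distinct keys.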
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/guowei2/spark sql-memory-patch
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/1822.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1822
----
commit 87627e700e3205499726aa1ab5c1ee6e56433b5e
Author: guowei <[email protected]>
Date: 2014-08-06T04:02:38Z
[SPARK-2873] use ExternalAppendOnlyMap to resolve aggregate's OOM
commit f889700ec522aab688c8c3be8bb4a9402776f35f
Author: guowei <[email protected]>
Date: 2014-08-06T04:11:43Z
[SPARK-2873] use ExternalAppendOnlyMap to resolve aggregate's OOM
commit 21b573548f742d8e6066364642bc70eece512bd5
Author: guowei <[email protected]>
Date: 2014-08-06T07:53:18Z
[SPARK-2873] use ExternalAppendOnlyMap to resolve aggregate's OOM
commit d2be8323535c106f336ebb8148acf54cd351cdae
Author: guowei <[email protected]>
Date: 2014-08-06T08:50:48Z
[SPARK-2873] use ExternalAppendOnlyMap to resolve aggregate's OOM
commit e3a88b115c608edf12806d2698f74e9289508e7d
Author: guowei <[email protected]>
Date: 2014-08-06T09:13:36Z
[SPARK-2873] use ExternalAppendOnlyMap to resolve aggregate's OOM
commit 2a4786a92e00c671bb422f2de38547dde7721a9c
Author: guowei <[email protected]>
Date: 2014-08-06T09:14:14Z
Merge branch 'sql-memory-patch' of https://github.com/guowei2/spark into
sql-memory-patch
commit 475da9d3b6304892af3d41471aceb3f81b0cc490
Author: guowei <[email protected]>
Date: 2014-08-06T09:15:39Z
[SPARK-2873] use ExternalAppendOnlyMap to resolve aggregate's OOM
----