Davies Liu created SPARK-2538:
---------------------------------
Summary: External aggregation in Python
Key: SPARK-2538
URL: https://issues.apache.org/jira/browse/SPARK-2538
Project: Spark
Issue Type: Improvement
Components: PySpark
Affects Versions: 1.0.0, 1.0.1
Reporter: Davies Liu
Fix For: 1.0.1, 1.0.0
For huge reduce tasks, user will got out of memory exception when all the data
can not fit in memory.
It should put some of the data into disks and then merge them together, just
like what we do in Scala.
--
This message was sent by Atlassian JIRA
(v6.2#6252)