Mridul Muralidharan created SPARK-6165:
------------------------------------------

             Summary: Aggregate and reduce should spool to disk and complete
                 Key: SPARK-6165
                 URL: https://issues.apache.org/jira/browse/SPARK-6165
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 1.4.0
            Reporter: Mridul Muralidharan
            Priority: Minor

To prevent data sent from workers from causing an OOM at the master, we have the property 'spark.driver.maxResultSize'.
But an OOM at the master can have two causes:
a) The data being sent from workers is too large, causing an OOM at the master.
b) A large number of moderately (to low) sized results are sent to the master and together cause an OOM (for example: 500k tasks, 1k each).

spark.driver.maxResultSize protects against both, but (b) should be handled more gracefully by the master: for example, by spooling results to disk, or by aggregating without waiting for the entire result set to be fetched.

Currently we are forced to use treeReduce and co. to work around this problem, which adds to the latency of jobs.
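
To make case (b) concrete, here is a rough sketch in plain Python. It is not Spark code and does not reflect how the driver is actually implemented. It only works through the "500k tasks, 1k each" arithmetic (assuming "1k" means 1 KiB of payload, ignoring any deserialization overhead) and contrasts buffering every result before reducing with folding each result in as it arrives. The names `task_results`, `xor_bytes`, `buffered_reduce` and `streaming_reduce` are hypothetical and exist only for this illustration.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator, Tuple

TASKS = 500_000       # "500k tasks" from the description
RESULT_BYTES = 1024   # "1k each", assumed here to mean 1 KiB of payload

Op = Callable[[bytes, bytes], bytes]


def total_result_bytes(tasks: int = TASKS, result_bytes: int = RESULT_BYTES) -> int:
    """Payload bytes held at once if every task result is buffered first."""
    return tasks * result_bytes


def task_results(tasks: int, result_bytes: int) -> Iterator[bytes]:
    """Hypothetical stand-in for task results arriving one at a time."""
    for i in range(tasks):
        yield bytes([i % 256]) * result_bytes


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Associative, commutative reduce op whose output is as large as one input."""
    return bytes(x ^ y for x, y in zip(a, b))


def buffered_reduce(results: Iterable[bytes], op: Op) -> Tuple[bytes, int]:
    """Wait for all results, then reduce. Returns (value, peak payload bytes held)."""
    buffered = list(results)
    peak = sum(len(r) for r in buffered)
    return reduce(op, buffered), peak


def streaming_reduce(results: Iterable[bytes], op: Op) -> Tuple[bytes, int]:
    """Fold each result into an accumulator as it arrives.

    Assumes at least one result. Returns (value, peak payload bytes held).
    """
    it = iter(results)
    acc = next(it)
    peak = len(acc)
    for r in it:
        # Only the accumulator and the newly arrived result are held together.
        peak = max(peak, len(acc) + len(r))
        acc = op(acc, r)
    return acc, peak


if __name__ == "__main__":
    # 500_000 * 1024 bytes = 512_000_000 bytes, roughly 488 MiB of payload.
    print(total_result_bytes(), "bytes if all results are buffered")

    # A smaller run so the demo finishes quickly.
    _, buffered_peak = buffered_reduce(task_results(1000, 1024), xor_bytes)
    _, streaming_peak = streaming_reduce(task_results(1000, 1024), xor_bytes)
    print("buffered peak:", buffered_peak, "streaming peak:", streaming_peak)
```

In this toy model the buffered peak grows linearly with the number of tasks, while the streaming peak stays at two results' worth of payload. That is the gap the issue asks the master to close without resorting to treeReduce.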