[
https://issues.apache.org/jira/browse/SPARK-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon updated SPARK-6165:
--------------------------------
Labels: bulk-closed (was: )
> Aggregate and reduce should be able to work with very large number of tasks.
> ----------------------------------------------------------------------------
>
> Key: SPARK-6165
> URL: https://issues.apache.org/jira/browse/SPARK-6165
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 1.4.0
> Reporter: Mridul Muralidharan
> Priority: Minor
> Labels: bulk-closed
>
> To prevent result data fetched from workers from causing an OOM on the driver, we have the property
> 'spark.driver.maxResultSize'.
> However, an OOM on the driver can arise for two distinct reasons:
> a) The data sent from the workers is too large, causing an OOM on the driver.
> b) A large number of moderate (to small) sized results are sent to the driver, together causing an
> OOM.
> (For example: 500k tasks, 1k each)
> spark.driver.maxResultSize protects against both, but (b) could be handled
> more gracefully by the driver: for example, by spooling results to disk, or by
> aggregating incrementally without waiting for the entire result set to be fetched.
> Currently we are forced to use treeReduce and friends to work around this
> problem, which adds to the latency of jobs.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]