[ https://issues.apache.org/jira/browse/SPARK-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matei Zaharia updated SPARK-2696:
---------------------------------
Assignee: Hossein Falaki
> Reduce default spark.serializer.objectStreamReset
> --------------------------------------------------
>
> Key: SPARK-2696
> URL: https://issues.apache.org/jira/browse/SPARK-2696
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.0.0
> Reporter: Hossein Falaki
> Assignee: Hossein Falaki
> Labels: configuration
> Fix For: 1.1.0, 1.0.3
>
>
> The current default value of spark.serializer.objectStreamReset is 10,000.
> When re-partitioning (e.g., into 64 partitions) a large file (e.g., 500 MB)
> containing 1 MB records, the serializer can cache up to
> 10,000 x 1 MB x 64 = 640 GB, which causes it to run out of memory.
> We think 100 would be a more reasonable default value for this configuration
> parameter.
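
The estimate in the description can be reproduced with a short calculation. This is only a sketch of the worst case, under the assumption that each of the 64 output partitions has its own serialization stream and that each stream keeps a reference to every record written since its last reset. The function name worst_case_cached_bytes is hypothetical and is not part of Spark.

```python
# Worst-case estimate only; it mirrors the arithmetic in the issue description.
# Assumption: one serialization stream per output partition, each holding
# references to all records written since its last reset.
MB = 10**6  # decimal megabyte, matching the 640 GB figure above
GB = 10**9


def worst_case_cached_bytes(reset_interval, record_bytes, open_streams):
    """Upper bound on bytes retained across all streams before a reset."""
    return reset_interval * record_bytes * open_streams


if __name__ == "__main__":
    # Compare the current default (10,000) with the proposed default (100).
    for reset_interval in (10_000, 100):
        total = worst_case_cached_bytes(reset_interval, 1 * MB, 64)
        print(f"objectStreamReset={reset_interval}: up to {total / GB:.1f} GB")
```

With the proposed value of 100, the same workload is bounded at roughly 6.4 GB instead of 640 GB. Until the default changes, the property can be overridden per application, for example through SparkConf.set("spark.serializer.objectStreamReset", "100").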