[ https://issues.apache.org/jira/browse/SPARK-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aaron Davidson updated SPARK-1866:
----------------------------------
Description:
Take the following example:
{code}
val x = 5
val instances = new org.apache.hadoop.fs.Path("/") /* non-serializable */
sc.parallelize(0 until 10).map { _ =>
  val instances = 3
  (instances, x)
}.collect
{code}
This produces a "java.io.NotSerializableException: org.apache.hadoop.fs.Path",
even though the outer instances is never used within the closure. If you rename
the outer variable instances to something else, the code executes correctly,
which indicates that the issue is caused by the two variables sharing a name.
Additionally, if the outer scope is not referenced at all (i.e., we do not use
"x" in the above example), the issue does not appear.
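The underlying failure mode can be sketched without Spark. The following is only an illustration under stated assumptions, not Spark's ClosureCleaner: NotSer, Outer, Task and ClosureSketch are hypothetical names, and plain Java serialization stands in for task serialization. It shows that an object referencing an outer scope fails to serialize while that scope still holds a non-serializable field, and succeeds once the unused field is nulled.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for org.apache.hadoop.fs.Path: a class that is not Serializable.
class NotSer(val path: String)

// Hypothetical stand-in for the outer scope object. `instances` is a var so that
// it can be nulled, which is what a closure cleaner is expected to do for unused fields.
class Outer(val x: Int, var instances: NotSer) extends Serializable

// Hypothetical stand-in for the closure: it only reads outer.x, and its local
// `instances` merely shares a name with the outer field.
class Task(outer: Outer) extends Serializable {
  def apply(i: Int): (Int, Int) = {
    val instances = 3
    (instances, outer.x)
  }
}

object ClosureSketch {
  // Returns true if Java serialization of `obj` succeeds.
  def serializes(obj: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(obj); true }
    catch { case _: NotSerializableException => false }
  }

  def main(args: Array[String]): Unit = {
    val outer = new Outer(5, new NotSer("/"))
    val task = new Task(outer)
    println(serializes(task)) // the outer field is still set
    outer.instances = null    // simulate nulling the unused outer field
    println(serializes(task))
  }
}
```

In this sketch the nulling is done by hand; the bug described above is that the equivalent step is skipped when the closure's local variable shares its name with the outer field.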
was:
Take the following example:
{code}
val x = 5
val instances = new org.apache.hadoop.fs.Path("/") /* non-serializable */
sc.parallelize(0 until 10).map { _ =>
  val instances = 3
  (instances, x)
}.collect
{code}
This produces a "java.io.NotSerializableException: org.apache.hadoop.fs.Path",
despite the fact that the outer instances is not actually used within the
closure. If you change the name of the outer variable instances to something
else, the code executes correctly, indicating that it is the fact that the two
variables share a name that causes the issue.
> Closure cleaner does not null shadowed fields when outer scope is referenced
> ----------------------------------------------------------------------------
>
> Key: SPARK-1866
> URL: https://issues.apache.org/jira/browse/SPARK-1866
> Project: Spark
> Issue Type: Bug
> Affects Versions: 1.0.0
> Reporter: Aaron Davidson
> Priority: Critical
> Fix For: 1.1.0, 1.0.1
>
>
> Take the following example:
> {code}
> val x = 5
> val instances = new org.apache.hadoop.fs.Path("/") /* non-serializable */
> sc.parallelize(0 until 10).map { _ =>
>   val instances = 3
>   (instances, x)
> }.collect
> {code}
> This produces a "java.io.NotSerializableException:
> org.apache.hadoop.fs.Path", even though the outer instances is never used
> within the closure. If you rename the outer variable instances to something
> else, the code executes correctly, which indicates that the issue is caused
> by the two variables sharing a name.
> Additionally, if the outer scope is not referenced at all (i.e., we do not
> use "x" in the above example), the issue does not appear.