[
https://issues.apache.org/jira/browse/SPARK-16787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Josh Rosen updated SPARK-16787:
-------------------------------
Target Version/s: 2.0.1 (was: 1.6.3, 2.0.1)
> SparkContext.addFile() should not fail if called twice with the same file
> -------------------------------------------------------------------------
>
> Key: SPARK-16787
> URL: https://issues.apache.org/jira/browse/SPARK-16787
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.6.2, 2.0.0
> Reporter: Josh Rosen
> Assignee: Josh Rosen
>
> The behavior of SparkContext.addFile() changed slightly with the introduction
> of the Netty-RPC-based file server, which was introduced in Spark 1.6 (where
> it was disabled by default) and became the default / only file server in
> Spark 2.0.0.
> Prior to 2.0, calling SparkContext.addFile() twice with the same path would
> succeed and would cause future tasks to receive an updated copy of the file.
> This behavior was never explicitly documented but Spark has behaved this way
> since very early 1.x versions (some of the relevant lines in
> Executor.updateDependencies() have existed since 2012).
> In 2.0 (or 1.6 with the Netty file server enabled), the second addFile() call
> will fail with a requirement error because NettyStreamManager tries to guard
> against duplicate file registration.
> I believe that this change of behavior was unintentional and propose to
> remove the {{require}} check so that Spark 2.0 matches 1.x's default behavior.
> This problem also affects addJar() in a more subtle way: the
> fileServer.addJar() call also fails with an exception, but that exception
> is logged and ignored due to code added in 2014 to ignore errors caused by
> missing Spark examples JARs when running in YARN cluster mode (AFAIK).
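The duplicate-registration guard described above can be sketched as follows. This is a minimal Python sketch, not Spark's actual Scala code; the class and parameter names are illustrative. `strict=True` models the pre-fix require check in NettyStreamManager, and `strict=False` models the proposed fix, where re-registering a file simply updates the entry so addFile() is idempotent:

```python
class FileRegistry:
    """Minimal stand-in for a file server's registration table."""

    def __init__(self, strict: bool = False):
        self.files = {}       # file name -> file contents (stand-in for a path)
        self.strict = strict  # strict=True models the pre-fix require check

    def add_file(self, name: str, contents: str) -> None:
        if self.strict and name in self.files:
            # Pre-fix behavior: a second addFile() with the same name fails.
            raise ValueError(f"File {name} was already registered")
        # Post-fix behavior: re-registration succeeds and updates the entry,
        # so future tasks receive the updated copy of the file (the 1.x default).
        self.files[name] = contents


# Pre-fix: duplicate registration raises.
strict_registry = FileRegistry(strict=True)
strict_registry.add_file("data.txt", "v1")
try:
    strict_registry.add_file("data.txt", "v2")
except ValueError as e:
    print(e)

# Post-fix: duplicate registration updates the file, matching 1.x behavior.
registry = FileRegistry()
registry.add_file("data.txt", "v1")
registry.add_file("data.txt", "v2")
print(registry.files["data.txt"])  # v2
```

Removing the require check corresponds to dropping the `strict` branch entirely, which is the change proposed here.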
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)