Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/21771#discussion_r202536141
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1555,6 +1559,9 @@ class SparkContext(config: SparkConf) extends Logging {
       Utils.fetchFile(uri.toString, new File(SparkFiles.getRootDirectory()), conf,
         env.securityManager, hadoopConfiguration, timestamp, useCache = false)
       postEnvironmentUpdate()
+    } else {
+      logWarning(s"The path $path has been added already. Overwriting of added paths " +
--- End diff --
@HyukjinKwon Our support receives a few "bug" reports per month about this. For now
we can at least provide a link to the note. The warning itself is needed by our
support engineers to detect this kind of problem from the logs of already finished
jobs. Customers don't actually say in their bug reports that files/jars weren't
overwritten (that would be easier). They report problems such as a call into a
library crashing due to an incompatible method signature, or a class that doesn't
exist. Or the final result of a Spark job is incorrect because a config/resource
file added via `addFile()` is not up to date. With this warning I can detect the
situation from the logs and provide a link to the docs for
`addFile()`/`addJar()`.
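
To make the scenario concrete, here is a minimal sketch (hypothetical paths, app
name, and driver program, not taken from the PR) of how re-adding the same path
via `addFile()` keeps the first copy, which is exactly what the new warning is
meant to surface in the logs:

```scala
// Hypothetical reproduction of the reported problem: re-adding a file with the
// same path does not overwrite the copy that was already distributed.
import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

object AddFileTwice {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("addFile-demo").setMaster("local[*]"))

    // The first version of the config file is distributed to the executors.
    sc.addFile("/tmp/app.conf")

    // ... later the file at /tmp/app.conf is updated on the driver ...

    // Re-adding the same path is effectively a no-op: tasks keep seeing the
    // first version, and the logWarning in this PR flags the second call.
    sc.addFile("/tmp/app.conf")

    // Tasks still read the originally added contents, not the updated file.
    sc.parallelize(Seq(1), 1).foreach { _ =>
      val contents = scala.io.Source.fromFile(SparkFiles.get("app.conf")).mkString
      println(contents)
    }

    sc.stop()
  }
}
```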
---