Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21771#discussion_r202536141
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1555,6 +1559,9 @@ class SparkContext(config: SparkConf) extends Logging {
          Utils.fetchFile(uri.toString, new File(SparkFiles.getRootDirectory()), conf,
            env.securityManager, hadoopConfiguration, timestamp, useCache = false)
          postEnvironmentUpdate()
    +    } else {
    +      logWarning(s"The path $path has been added already. Overwriting of added paths " +
    --- End diff ---
    
    @HyukjinKwon Our support receives a few "bug" reports per month about this. For now we can at least provide a link to the note. The warning itself is needed by our support engineers to detect this kind of problem from the logs of already finished jobs. Customers actually do not say in their bug reports that files/jars weren't overwritten (that would be easier). They report problems like a call into a library crashing because of an incompatible method signature, or a class that doesn't exist. Or the final result of a Spark job is incorrect because a config/resource file added via `addFile()` is not up to date. Now I can detect the situation from the logs and provide a link to the docs for `addFile()`/`addJar()`.
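    
    To make the failure mode concrete, here is a minimal, hypothetical sketch (local mode, made-up path `/tmp/app.conf`): the same path is re-added after its contents changed, the second `addFile()` call is a no-op, and tasks keep reading the stale copy. With this change applied, the warning from the diff above would appear in the driver log at that point.
    
    ```scala
    import java.nio.file.{Files, Paths}
    import org.apache.spark.{SparkConf, SparkContext, SparkFiles}
    
    val sc = new SparkContext(
      new SparkConf().setAppName("addFile-demo").setMaster("local[*]"))
    
    // Hypothetical config file used only for illustration.
    val conf = Paths.get("/tmp/app.conf")
    Files.write(conf, "mode=v1".getBytes)
    sc.addFile(conf.toString)                 // v1 is fetched and distributed
    
    Files.write(conf, "mode=v2".getBytes)     // file is updated in place
    sc.addFile(conf.toString)                 // no-op: the path was already added
                                              // (with this PR, the warning is logged here)
    
    // Tasks and the driver still see the copy fetched at the first addFile():
    println(Files.readAllLines(Paths.get(SparkFiles.get("app.conf"))))  // [mode=v1]
    
    sc.stop()
    ```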


---
