Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r195655196
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1517,9 +1517,12 @@ class SparkContext(config: SparkConf) extends 
Logging {
        * only supported for Hadoop-supported filesystems.
        */
       def addFile(path: String, recursive: Boolean): Unit = {
    -    val uri = new Path(path).toUri
    +    var uri = new Path(path).toUri
         val schemeCorrectedPath = uri.getScheme match {
    -      case null | "local" => new File(path).getCanonicalFile.toURI.toString
    +      case null | "local" =>
    +        // SPARK-24195: Local is not a valid scheme for FileSystem, we 
should only keep path here.
    +        uri = new Path(uri.getPath).toUri
    --- End diff --
    
    @HyukjinKwon @jiangxb1987 
    Thanks for the explanation, I think I understand what you mean by `we 
getPath doesn't include scheme`. The purpose of this code, `uri = new 
Path(uri.getPath).toUri`, is to reassign the var declared at +1520; we don't 
want the uri to keep the `local` scheme.
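    To illustrate the intent, here is a minimal sketch of the scheme-correction logic, with plain `java.net.URI` standing in for Hadoop's `Path#toUri` (an assumption for a self-contained example; `schemeCorrectedPath` here is a hypothetical standalone helper, not the real `addFile` code):
    ```scala
    import java.io.File
    import java.net.URI

    // Sketch: mirror the patched match in addFile. For null or "local"
    // schemes, re-parse only the path component so the resulting URI
    // carries no scheme, then canonicalize it as a local file.
    def schemeCorrectedPath(path: String): String = {
      var uri = new URI(path)
      uri.getScheme match {
        case null | "local" =>
          // SPARK-24195: "local" is not a valid FileSystem scheme,
          // so keep only the path here.
          uri = new URI(uri.getPath)
          new File(uri.getPath).getCanonicalFile.toURI.toString
        case _ => path
      }
    }
    ```
    With this reassignment the downstream code only ever sees a `file:` URI or a real filesystem scheme such as `hdfs:`.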
    ```
    Can't we just do new File(uri.getPath).getCanonicalFile.toURI.toString 
without this line?
    ```
    We can't, because as I explained above, if we don't do `uri = new 
Path(uri.getPath).toUri`, we will get an exception like the one below:
    ```
    No FileSystem for scheme: local
    java.io.IOException: No FileSystem for scheme: local
        at 
org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2586)
        at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2593)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
        at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2632)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2614)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
        at org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1830)
        at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:690)
        at org.apache.spark.util.Utils$.fetchFile(Utils.scala:486)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1557)
    ```
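    The scheme-dropping behavior itself can be demonstrated without Hadoop; below is a small illustration using `java.net.URI` as a stand-in for `Path#toUri` (the jar path is made up for the example):
    ```scala
    import java.net.URI

    // Before the fix: the URI keeps the "local" scheme, which Hadoop's
    // FileSystem.getFileSystemClass cannot resolve, producing the
    // "No FileSystem for scheme: local" IOException in the trace above.
    val before = new URI("local:/opt/libs/dep.jar")

    // After the fix: re-parsing only the path component drops the scheme,
    // so the code falls through to the local-file handling instead.
    val after = new URI(before.getPath)

    println(before.getScheme)  // "local"
    println(after.getScheme)   // null
    ```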

