Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21533#discussion_r197603726
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1517,9 +1517,19 @@ class SparkContext(config: SparkConf) extends
Logging {
* only supported for Hadoop-supported filesystems.
*/
def addFile(path: String, recursive: Boolean): Unit = {
- val uri = new Path(path).toUri
+ var uri = new Path(path).toUri
+ // Mark whether the original path's scheme is "local"; if so, there is no need to add
+ // the file to the file server.
+ var localFile = false
val schemeCorrectedPath = uri.getScheme match {
- case null | "local" => new File(path).getCanonicalFile.toURI.toString
+ case null =>
+ new File(path).getCanonicalFile.toURI.toString
+ case "local" =>
+ localFile = true
+ val tmpPath = new File(uri.getPath).getCanonicalFile.toURI.toString
+ // SPARK-24195: "local" is not a valid scheme for FileSystem, we should only keep the path here.
+ uri = new Path(uri.getPath).toUri
+ tmpPath
--- End diff --
`This makes me think whether supporting the "local" scheme in addFile is meaningful or
not. A file with the "local" scheme already exists on every node, and the user should be
aware of that, so adding it seems not meaningful.`
Yeah, agree with you. The latest change intended to handle a "local" file without adding
it to the file server and to correct its scheme to "file:", but since adding a local file
is effectively a no-op, should we instead just forbid users from passing a file with the
"local" scheme to addFile?
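A minimal sketch of the fail-fast alternative discussed above, using plain `java.net.URI` instead of Hadoop's `Path` for self-containment; the object name `AddFileSchemeCheck` and the exact exception type are hypothetical, not part of the actual PR:

```scala
import java.io.File
import java.net.URI

object AddFileSchemeCheck {
  // Mirror the scheme matching in addFile, but reject "local" up front:
  // files with the "local" scheme already exist on every node, so adding
  // them to the file server would be a no-op anyway.
  def schemeCorrectedPath(path: String): String = {
    val uri = new URI(path)
    uri.getScheme match {
      case "local" =>
        throw new IllegalArgumentException(
          s"addFile does not support the 'local' scheme: $path")
      case null =>
        // No scheme: resolve to a canonical "file:" URI, as addFile does today.
        new File(path).getCanonicalFile.toURI.toString
      case _ =>
        path
    }
  }
}
```

With this shape, a plain path like `/tmp/data.txt` still resolves to a `file:` URI, while `local:/tmp/data.txt` fails immediately with a clear message instead of being silently treated as a no-op.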
---