[
https://issues.apache.org/jira/browse/SPARK-22587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Saisai Shao resolved SPARK-22587.
---------------------------------
Resolution: Fixed
> Spark job fails if fs.defaultFS and application jar are different url
> ---------------------------------------------------------------------
>
> Key: SPARK-22587
> URL: https://issues.apache.org/jira/browse/SPARK-22587
> Project: Spark
> Issue Type: Bug
> Components: Spark Submit
> Affects Versions: 1.6.3
> Reporter: Prabhu Joseph
> Assignee: Mingjie Tang
>
> Spark job fails if fs.defaultFS and the URL where the application jar resides
> are different but share the same scheme:
> {code}
> spark-submit --conf spark.master=yarn-cluster wasb://XXX/tmp/test.py
> {code}
> In core-site.xml, fs.defaultFS is set to wasb://YYY. Hadoop listing (hadoop
> fs -ls) works for both URLs XXX and YYY.
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: wasb://XXX/tmp/test.py, expected: wasb://YYY
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:665)
> at org.apache.hadoop.fs.azure.NativeAzureFileSystem.checkPath(NativeAzureFileSystem.java:1251)
> at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:485)
> at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:396)
> at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:507)
> at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:660)
> at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:912)
> at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:172)
> at org.apache.spark.deploy.yarn.Client.run(Client.scala:1248)
> at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1307)
> at org.apache.spark.deploy.yarn.Client.main(Client.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:751)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {code}
> Client.copyFileToRemote resolves the path of the application jar (XXX) from
> the FileSystem object created with the fs.defaultFS URL (YYY) instead of the
> jar's actual URL:
> {code}
> val destFs = destDir.getFileSystem(hadoopConf)
> val srcFs = srcPath.getFileSystem(hadoopConf)
> {code}
> getFileSystem creates each filesystem from the URL of its own path, so this
> part is fine. But the lines below qualify srcPath (XXX URL) against destFs
> (YYY URL), which fails:
> {code}
> var destPath = srcPath
> val qualifiedDestPath = destFs.makeQualified(destPath)
> {code}
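The failing qualification can be sketched without a Hadoop dependency. The Fs case class and its makeQualified below are illustrative stand-ins (not Spark's or Hadoop's actual code) that mirror FileSystem.checkPath rejecting a path whose scheme/authority differ from the filesystem's own:

```scala
import java.net.URI

// Illustrative stand-in for Hadoop's FileSystem: a filesystem is identified by
// its scheme and authority, and makeQualified rejects any path that belongs to
// a different filesystem, mirroring FileSystem.checkPath.
case class Fs(scheme: String, authority: String) {
  def makeQualified(path: String): String = {
    val uri = new URI(path)
    require(uri.getScheme == scheme && uri.getAuthority == authority,
      s"Wrong FS: $path, expected: $scheme://$authority")
    path
  }
}

object Demo extends App {
  val destFs  = Fs("wasb", "YYY")  // built from fs.defaultFS
  val srcFs   = Fs("wasb", "XXX")  // built from the jar's own URL
  val srcPath = "wasb://XXX/tmp/test.py"

  // Buggy behavior: qualifying srcPath against destFs throws IllegalArgumentException,
  // matching the "Wrong FS" stack trace above.
  assert(scala.util.Try(destFs.makeQualified(srcPath)).isFailure)

  // Correct behavior: qualify srcPath against the filesystem derived from its own URL.
  println(srcFs.makeQualified(srcPath))
}
```

The fix direction is the same in the real code: the source path should be qualified by srcFs (the filesystem built from the path's own URL), not by destFs.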
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)