Josh Rosen created SPARK-4434:
---------------------------------
Summary: spark-submit cluster deploy mode JAR URLs are broken in 1.1.1
Key: SPARK-4434
URL: https://issues.apache.org/jira/browse/SPARK-4434
Project: Spark
Issue Type: Bug
Components: Deploy, Spark Core
Affects Versions: 1.1.1, 1.2.0
Reporter: Josh Rosen
Priority: Blocker

When submitting a driver using {{spark-submit}} in cluster mode, Spark 1.1.0 allowed you to omit the {{file://}} or {{hdfs://}} prefix from the application JAR URL, e.g.
{code}
./bin/spark-submit --deploy-mode cluster \
  --master spark://joshs-mbp.att.net:7077 \
  --class org.apache.spark.examples.SparkPi \
  /Users/joshrosen/Documents/old-spark-releases/spark-1.1.0-bin-hadoop1/lib/spark-examples-1.1.0-hadoop1.0.4.jar
{code}
In Spark 1.1.1 and 1.2.0, this same command now fails with an error:
{code}
./bin/spark-submit --deploy-mode cluster \
  --master spark://joshs-mbp.att.net:7077 \
  --class org.apache.spark.examples.SparkPi \
  /Users/joshrosen/Documents/Spark/examples/target/scala-2.10/spark-examples_2.10-1.1.2-SNAPSHOT.jar
Jar url 'file:/Users/joshrosen/Documents/Spark/examples/target/scala-2.10/spark-examples_2.10-1.1.2-SNAPSHOT.jar' is not in valid format.
Must be a jar file path in URL format (e.g. hdfs://XX.jar, file://XX.jar)
Usage: DriverClient [options] launch <active-master> <jar-url> <main-class> [driver options]
Usage: DriverClient kill <active-master> <driver-id>
{code}
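For what it's worth, the {{file:/...}} form being rejected here is the same single-slash form the JDK itself produces when a bare local path is converted to a URI, so the resolution of the bare path looks reasonable and the rejection appears to come from the URL validation itself. A minimal sketch (with a placeholder path, not my actual jar):
{code}
import java.io.File

object BarePathToUri {
  def main(args: Array[String]): Unit = {
    // Placeholder path standing in for the examples jar.
    val jar = new File("/tmp/spark-examples.jar")
    // File#toURI produces the single-slash form: file:/tmp/spark-examples.jar
    println(jar.toURI)
  }
}
{code}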
I tried changing the URL to conform to the new format, but this either produced an error or submitted a driver that then failed:
{code}
./bin/spark-submit --deploy-mode cluster \
  --master spark://joshs-mbp.att.net:7077 \
  --class org.apache.spark.examples.SparkPi \
  file:///Users/joshrosen/Documents/Spark/examples/target/scala-2.10/spark-examples_2.10-1.1.2-SNAPSHOT.jar
Jar url 'file:///Users/joshrosen/Documents/Spark/examples/target/scala-2.10/spark-examples_2.10-1.1.2-SNAPSHOT.jar' is not in valid format.
Must be a jar file path in URL format (e.g. hdfs://XX.jar, file://XX.jar)
{code}
If I omit the extra slash:
{code}
./bin/spark-submit --deploy-mode cluster \
  --master spark://joshs-mbp.att.net:7077 \
  --class org.apache.spark.examples.SparkPi \
  file://Users/joshrosen/Documents/Spark/examples/target/scala-2.10/spark-examples_2.10-1.1.2-SNAPSHOT.jar
Sending launch command to spark://joshs-mbp.att.net:7077
Driver successfully submitted as driver-20141116143235-0002
... waiting before polling master for driver state
... polling master for driver state
State of driver-20141116143235-0002 is ERROR
Exception from cluster was: java.lang.IllegalArgumentException: Wrong FS: file://Users/joshrosen/Documents/Spark/examples/target/scala-2.10/spark-examples_2.10-1.1.2-SNAPSHOT.jar, expected: file:///
java.lang.IllegalArgumentException: Wrong FS: file://Users/joshrosen/Documents/Spark/examples/target/scala-2.10/spark-examples_2.10-1.1.2-SNAPSHOT.jar, expected: file:///
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:381)
    at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:55)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:393)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:329)
    at org.apache.spark.deploy.worker.DriverRunner.org$apache$spark$deploy$worker$DriverRunner$$downloadUserJar(DriverRunner.scala:157)
    at org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:74)
{code}
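The "Wrong FS ... expected: file:///" failure is consistent with how {{java.net.URI}} parses these strings: with only two slashes, the first path component ({{Users}}) is treated as the URI authority, and Hadoop's local filesystem rejects any non-empty authority. A small sketch illustrating the parsing (this is plain {{java.net.URI}} behaviour, not the DriverClient code; the paths are placeholders):
{code}
import java.net.URI

object JarUrlForms {
  def main(args: Array[String]): Unit = {
    val forms = Seq(
      "file:/tmp/spark-examples.jar",    // what spark-submit resolves a bare path to
      "file:///tmp/spark-examples.jar",  // explicit empty authority
      "file://tmp/spark-examples.jar"    // "tmp" becomes the authority, not part of the path
    )
    for (s <- forms) {
      val uri = new URI(s)
      println(s"$s -> scheme=${uri.getScheme}, authority=${uri.getAuthority}, path=${uri.getPath}")
    }
  }
}
{code}
The first two forms parse with a null authority and the full local path; only the last one carries an authority, which is exactly the form the cluster accepts at submission time and then fails on.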
This bug effectively prevents users from using {{spark-submit}} in cluster mode
to run drivers whose JARs are stored on shared cluster filesystems.
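Illustrative only (I have not traced the actual ClientArguments validation), but a check along these lines would accept all of the spellings above, a bare local path, and an {{hdfs://}} URL alike:
{code}
import java.net.{URI, URISyntaxException}

object JarUrlCheck {
  // Hypothetical, permissive check: any parseable URI (or bare path) whose
  // path component ends in ".jar" is accepted, regardless of authority.
  def looksLikeJarUrl(s: String): Boolean = {
    try {
      Option(new URI(s).getPath).exists(_.endsWith(".jar"))
    } catch {
      case _: URISyntaxException => false
    }
  }

  def main(args: Array[String]): Unit = {
    Seq(
      "/tmp/spark-examples.jar",
      "file:/tmp/spark-examples.jar",
      "file:///tmp/spark-examples.jar",
      "hdfs://namenode:8020/jars/spark-examples.jar"
    ).foreach(s => println(s"$s -> ${looksLikeJarUrl(s)}"))
  }
}
{code}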