Dong Lei created SPARK-8369:
-------------------------------
Summary: Support dependency jar and files on HDFS in standalone
cluster mode
Key: SPARK-8369
URL: https://issues.apache.org/jira/browse/SPARK-8369
Project: Spark
Issue Type: New Feature
Components: Spark Core
Reporter: Dong Lei
Currently, in standalone cluster mode, Spark can take care of the app jar
whether it is specified with a file:// or hdfs:// prefix. But the dependencies
specified by --jars and --files do not support an hdfs:// prefix.
For example:
spark-submit \
  ... \
  --jars hdfs://path1/1.jar,hdfs://path2/2.jar \
  --files hdfs://path3/3.file,hdfs://path4/4.file \
  hdfs://path5/app.jar
only app.jar will be downloaded to the driver and distributed to the executors;
the others (1.jar, 2.jar, 3.file, 4.file) will not.
I think such a feature is useful for users.
----------------------------
To support such a feature, I think we can treat the jars and files the same way
as the app jar in DriverRunner: download them and replace the remote addresses
with local addresses, so the DriverWrapper does not need to be aware of the
change.
The problem is that replacing these addresses is harder than replacing the
location of the app jar, because the driver command only has a placeholder for
the app jar ("<<USER_JAR>>"). We may need to do some string matching on the
arguments to achieve the same for --jars and --files.
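
To make the idea concrete, below is a rough sketch (in Scala) of how the
downloading and address rewriting could look. This is not Spark's actual
DriverRunner code; DependencyLocalizer, downloadToLocal and localizeArgs are
hypothetical names used only for illustration, and only the hdfs:// case is
handled.

import java.io.File
import java.net.URI

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object DependencyLocalizer {

  // Copy a remote (e.g. hdfs://) file into the driver's working directory
  // and return its local path. Local paths are returned unchanged.
  def downloadToLocal(uri: String, workDir: File, hadoopConf: Configuration): String = {
    val parsed = new URI(uri)
    if (parsed.getScheme == null || parsed.getScheme == "file") {
      uri
    } else {
      val src = new Path(uri)
      val dest = new Path(new File(workDir, src.getName).getAbsolutePath)
      val fs = src.getFileSystem(hadoopConf)
      fs.copyToLocalFile(false, src, dest)
      dest.toString
    }
  }

  // Rewrite the --jars / --files values before the driver command is built,
  // so that the DriverWrapper only ever sees local paths.
  def localizeArgs(args: Seq[String], workDir: File, hadoopConf: Configuration): Seq[String] = {
    args.map {
      case arg if arg.startsWith("hdfs://") =>
        downloadToLocal(arg, workDir, hadoopConf)
      case arg => arg
    }
  }
}

In this sketch the string matching mentioned above is reduced to scanning the
driver arguments for hdfs:// prefixes instead of relying on a dedicated
placeholder like "<<USER_JAR>>".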