[
https://issues.apache.org/jira/browse/SPARK-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15028437#comment-15028437
]
tawan commented on SPARK-8369:
------------------------------
I am working on it now and will post a PR for it.
> Support dependency jar and files on HDFS in standalone cluster mode
> -------------------------------------------------------------------
>
> Key: SPARK-8369
> URL: https://issues.apache.org/jira/browse/SPARK-8369
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Reporter: Dong Lei
>
> Currently, in standalone cluster mode, Spark can take care of the app jar
> whether it is specified with file:// or hdfs://. But the dependencies
> specified by --jars and --files do not support an hdfs:// prefix.
> For example:
> spark-submit \
>   ... \
>   --jars hdfs://path1/1.jar,hdfs://path2/2.jar \
>   --files hdfs://path3/3.file,hdfs://path4/4.file \
>   hdfs://path5/app.jar
> only app.jar will be downloaded to the driver and distributed to the
> executors; the others (1.jar, 2.jar, 3.file, 4.file) will not be.
> I think such a feature would be useful for users.
> ----------------------------
> To support such a feature, I think we can treat the jars and files like the
> app jar in DriverRunner: we download them and replace the remote addresses
> with local addresses, so the DriverWrapper will not be aware of the change.
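> A minimal sketch of the download step, assuming Hadoop's FileSystem API
> (the helper name downloadToLocal is hypothetical, not an existing
> DriverRunner method):
>
> import java.io.File
> import java.net.URI
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
>
> // Hypothetical helper: fetch each remote URI into the driver's work
> // directory and return a map from remote address to local address.
> def downloadToLocal(remoteUris: Seq[String], workDir: File,
>                     hadoopConf: Configuration): Map[String, String] = {
>   remoteUris.map { remote =>
>     val uri = new URI(remote)
>     val fs = FileSystem.get(uri, hadoopConf)
>     val localFile = new File(workDir, new Path(uri).getName)
>     // copyToLocalFile copies the HDFS file onto the driver's local disk
>     fs.copyToLocalFile(new Path(uri), new Path(localFile.getAbsolutePath))
>     remote -> localFile.toURI.toString
>   }.toMap
> }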
> The problem is that replacing these addresses is harder than replacing the
> location of the app jar, because the launch command only has a placeholder
> for the app jar ("<<USER_JAR>>"). We may need to do some string matching to
> achieve it, as in the sketch below.
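> Since --jars/--files have no placeholder, a plain string-match rewrite
> over the driver command could work; a sketch (substituteAddresses is
> hypothetical):
>
> // Rewrite every occurrence of a remote address in the driver's launch
> // command with the local copy produced by the download step above.
> def substituteAddresses(command: Seq[String],
>                         localized: Map[String, String]): Seq[String] =
>   command.map { arg =>
>     localized.foldLeft(arg) { case (acc, (remote, local)) =>
>       acc.replace(remote, local)
>     }
>   }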