[ https://issues.apache.org/jira/browse/SPARK-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15648018#comment-15648018 ]
Ian Hummel commented on SPARK-17568:
------------------------------------
This would be a very helpful feature for us: we're behind a firewall but still
want ad hoc Spark sessions to be able to pull third-party dependencies from
our Artifactory. We tested the changes in the PR and everything looks good, so
we would love to see this merged.
> Add spark-submit option for user to override ivy settings used to resolve
> packages/artifacts
> --------------------------------------------------------------------------------------------
>
> Key: SPARK-17568
> URL: https://issues.apache.org/jira/browse/SPARK-17568
> Project: Spark
> Issue Type: Improvement
> Components: Deploy, Spark Core
> Reporter: Bryan Cutler
>
> The {{--packages}} option to {{spark-submit}} uses Ivy to map Maven
> coordinates to package jars. Currently, the IvySettings are hard-coded with
> Maven Central as the last repository in the chain of resolvers.
>
> At IBM, we have heard from several enterprise clients who are frustrated
> with the lack of control over their local Spark installations. These clients
> want to ensure that certain artifacts can be excluded or patched due to
> security or license issues. For example, a package may use a vulnerable SSL
> protocol, or a package may link against an AGPL library written by a
> litigious competitor.
>
> While additional repositories and exclusions can be added on the spark-submit
> command line, this falls short of what is needed. With Maven Central always
> present as a fall-back repository, it is difficult to ensure that only
> approved artifacts are used, and it is often the exclusions that site admins
> are not aware of that cause problems. Also, known exclusions are better
> handled through a centrally managed repository than as command-line
> arguments.
>
> To resolve these issues, we propose the following change: allow the user to
> specify an Ivy settings XML file as an optional argument to
> {{spark-submit}} (or in a config file) to define alternate repositories used
> to resolve artifacts instead of the hard-coded defaults. The use case for
> this would be to define a managed repository (such as Nexus) in the settings
> file so that all requests for artifacts go through one location only.
>
> Example usage:
> {noformat}
> $SPARK_HOME/bin/spark-submit --conf spark.ivy.settings=/path/to/ivysettings.xml myapp.jar
> {noformat}
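
For illustration, a settings file of the kind the description proposes might
look roughly like the sketch below. It is not taken from the issue or the PR:
the resolver name "internal" and the repository URL are hypothetical
placeholders, and how Spark treats the file depends on the final
implementation. The idea is that a single Maven-layout resolver is declared
and made the default, so no chain falls back to Maven Central.

```xml
<ivysettings>
  <!-- Make the one resolver below the default for all modules. -->
  <settings defaultResolver="internal"/>
  <resolvers>
    <!-- "internal" and the root URL are placeholders for a site-managed
         repository such as Nexus or Artifactory. m2compatible="true"
         tells Ivy to use the Maven 2 directory layout. -->
    <ibiblio name="internal"
             m2compatible="true"
             root="https://nexus.example.com/repository/maven-public/"/>
  </resolvers>
</ivysettings>
```

Saved as /path/to/ivysettings.xml, a file like this would be the one passed
through the spark.ivy.settings property in the example usage above.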