[ 
https://issues.apache.org/jira/browse/SPARK-24646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523022#comment-16523022
 ] 

Saisai Shao commented on SPARK-24646:
-------------------------------------

Hi [~vanzin], here is a specific example:

We have a custom {{ServiceCredentialProvider}}, the HS2 
{{ServiceCredentialProvider}}, which lives in our own jar {{foo}}. So when the 
SparkSubmit process launches the YARN client, this {{foo}} jar must be on the 
SparkSubmit process classpath. 

If this {{foo}} jar is a local resource added with {{--jars}}, it is on the 
classpath. If it is a remote jar (for example, on HDFS), only yarn-client mode 
downloads it and adds it to the classpath; in yarn-cluster mode it is not 
downloaded, so the HS2 {{ServiceCredentialProvider}} is not loaded.

With the spark-submit script, the user can choose whether to add a remote jar 
or a local jar. Livy, however, supports only remote jars (jars on HDFS), and 
we configure it to support only yarn-cluster mode. So in this scenario the 
custom {{ServiceCredentialProvider}} cannot be loaded.

So the fix is to force the jars to be downloaded to the local SparkSubmit 
process using the configuration "spark.yarn.dist.forceDownloadSchemes". To make 
this easier to use, I propose adding wildcard '*' support, which downloads all 
remote resources without checking the scheme.
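
As a rough illustration of the proposed behavior, the scheme check could look like the sketch below. This is not Spark's actual code: the function names {{parse_schemes}} and {{should_force_download}} are hypothetical, and the sketch assumes the configuration value is a comma-separated list of schemes and that resources with no scheme, or with the {{file}} or {{local}} scheme, are already local.

```python
from urllib.parse import urlparse


def parse_schemes(conf_value):
    """Split a comma-separated config value such as "http,https" or "*"."""
    return [s.strip() for s in conf_value.split(",") if s.strip()]


def should_force_download(uri, force_schemes):
    """Decide whether a resource is downloaded to the local submit process.

    Hypothetical helper: with the proposed wildcard, "*" matches every
    remote scheme, so no per-scheme check is needed.
    """
    scheme = urlparse(uri).scheme or "file"
    if scheme in ("file", "local"):
        # Assumed to be available locally already; nothing to download.
        return False
    return "*" in force_schemes or scheme in force_schemes


if __name__ == "__main__":
    jar = "hdfs://namenode:8020/libs/foo.jar"  # made-up location of the foo jar
    print(should_force_download(jar, parse_schemes("http,https")))
    print(should_force_download(jar, parse_schemes("*")))
```

With a value such as "http,https", the HDFS-hosted {{foo}} jar is not selected for download, whereas '*' selects it regardless of scheme, which is the case Livy needs.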

> Support wildcard '*' for spark.yarn.dist.forceDownloadSchemes
> -------------------------------------------------------------
>
>                 Key: SPARK-24646
>                 URL: https://issues.apache.org/jira/browse/SPARK-24646
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.3.0
>            Reporter: Saisai Shao
>            Priority: Minor
>
> When tokens are obtained via a customized {{ServiceCredentialProvider}}, 
> the {{ServiceCredentialProvider}} must be available on the local 
> spark-submit process classpath. In this case, all the configured remote 
> resources should be forced to download locally.
> To make this configuration easier to use, this issue proposes adding 
> wildcard '*' support to {{spark.yarn.dist.forceDownloadSchemes}} and 
> clarifying the usage of this configuration.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

