[jira] [Updated] (SPARK-32775) [k8s] Spark client dependency support ignores non-local paths

2020-09-01 Thread Xuzhou Yin (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuzhou Yin updated SPARK-32775:
---
Description: 
According to the logic of this line: 
[https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L161], 
Spark filters out all paths which are not local (i.e. no scheme or a 
[file://|file:///] scheme). This may cause non-local dependencies not to be 
loaded by the driver.

For example, when starting a Spark job with 
spark.jars=*local*:///local/path/1.jar,*s3*://s3/path/2.jar,*file*:///local/path/3.jar, 
it seems that this logic uploads *file*:///local/path/3.jar to S3 and resets 
spark.jars to only s3://transformed/path/3.jar, while completely ignoring 
local:///local/path/1.jar and s3://s3/path/2.jar.

We need to fix this logic so that Spark uploads local files to S3 and 
transforms their paths, while keeping all other paths as they are.
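
To illustrate, here is a minimal Scala sketch of the behavior described above 
and of the proposed fix. It is only an illustration under the assumptions in 
this report: the names ClientDependencySketch, isClientLocalFile, 
uploadToRemoteStorage, resolveJarsCurrent and resolveJarsProposed are 
hypothetical and are not Spark's actual API; the real logic lives in 
BasicDriverFeatureStep (linked above).

{code:scala}
import java.net.URI

// Hypothetical sketch only; none of these names exist in Spark itself.
object ClientDependencySketch {

  // A URI counts as a client-local file when it has no scheme or a file:// scheme.
  def isClientLocalFile(uri: String): Boolean =
    Option(new URI(uri).getScheme).forall(_ == "file")

  // Placeholder for the client-side upload to the configured staging location
  // (e.g. an s3:// bucket); returns the destination URI.
  def uploadToRemoteStorage(uri: String): String = ???

  // Behavior reported in this issue: only client-local jars survive the rewrite,
  // so local:// and s3:// entries silently drop out of spark.jars.
  def resolveJarsCurrent(jars: Seq[String]): Seq[String] =
    jars.filter(isClientLocalFile).map(uploadToRemoteStorage)

  // Proposed behavior: upload client-local jars and rewrite their paths, but
  // pass every other URI through unchanged.
  def resolveJarsProposed(jars: Seq[String]): Seq[String] =
    jars.map(uri => if (isClientLocalFile(uri)) uploadToRemoteStorage(uri) else uri)
}
{code}

With the spark.jars value from the example above, resolveJarsProposed would keep 
local:///local/path/1.jar and s3://s3/path/2.jar as they are and replace only 
file:///local/path/3.jar with its uploaded location.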

  was:
According to the logic of this line: 
[https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L161], 
Spark filters out all paths which are not local (i.e. no scheme or a 
[file://|file:///] scheme). This may cause non-local dependencies not to be 
loaded by the driver.

For example, when starting a Spark job with 
spark.jars=*local*:///local/path/1.jar,*s3*://s3/path/2.jar,*file*:///local/path/3.jar, 
it seems that this logic uploads *file*:///local/path/3.jar to S3 and resets 
spark.jars to only s3://transformed/path/3.jar, while completely ignoring 
local:///local/path/1.jar and s3://s3/path/2.jar.

We need to fix this logic so that Spark uploads local files to S3 and 
transforms their paths, while keeping all other paths as they are.


> [k8s] Spark client dependency support ignores non-local paths
> -
>
> Key: SPARK-32775
> URL: https://issues.apache.org/jira/browse/SPARK-32775
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.0.0
>Reporter: Xuzhou Yin
>Priority: Major
>
> According to the logic of this line: 
> [https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L161], 
> Spark filters out all paths which are not local (i.e. no scheme or a 
> [file://|file:///] scheme). This may cause non-local dependencies not to be 
> loaded by the driver.
> For example, when starting a Spark job with 
> spark.jars=*local*:///local/path/1.jar,*s3*://s3/path/2.jar,*file*:///local/path/3.jar, 
> it seems that this logic uploads *file*:///local/path/3.jar to S3 and resets 
> spark.jars to only s3://transformed/path/3.jar, while completely ignoring 
> local:///local/path/1.jar and s3://s3/path/2.jar.
> We need to fix this logic so that Spark uploads local files to S3 and 
> transforms their paths, while keeping all other paths as they are.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-32775) [k8s] Spark client dependency support ignores non-local paths

2020-09-01 Thread Xuzhou Yin (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuzhou Yin updated SPARK-32775:
---
Description: 
According to the logic of this line: 
[https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L161], 
Spark filters out all paths which are not local (i.e. no scheme or a 
[file://|file:///] scheme). This may cause non-local dependencies not to be 
loaded by the driver.

For example, when starting a Spark job with 
spark.jars=*local*:///local/path/1.jar,*s3*://s3/path/2.jar,*file*:///local/path/3.jar, 
it seems that this logic uploads *file*:///local/path/3.jar to S3 and resets 
spark.jars to only s3://transformed/path/3.jar, while completely ignoring 
local:///local/path/1.jar and s3://s3/path/2.jar.

We need to fix this logic so that Spark uploads local files to S3 and 
transforms their paths, while keeping all other paths as they are.

  was:
According to the logic of this line: 
[https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L161], 
Spark filters out all paths which are not local (i.e. no scheme or a 
[file://|file:///] scheme). This may cause non-local dependencies not to be 
loaded by the driver.

For example, when starting a Spark job with 
spark.jars=local:///local/path/1.jar,s3://s3/path/2.jar,[file:///local/path/3.jar], 
it seems that this logic uploads [file:///local/path/3.jar] to S3 and resets 
spark.jars to only s3://upload/path/3.jar, while completely ignoring 
local:///local/path/1.jar and s3://s3/path/2.jar.

We need to fix this logic so that Spark uploads local files to S3 and 
transforms their paths, while keeping all other paths as they are.


> [k8s] Spark client dependency support ignores non-local paths
> -
>
> Key: SPARK-32775
> URL: https://issues.apache.org/jira/browse/SPARK-32775
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.0.0
>Reporter: Xuzhou Yin
>Priority: Major
>
> According to the logic of this line: 
> [https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L161], 
> Spark filters out all paths which are not local (i.e. no scheme or a 
> [file://|file:///] scheme). This may cause non-local dependencies not to be 
> loaded by the driver.
> For example, when starting a Spark job with 
> spark.jars=*local*:///local/path/1.jar,*s3*://s3/path/2.jar,*file*:///local/path/3.jar, 
> it seems that this logic uploads *file*:///local/path/3.jar to S3 and resets 
> spark.jars to only s3://transformed/path/3.jar, while completely ignoring 
> local:///local/path/1.jar and s3://s3/path/2.jar.
> We need to fix this logic so that Spark uploads local files to S3 and 
> transforms their paths, while keeping all other paths as they are.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org