[jira] [Updated] (SPARK-32775) [k8s] Spark client dependency support ignores non-local paths
[ https://issues.apache.org/jira/browse/SPARK-32775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuzhou Yin updated SPARK-32775:
---
Description:
According to the logic of this line: https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L161, Spark filters out all paths which are not local (i.e. no scheme, or the file:// scheme). This may cause non-local dependencies not to be loaded by the driver.

For example, when starting a Spark job with spark.jars=local:///local/path/1.jar,s3://s3/path/2.jar,file:///local/path/3.jar, it seems this logic will upload file:///local/path/3.jar to S3 and reset spark.jars to only s3://transformed/path/3.jar, while completely ignoring local:///local/path/1.jar and s3://s3/path/2.jar.

We need to fix this logic such that Spark uploads local files to S3 and transforms their paths, while keeping all other paths as they are.

was:
According to the logic of this line: https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L161, Spark filters out all paths which are not local (i.e. no scheme, or the file:// scheme). This may cause non-local dependencies not to be loaded by the driver. For example, when starting a Spark job with spark.jars=local:///local/path/1.jar,s3://s3/path/2.jar,file:///local/path/3.jar, it seems this logic will upload file:///local/path/3.jar to S3 and reset spark.jars to only s3://transformed/path/3.jar, while completely ignoring local:///local/path/1.jar and s3://s3/path/2.jar. We need to fix this logic such that Spark upload local files to S3, and transform the paths while keeping all other paths as they are.
> [k8s] Spark client dependency support ignores non-local paths
> -------------------------------------------------------------
>
> Key: SPARK-32775
> URL: https://issues.apache.org/jira/browse/SPARK-32775
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 3.0.0
> Reporter: Xuzhou Yin
> Priority: Major
>
> According to the logic of this line:
> https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L161
> Spark filters out all paths which are not local (i.e. no scheme, or the
> file:// scheme). This may cause non-local dependencies not to be loaded by
> the driver.
> For example, when starting a Spark job with
> spark.jars=local:///local/path/1.jar,s3://s3/path/2.jar,file:///local/path/3.jar,
> it seems this logic will upload file:///local/path/3.jar to S3 and
> reset spark.jars to only s3://transformed/path/3.jar, while completely
> ignoring local:///local/path/1.jar and s3://s3/path/2.jar.
> We need to fix this logic such that Spark uploads local files to S3 and
> transforms their paths, while keeping all other paths as they are.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
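A minimal sketch of the behavior the report asks for: partition the jar list by URI scheme, upload only the file:// (or scheme-less) entries, and pass local:// and remote (e.g. s3://) entries through untouched. This is an illustration only, not Spark's actual code path; `uploadToRemoteStorage` is a hypothetical stand-in for Spark's real upload-and-transform step, and the `s3://transformed/` prefix mirrors the example path in the report.

```scala
import java.net.URI

object JarResolutionSketch {
  // Hypothetical placeholder for the real upload step; it only
  // rewrites the path the way the report's example shows.
  def uploadToRemoteStorage(path: String): String =
    "s3://transformed/" + path.split('/').last

  // Upload file:// and scheme-less jars; keep every other scheme
  // (local://, s3://, hdfs://, ...) exactly as given.
  def resolveJars(jars: Seq[String]): Seq[String] =
    jars.map { jar =>
      Option(new URI(jar).getScheme).getOrElse("file") match {
        case "file" => uploadToRemoteStorage(jar) // local file: upload + transform
        case _      => jar                        // non-local path: leave as-is
      }
    }

  def main(args: Array[String]): Unit = {
    val in = Seq(
      "local:///local/path/1.jar",
      "s3://s3/path/2.jar",
      "file:///local/path/3.jar")
    println(resolveJars(in).mkString(","))
  }
}
```

With the inputs from the report, all three entries survive: the two non-local paths unchanged and the file:// path rewritten, rather than the list collapsing to the single transformed jar.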