[
https://issues.apache.org/jira/browse/SPARK-48417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849399#comment-17849399
]
Ravi Dalal commented on SPARK-48417:
------------------------------------
For anyone facing this issue, use the following configuration to read files from GCS
when spark.jars.packages is used:
{code:python}
.config("spark.jars",
        "https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-hadoop3-2.2.22.jar")
.config("spark.hadoop.fs.AbstractFileSystem.gs.impl",
        "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS")
.config("spark.hadoop.fs.gs.impl",
        "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem"){code}
When spark.jars.packages is not used, the following configuration alone works:
{code:python}
.config("spark.jars",
        "https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-hadoop3-2.2.22.jar")
.config("spark.hadoop.fs.AbstractFileSystem.gs.impl",
        "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS"){code}
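Putting these settings into a full session builder, the workaround might look like the sketch below. This is a minimal example, not a tested recipe: the app name and the gs:// path are placeholders, and the connector jar version should match your Hadoop build.
{code:python}
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("gcs-read-example")  # placeholder name
    # Ship the GCS connector jar via spark.jars instead of relying on
    # spark.jars.packages, which triggers the filesystem-loading issue.
    .config("spark.jars",
            "https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-hadoop3-2.2.22.jar")
    .config("spark.hadoop.fs.AbstractFileSystem.gs.impl",
            "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS")
    .config("spark.hadoop.fs.gs.impl",
            "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
    .getOrCreate()
)

# Hypothetical bucket and object; replace with a real gs:// path.
df = spark.read.csv("gs://my-bucket/data.csv", header=True)
{code}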
> Filesystems do not load with spark.jars.packages configuration
> --------------------------------------------------------------
>
> Key: SPARK-48417
> URL: https://issues.apache.org/jira/browse/SPARK-48417
> Project: Spark
> Issue Type: Bug
> Components: Input/Output
> Affects Versions: 3.5.1
> Reporter: Ravi Dalal
> Priority: Major
> Attachments: pyspark_mleap.py,
> pyspark_spark_jar_package_config_logs.txt,
> pyspark_without_spark_jar_package_config_logs.txt
>
>
> When we use the spark.jars.packages configuration parameter in the Python
> SparkSession builder (PySpark), the filesystems do not appear to be loaded
> when the session starts. Because of this, Spark fails to read files from a
> Google Cloud Storage (GCS) bucket (with the GCS connector).
> I tested this with different packages, so it does not appear specific to a
> particular package. I will attach the sample code and debug logs.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)