[ 
https://issues.apache.org/jira/browse/HADOOP-17402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241935#comment-17241935
 ] 

Rafal Wojdyla edited comment on HADOOP-17402 at 12/1/20, 11:22 PM:
-------------------------------------------------------------------

[[email protected]] thanks for the links. I'm with you on the long-term 
vision. In the meantime, though, is there something we can do to bring the GCS 
connector on par with S3 (specifically in the {{core-default}} config)? I'm mostly 
thinking of pyspark users, for whom the Java ecosystem may be a puzzle. 
Spark/pyspark loads {{core-default}} from {{hadoop-common}}. As far as I 
understand, in the pyspark context the auto service doesn't actually register 
the {{gs}} scheme (at least when the connector jar is loaded via 
{{spark.jars}}, which is the likely case), so Spark users are forced to add the 
config manually.
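
For illustration, the manual config in question looks roughly like the sketch below. It assumes the connector's usual class names ({{GoogleHadoopFileSystem}} and {{GoogleHadoopFS}}) and Spark's {{spark.hadoop.}} prefix for passing Hadoop properties; the helper name {{gcs_spark_conf}} and the jar path are hypothetical placeholders, and pyspark itself is deliberately not imported so the snippet runs on its own.

```python
# Assumed class names from the GCS connector; verify against the connector
# version you actually use.
GCS_FS_IMPL = "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem"
GCS_ABSTRACT_FS_IMPL = "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS"


def gcs_spark_conf(connector_jar):
    """Hypothetical helper: Spark properties that register the gs scheme by hand."""
    return {
        # Load the connector jar the way most pyspark users would.
        "spark.jars": connector_jar,
        # The "spark.hadoop." prefix forwards the property to the Hadoop config.
        "spark.hadoop.fs.gs.impl": GCS_FS_IMPL,
        "spark.hadoop.fs.AbstractFileSystem.gs.impl": GCS_ABSTRACT_FS_IMPL,
    }


if __name__ == "__main__":
    conf = gcs_spark_conf("/path/to/gcs-connector.jar")  # placeholder path
    for key, value in conf.items():
        print(f"{key}={value}")
    # With pyspark installed, each pair could be applied with
    # SparkSession.builder.config(key, value) before getOrCreate().
```

If the two {{fs.*.gs.impl}} entries lived in {{core-default}}, only the {{spark.jars}} line would remain for the user to set.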

One might argue that adding the config to {{core-default}} would still result 
in a missing-class error when the connector jar is absent, but at least it would 
behave the same as S3, and it would save users the extra config. What do you think?


> Add GCS FS impl reference to core-default.xml
> ---------------------------------------------
>
>                 Key: HADOOP-17402
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17402
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Rafal Wojdyla
>            Priority: Major
>
> Akin to the current S3 default configuration, add GCS configuration, 
> specifically to declare the GCS implementation. [GCS 
> connector|https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage].
> Has this not been done because the GCS connector is not part of the hadoop/ASF 
> codebase, or is there another blocker?


