hudi-bot opened a new issue, #15458:
URL: https://github.com/apache/hudi/issues/15458

   Currently, GCS ingestion (HUDI-4850) expects newer versions of jars such as 
protobuf and Guava to be passed to spark-submit explicitly, to override the 
older versions shipped with Spark. These jars are needed by gcs-connector, a 
Google library for connecting to GCS. For details, see 
https://docs.google.com/document/d/1VfvtdvhXw6oEHPgZ_4Be2rkPxIzE0kBCNUiVDsXnSAA/edit# 
(section titled "Configure Spark to use newer versions of some Jars").
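   For context, the current workaround amounts to something like the following 
spark-submit invocation. This is a sketch only: the jar paths, versions, and 
application jar name are illustrative placeholders, not the exact values from 
the linked doc.

   ```sh
   # Illustrative only: jar names/versions below are placeholders.
   # userClassPathFirst asks Spark to prefer the user-supplied jars over
   # the older copies bundled with the Spark distribution.
   spark-submit \
     --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
     --jars /path/to/protobuf-java-3.x.jar,/path/to/guava-31.x-jre.jar,/path/to/gcs-connector.jar \
     --conf spark.driver.userClassPathFirst=true \
     --conf spark.executor.userClassPathFirst=true \
     /path/to/hudi-utilities-bundle.jar
   ```

   Having to carry these flags on every invocation is exactly the friction this 
issue proposes to remove.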
   
   See if it's possible to create a shaded fat jar of gcs-connector for this 
use case instead, avoiding the need to specify these jars on the spark-submit 
command line.
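   If the shaded-jar route is pursued, one plausible shape is to relocate the 
conflicting Google packages via the Maven Shade plugin. This is a sketch, not a 
tested build file; the relocation prefixes are assumptions, and the exact set 
of packages to relocate would need to be verified against what gcs-connector 
actually pulls in.

   ```xml
   <plugin>
     <groupId>org.apache.maven.plugins</groupId>
     <artifactId>maven-shade-plugin</artifactId>
     <executions>
       <execution>
         <phase>package</phase>
         <goals><goal>shade</goal></goals>
         <configuration>
           <relocations>
             <!-- Relocate the newer copies that gcs-connector needs so they
                  cannot collide with the older versions on Spark's classpath. -->
             <relocation>
               <pattern>com.google.protobuf</pattern>
               <shadedPattern>org.apache.hudi.shaded.com.google.protobuf</shadedPattern>
             </relocation>
             <relocation>
               <pattern>com.google.common</pattern>
               <shadedPattern>org.apache.hudi.shaded.com.google.common</shadedPattern>
             </relocation>
           </relocations>
         </configuration>
       </execution>
     </executions>
   </plugin>
   ```

   It may also be worth checking whether Google already publishes a shaded 
variant of gcs-connector, which could make building our own unnecessary.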
   
   An alternate approach to consider for the long term is HUDI-4930 (slim 
bundles).
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-4931
   - Type: Task
   - Epic: https://issues.apache.org/jira/browse/HUDI-1896
   
   
   ---
   
   
   ## Comments
   
   28/Sep/22 08:14, pramodbiligiri: Some useful references regarding this:
   - GCP docs on the Cloud Storage connector: 
https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage
   - Hudi docs on GCS connectivity: https://hudi.apache.org/docs/gcs_hoodie/


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
