yuqi1129 commented on PR #5079:
URL: https://github.com/apache/gravitino/pull/5079#issuecomment-2418988090

   @jerryshao 
   
   I have verified the code. On the client side, users should include the 
following dependencies if they want to use a GCS fileset:
   
   ```xml
           <dependency>
               <groupId>org.apache.hadoop</groupId>
               <artifactId>hadoop-common</artifactId>
               <version>3.1.0</version>
           </dependency>
   
           <dependency>
               <groupId>org.apache.hadoop</groupId>
               <artifactId>hadoop-hdfs-client</artifactId>
               <version>3.1.0</version>
           </dependency>
          
           <dependency>
               <groupId>org.apache.gravitino</groupId>
               <artifactId>gcp-bundle</artifactId>
               <version>0.7.0-incubating-SNAPSHOT</version>
           </dependency>
   
           <dependency>
               <groupId>org.apache.gravitino</groupId>
               <artifactId>filesystem-hadoop3-runtime</artifactId>
               <version>0.7.0-incubating-SNAPSHOT</version>
           </dependency>
   ```
   
   As suggested by @xloya, there may be conflicts if we include `hadoop-common` 
and `hadoop-hdfs-client` in `filesystem-hadoop3-runtime`, because some query 
engines like Spark and Trino may already have HDFS-related jars on their 
classpath. 
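   
   One possible way to limit such conflicts, sketched here as an assumption 
rather than something verified in this PR, is to declare the Hadoop 
dependencies with Maven's `provided` scope when the application runs inside an 
engine that already ships them. The jars are then available at compile time 
but are not packaged with the application. A standalone client with no such 
engine would keep the default scope shown above.
   
   ```xml
           <!-- Assumption: the engine (e.g. Spark or Trino) already supplies
                these Hadoop jars at runtime, so they are not bundled here. -->
           <dependency>
               <groupId>org.apache.hadoop</groupId>
               <artifactId>hadoop-common</artifactId>
               <version>3.1.0</version>
               <scope>provided</scope>
           </dependency>
   
           <dependency>
               <groupId>org.apache.hadoop</groupId>
               <artifactId>hadoop-hdfs-client</artifactId>
               <version>3.1.0</version>
               <scope>provided</scope>
           </dependency>
   ```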
   
   The reason we need to include `hadoop-hdfs-client` is that 
`filesystem-hadoop3-runtime` shades `hadoop-catalog`, and `hadoop-catalog` 
contains `DistributedFileSystem` initialization logic, so the HDFS client is 
required even though I just want to use GCS. 


