jerryshao commented on code in PR #5324:
URL: https://github.com/apache/gravitino/pull/5324#discussion_r1822060449


##########
docs/hadoop-catalog.md:
##########
@@ -25,19 +25,88 @@ Hadoop 3. If there's any compatibility issue, please create an [issue](https://g
 
 Besides the [common catalog properties](./gravitino-server-config.md#gravitino-catalog-properties-configuration), the Hadoop catalog has the following properties:
 
-| Property Name | Description | Default Value | Required | Since Version |
-|---|---|---|---|---|
-| `location` | The storage location managed by Hadoop catalog. | (none) | No | 0.5.0 |
-| `filesystem-providers` | The names (split by comma) of filesystem providers for the Hadoop catalog. Gravitino already support built-in `builtin-local`(`local file`) and `builtin-hdfs`(`hdfs`). If users want to support more file system and add it to Gravitino, they custom more file system by implementing `FileSystemProvider`. | (none) | No | 0.7.0-incubating |
-| `default-filesystem-provider` | The name default filesystem providers of this Hadoop catalog if users do not specify the scheme in the URI. Default value is `builtin-local` | `builtin-local` | No | 0.7.0-incubating |
-| `authentication.impersonation-enable` | Whether to enable impersonation for the Hadoop catalog. | `false` | No | 0.5.1 |
-| `authentication.type` | The type of authentication for Hadoop catalog, currently we only support `kerberos`, `simple`. | `simple` | No | 0.5.1 |
-| `authentication.kerberos.principal` | The principal of the Kerberos authentication | (none) | required if the value of `authentication.type` is Kerberos. | 0.5.1 |
-| `authentication.kerberos.keytab-uri` | The URI of The keytab for the Kerberos authentication. | (none) | required if the value of `authentication.type` is Kerberos. | 0.5.1 |
-| `authentication.kerberos.check-interval-sec` | The check interval of Kerberos credential for Hadoop catalog. | 60 | No | 0.5.1 |
-| `authentication.kerberos.keytab-fetch-timeout-sec` | The fetch timeout of retrieving Kerberos keytab from `authentication.kerberos.keytab-uri`. | 60 | No | 0.5.1 |
-
-For more about `filesystem-providers`, please refer to `HadoopFileSystemProvider` or `LocalFileSystemProvider` in the source code. Furthermore, you also need to place the jar of the file system provider into the `$GRAVITINO_HOME/catalogs/hadoop/libs` directory if it's not in the classpath.
+| Property Name | Description | Default Value | Required | Since Version |
+|---|---|---|---|---|
+| `location` | The storage location managed by Hadoop catalog. | (none) | No | 0.5.0 |
+
+Apart from the above properties, to access fileset like HDFS, S3, GCS, OSS or custom fileset, you need to configure the following extra properties.
+
+#### HDFS fileset 
+
+| Property Name | Description | Default Value | Required | Since Version |
+|---|---|---|---|---|
+| `authentication.impersonation-enable` | Whether to enable impersonation for the Hadoop catalog. | `false` | No | 0.5.1 |
+| `authentication.type` | The type of authentication for Hadoop catalog, currently we only support `kerberos`, `simple`. | `simple` | No | 0.5.1 |
+| `authentication.kerberos.principal` | The principal of the Kerberos authentication | (none) | required if the value of `authentication.type` is Kerberos. | 0.5.1 |
+| `authentication.kerberos.keytab-uri` | The URI of The keytab for the Kerberos authentication. | (none) | required if the value of `authentication.type` is Kerberos. | 0.5.1 |
+| `authentication.kerberos.check-interval-sec` | The check interval of Kerberos credential for Hadoop catalog. | 60 | No | 0.5.1 |
+| `authentication.kerberos.keytab-fetch-timeout-sec` | The fetch timeout of retrieving Kerberos keytab from `authentication.kerberos.keytab-uri`. | 60 | No | 0.5.1 |
+
+#### S3 fileset
+
+| Configuration item | Description | Default value | Required | Since version |
+|---|---|---|---|---|
+| `filesystem-providers` | The file system providers to add. Set it to `s3` if it's a s3 fileset or a comma separated string that contains `s3` like `gs,s3` to support multiple kind of fileset including `s3`. | (none) | Yes | 0.7.0-incubating |
+| `default-filesystem-provider` | The name default filesystem providers of this Hadoop catalog if users do not specify the scheme in the URI. Default value is `builtin-local`, for S3, if we set this value, we can omit the prefix 'oss://' in the location. | `builtin-local` | No | 0.7.0-incubating |
+| `s3-endpoint` | The endpoint of the AWS s3. | (none) | Yes if it's a s3 fileset. | 0.7.0-incubating |
+| `s3-access-key-id` | The access key of the AWS s3. | (none) | Yes if it's a s3 fileset. | 0.7.0-incubating |
+| `s3-secret-access-key` | The secret key of the AWS s3. | (none) | Yes if it's a s3 fileset. | 0.7.0-incubating |
+
+At the same time, you need to place the corresponding bundle jar [gravitno-aws-bundle-{version}.jar](https://repo1.maven.org/maven2/org/apache/gravitino/aws-bundle/) in the Hadoop environment.
+
+
+#### GCS fileset
+
+| Configuration item | Description | Default value | Required | Since version |
+|---|---|---|---|---|
+| `filesystem-providers` | The file system providers to add. Set it to `gs` if it's a gcs fileset or a comma separated string that contains `gs` like `gs,s3` to support multiple kind of fileset including `gs`. | (none) | Yes | 0.7.0-incubating |
+| `default-filesystem-provider` | The name default filesystem providers of this Hadoop catalog if users do not specify the scheme in the URI. Default value is `builtin-local`, for GCS, if we set this value, we can omit the prefix 'gs://' in the location. | `builtin-local` | No | 0.7.0-incubating |
+| `gcs-service-account-file` | The path of GCS service account JSON file. | (none) | Yes if it's a gcs fileset. | 0.7.0-incubating |
+
+In the meantime, you need to place the corresponding bundle jar [gravitno-gcp-bundle-{version}.jar](https://repo1.maven.org/maven2/org/apache/gravitino/gcp-bundle/) in the Hadoop environment.

Review Comment:
   Please tell users specifically where to put the jar, rather than vaguely saying "in the Hadoop environment".



