This is an automated email from the ASF dual-hosted git repository.
fanng pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/gravitino.git
The following commit(s) were added to refs/heads/main by this push:
new 5caa9de4f5 [#5472] improvement(docs): Add example to use cloud storage
fileset and polish hadoop-catalog document. (#6059)
5caa9de4f5 is described below
commit 5caa9de4f54f7c2c92156c6a427082eeb28ad49b
Author: Qi Yu <[email protected]>
AuthorDate: Tue Jan 14 18:45:56 2025 +0800
[#5472] improvement(docs): Add example to use cloud storage fileset and
polish hadoop-catalog document. (#6059)
### What changes were proposed in this pull request?
1. Add full example about how to use cloud storage fileset like S3, GCS,
OSS and ADLS
2. Polish how-to-use-gvfs.md and hadoop-catalog.md.
3. Add a document on how filesets use credentials.
### Why are the changes needed?
For better user experience.
Fix: #5472
### Does this PR introduce _any_ user-facing change?
N/A.
### How was this patch tested?
N/A
---
.../gravitino/filesystem/gvfs_config.py | 4 +-
docs/hadoop-catalog-index.md | 26 +
docs/hadoop-catalog-with-adls.md | 522 ++++++++++++++++++++
docs/hadoop-catalog-with-gcs.md | 500 +++++++++++++++++++
docs/hadoop-catalog-with-oss.md | 538 ++++++++++++++++++++
docs/hadoop-catalog-with-s3.md | 541 +++++++++++++++++++++
docs/hadoop-catalog.md | 87 +---
docs/how-to-use-gvfs.md | 173 +------
docs/manage-fileset-metadata-using-gravitino.md | 59 +--
9 files changed, 2157 insertions(+), 293 deletions(-)
diff --git a/clients/client-python/gravitino/filesystem/gvfs_config.py
b/clients/client-python/gravitino/filesystem/gvfs_config.py
index 6fbd8a99d1..34db72adee 100644
--- a/clients/client-python/gravitino/filesystem/gvfs_config.py
+++ b/clients/client-python/gravitino/filesystem/gvfs_config.py
@@ -42,8 +42,8 @@ class GVFSConfig:
GVFS_FILESYSTEM_OSS_SECRET_KEY = "oss_secret_access_key"
GVFS_FILESYSTEM_OSS_ENDPOINT = "oss_endpoint"
- GVFS_FILESYSTEM_AZURE_ACCOUNT_NAME = "abs_account_name"
- GVFS_FILESYSTEM_AZURE_ACCOUNT_KEY = "abs_account_key"
+ GVFS_FILESYSTEM_AZURE_ACCOUNT_NAME = "azure_storage_account_name"
+ GVFS_FILESYSTEM_AZURE_ACCOUNT_KEY = "azure_storage_account_key"
# This configuration marks the expired time of the credential. For instance, if the credential
# fetched from Gravitino server has expired time of 3600 seconds, and the credential_expired_time_ration is 0.5
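The refresh rule described in the comment above can be sketched as follows. This is a hedged illustration of the assumed behavior; the helper function name is hypothetical, only the ratio semantics come from the comment:

```python
# Hypothetical helper illustrating the expiration ratio described above:
# a credential valid for `expires_in_s` seconds is treated as expired
# after expires_in_s * ratio seconds, so the client refreshes it early.
def effective_expiration(expires_in_s: float, ratio: float = 0.5) -> float:
    """Seconds after which the client refreshes the vended credential."""
    return expires_in_s * ratio

# A 3600-second credential with ratio 0.5 is refreshed after 1800 seconds.
print(effective_expiration(3600))
```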
diff --git a/docs/hadoop-catalog-index.md b/docs/hadoop-catalog-index.md
new file mode 100644
index 0000000000..dfa7a18717
--- /dev/null
+++ b/docs/hadoop-catalog-index.md
@@ -0,0 +1,26 @@
+---
+title: "Hadoop catalog index"
+slug: /hadoop-catalog-index
+date: 2025-01-13
+keyword: Hadoop catalog index S3 GCS ADLS OSS
+license: "This software is licensed under the Apache License version 2."
+---
+
+### Hadoop catalog overview
+
+The Gravitino Hadoop catalog documentation includes the following chapters:
+
+- [Hadoop catalog overview and features](./hadoop-catalog.md): This chapter provides an overview of the Hadoop catalog, its features, capabilities, and related configurations.
+- [Manage Hadoop catalog with Gravitino API](./manage-fileset-metadata-using-gravitino.md): This chapter explains how to manage fileset metadata using the Gravitino API and provides detailed examples.
+- [Using Hadoop catalog with Gravitino virtual file system](how-to-use-gvfs.md): This chapter explains how to use the Hadoop catalog with the Gravitino virtual file system and provides detailed examples.
+
+### Hadoop catalog with cloud storage
+
+Apart from the above, you can also refer to the following topics to manage and access cloud storage like S3, GCS, ADLS, and OSS:
+
+- [Using Hadoop catalog to manage S3](./hadoop-catalog-with-s3.md).
+- [Using Hadoop catalog to manage GCS](./hadoop-catalog-with-gcs.md).
+- [Using Hadoop catalog to manage ADLS](./hadoop-catalog-with-adls.md).
+- [Using Hadoop catalog to manage OSS](./hadoop-catalog-with-oss.md).
+
+More storage options will be added soon. Stay tuned!
\ No newline at end of file
diff --git a/docs/hadoop-catalog-with-adls.md b/docs/hadoop-catalog-with-adls.md
new file mode 100644
index 0000000000..96126c6fab
--- /dev/null
+++ b/docs/hadoop-catalog-with-adls.md
@@ -0,0 +1,522 @@
+---
+title: "Hadoop catalog with ADLS"
+slug: /hadoop-catalog-with-adls
+date: 2025-01-03
+keyword: Hadoop catalog ADLS
+license: "This software is licensed under the Apache License version 2."
+---
+
+This document describes how to configure a Hadoop catalog with ADLS (also known as Azure Blob Storage (ABS), or Azure Data Lake Storage Gen2).
+
+## Prerequisites
+
+To set up a Hadoop catalog with ADLS, follow these steps:
+
+1. Download the
[`gravitino-azure-bundle-${gravitino-version}.jar`](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-azure-bundle)
file.
+2. Place the downloaded file into the Gravitino Hadoop catalog classpath at
`${GRAVITINO_HOME}/catalogs/hadoop/libs/`.
+3. Start the Gravitino server by running the following command:
+
+```bash
+$ ${GRAVITINO_HOME}/bin/gravitino-server.sh start
+```
+
+Once the server is up and running, you can proceed to configure the Hadoop catalog with ADLS. The rest of this document uses `http://localhost:8090` as the Gravitino server URL; please replace it with your actual server URL.
+
+## Configurations for creating a Hadoop catalog with ADLS
+
+### Configuration for a ADLS Hadoop catalog
+
+Apart from the configurations mentioned in [Hadoop catalog configuration](./hadoop-catalog.md#catalog-properties), the following properties are required to configure a Hadoop catalog with ADLS:
+
+| Configuration item | Description
[...]
+|-------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[...]
+| `filesystem-providers`        | The file system providers to add. Set it to `abs` if it's an Azure Blob Storage fileset, or a comma-separated string that contains `abs`, like `oss,abs,s3`, to support multiple kinds of filesets including `abs`.
[...]
+| `default-filesystem-provider` | The default filesystem provider of this Hadoop catalog, used when users do not specify a scheme in the URI. The default value is `builtin-local`; for Azure Blob Storage, setting this to `abs` lets you omit the `abfss://` prefix in locations.
[...]
+| `azure-storage-account-name`  | The account name of Azure Blob Storage.
[...]
+| `azure-storage-account-key` | The account key of Azure Blob Storage.
[...]
+| `credential-providers`        | The credential provider types, separated by commas; possible values are `adls-token` and `azure-account-key`. The default authentication type uses the account name and account key described above; setting this property enables credential vending from the Gravitino server, so clients no longer need to provide authentication information like account_name/account_key to access ADLS via GVFS. Once it's set, more configuration items are needed to make it [...]
+
+
+### Configurations for a schema
+
+Refer to [Schema configurations](./hadoop-catalog.md#schema-properties) for
more details.
+
+### Configurations for a fileset
+
+Refer to [Fileset configurations](./hadoop-catalog.md#fileset-properties) for
more details.
+
+## Example of creating a Hadoop catalog with ADLS
+
+This section demonstrates how to create a Hadoop catalog with ADLS in Gravitino, with a complete example.
+
+### Step 1: Create a Hadoop catalog with ADLS
+
+First, you need to create a Hadoop catalog with ADLS. The following example
shows how to create a Hadoop catalog with ADLS:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+  "name": "test_catalog",
+  "type": "FILESET",
+  "comment": "This is an ADLS fileset catalog",
+  "provider": "hadoop",
+  "properties": {
+    "location": "abfss://container@account-name.dfs.core.windows.net/path",
+    "azure-storage-account-name": "The account name of the Azure Blob Storage",
+    "azure-storage-account-key": "The account key of the Azure Blob Storage",
+    "filesystem-providers": "abs"
+  }
+}' http://localhost:8090/api/metalakes/metalake/catalogs
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+GravitinoClient gravitinoClient = GravitinoClient
+ .builder("http://localhost:8090")
+ .withMetalake("metalake")
+ .build();
+
+Map<String, String> adlsProperties = ImmutableMap.<String, String>builder()
+  .put("location", "abfss://container@account-name.dfs.core.windows.net/path")
+  .put("azure-storage-account-name", "azure storage account name")
+  .put("azure-storage-account-key", "azure storage account key")
+  .put("filesystem-providers", "abs")
+  .build();
+
+Catalog adlsCatalog = gravitinoClient.createCatalog("test_catalog",
+  Type.FILESET,
+  "hadoop", // provider, Gravitino only supports "hadoop" for now.
+  "This is an ADLS fileset catalog",
+  adlsProperties);
+// ...
+
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="metalake")
+adls_properties = {
+    "location": "abfss://container@account-name.dfs.core.windows.net/path",
+    "azure-storage-account-name": "azure storage account name",
+    "azure-storage-account-key": "azure storage account key",
+    "filesystem-providers": "abs"
+}
+
+adls_catalog = gravitino_client.create_catalog(name="test_catalog",
+                                               type=Catalog.Type.FILESET,
+                                               provider="hadoop",
+                                               comment="This is an ADLS fileset catalog",
+                                               properties=adls_properties)
+```
+
+</TabItem>
+</Tabs>
+
+### Step 2: Create a schema
+
+Once the catalog is created, you can create a schema. The following example
shows how to create a schema:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+ "name": "test_schema",
+  "comment": "This is an ADLS schema",
+  "properties": {
+    "location": "abfss://container@account-name.dfs.core.windows.net/path"
+  }
+}' http://localhost:8090/api/metalakes/metalake/catalogs/test_catalog/schemas
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+Catalog catalog = gravitinoClient.loadCatalog("test_catalog");
+
+SupportsSchemas supportsSchemas = catalog.asSchemas();
+
+Map<String, String> schemaProperties = ImmutableMap.<String, String>builder()
+  .put("location", "abfss://container@account-name.dfs.core.windows.net/path")
+  .build();
+Schema schema = supportsSchemas.createSchema("test_schema",
+  "This is an ADLS schema",
+  schemaProperties
+);
+// ...
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="metalake")
+catalog: Catalog = gravitino_client.load_catalog(name="test_catalog")
+catalog.as_schemas().create_schema(name="test_schema",
+                                   comment="This is an ADLS schema",
+                                   properties={"location": "abfss://container@account-name.dfs.core.windows.net/path"})
+```
+
+</TabItem>
+</Tabs>
+
+### Step 3: Create a fileset
+
+After creating the schema, you can create a fileset. The following example
shows how to create a fileset:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+ "name": "example_fileset",
+ "comment": "This is an example fileset",
+ "type": "MANAGED",
+  "storageLocation": "abfss://container@account-name.dfs.core.windows.net/path/example_fileset",
+ "properties": {
+ "k1": "v1"
+ }
+}'
http://localhost:8090/api/metalakes/metalake/catalogs/test_catalog/schemas/test_schema/filesets
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+GravitinoClient gravitinoClient = GravitinoClient
+ .builder("http://localhost:8090")
+ .withMetalake("metalake")
+ .build();
+
+Catalog catalog = gravitinoClient.loadCatalog("test_catalog");
+FilesetCatalog filesetCatalog = catalog.asFilesetCatalog();
+
+Map<String, String> propertiesMap = ImmutableMap.<String, String>builder()
+ .put("k1", "v1")
+ .build();
+
+filesetCatalog.createFileset(
+  NameIdentifier.of("test_schema", "example_fileset"),
+  "This is an example fileset",
+  Fileset.Type.MANAGED,
+  "abfss://container@account-name.dfs.core.windows.net/path/example_fileset",
+  propertiesMap);
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="metalake")
+
+catalog: Catalog = gravitino_client.load_catalog(name="test_catalog")
+catalog.as_fileset_catalog().create_fileset(ident=NameIdentifier.of("test_schema", "example_fileset"),
+                                            type=Fileset.Type.MANAGED,
+                                            comment="This is an example fileset",
+                                            storage_location="abfss://container@account-name.dfs.core.windows.net/path/example_fileset",
+                                            properties={"k1": "v1"})
+```
+
+</TabItem>
+</Tabs>
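Once the fileset exists, paths under the `gvfs://` scheme map onto the fileset's storage location: `gvfs://fileset/<catalog>/<schema>/<fileset>/<sub-path>` resolves to `<storageLocation>/<sub-path>`. The sketch below illustrates that mapping with plain string handling; the helper function and the example names are hypothetical, only the mapping rule comes from the GVFS convention:

```python
# Hypothetical sketch of how a gvfs:// path resolves to the fileset's
# storage location: the first three path segments (catalog/schema/fileset)
# are replaced by the fileset's storageLocation.
def resolve_gvfs_path(gvfs_path: str, storage_location: str) -> str:
    prefix = "gvfs://fileset/"
    assert gvfs_path.startswith(prefix)
    # Split off catalog, schema, fileset; keep the remaining sub-path.
    segments = gvfs_path[len(prefix):].split("/", 3)
    tail = segments[3] if len(segments) > 3 else ""
    return storage_location.rstrip("/") + ("/" + tail if tail else "")

storage = "abfss://container@account-name.dfs.core.windows.net/path/example_fileset"
print(resolve_gvfs_path("gvfs://fileset/test_catalog/test_schema/example_fileset/a/b.csv", storage))
```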
+
+## Accessing a fileset with ADLS
+
+### Using the GVFS Java client to access the fileset
+
+To access a fileset with Azure Blob Storage (ADLS) using the GVFS Java client, in addition to the [basic GVFS configurations](./how-to-use-gvfs.md#configuration-1), you need to add the following configurations:
+
+| Configuration item           | Description                             | Default value | Required | Since version    |
+|------------------------------|-----------------------------------------|---------------|----------|------------------|
+| `azure-storage-account-name` | The account name of Azure Blob Storage. | (none)        | Yes      | 0.8.0-incubating |
+| `azure-storage-account-key`  | The account key of Azure Blob Storage.  | (none)        | Yes      | 0.8.0-incubating |
+
+:::note
+If the catalog has enabled [credential vending](security/credential-vending.md), the properties above can be omitted. More details can be found in [Fileset with credential vending](#fileset-with-credential-vending).
+:::
+
+```java
+Configuration conf = new Configuration();
+conf.set("fs.AbstractFileSystem.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.Gvfs");
+conf.set("fs.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem");
+conf.set("fs.gravitino.server.uri", "http://localhost:8090");
+conf.set("fs.gravitino.client.metalake", "test_metalake");
+conf.set("azure-storage-account-name", "account_name_of_adls");
+conf.set("azure-storage-account-key", "account_key_of_adls");
+Path filesetPath = new Path("gvfs://fileset/test_catalog/test_schema/test_fileset/new_dir");
+FileSystem fs = filesetPath.getFileSystem(conf);
+fs.mkdirs(filesetPath);
+...
+```
+
+Similar to Spark configurations, you need to add ADLS (bundle) jars to the
classpath according to your environment.
+
+If you want to customize your Hadoop version, or there is already a Hadoop version in your project, you can add the following dependencies to your `pom.xml`:
+
+```xml
+ <dependency>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-common</artifactId>
+ <version>${HADOOP_VERSION}</version>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-azure</artifactId>
+ <version>${HADOOP_VERSION}</version>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>filesystem-hadoop3-runtime</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>gravitino-azure</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+```
+
+Or, if there is no Hadoop environment in your project, use the bundle jar that includes the Hadoop environment:
+
+```xml
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>gravitino-azure-bundle</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>filesystem-hadoop3-runtime</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+```
+
+### Using Spark to access the fileset
+
+The following code snippet shows how to use **PySpark 3.1.3 with a Hadoop environment (Hadoop 3.2.0)** to access the fileset:
+
+Before running the following code, you need to install the required packages:
+
+```bash
+pip install pyspark==3.1.3
+pip install apache-gravitino==${GRAVITINO_VERSION}
+```
+Then you can run the following code:
+
+```python
+from pyspark.sql import SparkSession
+import os
+
+gravitino_url = "http://localhost:8090"
+metalake_name = "test"
+
+catalog_name = "your_adls_catalog"
+schema_name = "your_adls_schema"
+fileset_name = "your_adls_fileset"
+
+os.environ["PYSPARK_SUBMIT_ARGS"] = "--jars /path/to/gravitino-azure-{gravitino-version}.jar,/path/to/gravitino-filesystem-hadoop3-runtime-{gravitino-version}.jar,/path/to/hadoop-azure-3.2.0.jar,/path/to/azure-storage-7.0.0.jar,/path/to/wildfly-openssl-1.0.4.Final.jar --master local[1] pyspark-shell"
+spark = SparkSession.builder \
+    .appName("adls_fileset_test") \
+    .config("spark.hadoop.fs.AbstractFileSystem.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.Gvfs") \
+    .config("spark.hadoop.fs.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem") \
+    .config("spark.hadoop.fs.gravitino.server.uri", "http://localhost:8090") \
+    .config("spark.hadoop.fs.gravitino.client.metalake", "test") \
+    .config("spark.hadoop.azure-storage-account-name", "azure_account_name") \
+    .config("spark.hadoop.azure-storage-account-key", "azure_account_key") \
+    .config("spark.hadoop.fs.azure.skipUserGroupMetadataDuringInitialization", "true") \
+    .config("spark.driver.memory", "2g") \
+    .config("spark.driver.port", "2048") \
+    .getOrCreate()
+
+data = [("Alice", 25), ("Bob", 30), ("Cathy", 45)]
+columns = ["Name", "Age"]
+spark_df = spark.createDataFrame(data, schema=columns)
+gvfs_path = f"gvfs://fileset/{catalog_name}/{schema_name}/{fileset_name}/people"
+
+spark_df.coalesce(1).write \
+    .mode("overwrite") \
+    .option("header", "true") \
+    .csv(gvfs_path)
+```
+
+If your Spark is **without a Hadoop environment**, you can use the following code snippet to access the fileset:
+
+```python
+## Replace the PYSPARK_SUBMIT_ARGS setting in the snippet above with the
+## following; everything else stays the same.
+
+os.environ["PYSPARK_SUBMIT_ARGS"] = "--jars /path/to/gravitino-azure-bundle-{gravitino-version}.jar,/path/to/gravitino-filesystem-hadoop3-runtime-{gravitino-version}.jar --master local[1] pyspark-shell"
+```
+
+- [`gravitino-azure-bundle-${gravitino-version}.jar`](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-azure-bundle) is the Gravitino ADLS jar with the Hadoop environment (3.3.1) and the `hadoop-azure` jar.
+- [`gravitino-azure-${gravitino-version}.jar`](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-azure) is a condensed version of the Gravitino ADLS bundle jar without the Hadoop environment and the `hadoop-azure` jar.
+- `hadoop-azure-3.2.0.jar` and `azure-storage-7.0.0.jar` can be found in the Hadoop distribution in the `${HADOOP_HOME}/share/hadoop/tools/lib` directory.
+
+
+Please choose the correct jar according to your environment.
+
+:::note
+In some Spark versions, a Hadoop environment is necessary for the driver, and adding the bundle jars with `--jars` may not work. If this is the case, you should add the jars to the Spark CLASSPATH directly.
+:::
+
+### Accessing a fileset using the Hadoop fs command
+
+The following are examples of how to use the `hadoop fs` command to access the fileset in Hadoop 3.1.3.
+
+1. Add the following contents to the `${HADOOP_HOME}/etc/hadoop/core-site.xml` file:
+
+```xml
+ <property>
+ <name>fs.AbstractFileSystem.gvfs.impl</name>
+ <value>org.apache.gravitino.filesystem.hadoop.Gvfs</value>
+ </property>
+
+  <property>
+    <name>fs.gvfs.impl</name>
+    <value>org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem</value>
+  </property>
+
+ <property>
+ <name>fs.gravitino.server.uri</name>
+ <value>http://localhost:8090</value>
+ </property>
+
+ <property>
+ <name>fs.gravitino.client.metalake</name>
+ <value>test</value>
+ </property>
+
+ <property>
+ <name>azure-storage-account-name</name>
+ <value>account_name</value>
+ </property>
+ <property>
+ <name>azure-storage-account-key</name>
+ <value>account_key</value>
+ </property>
+```
+
+2. Add the necessary jars to the Hadoop classpath.
+
+For ADLS, you need to add `gravitino-filesystem-hadoop3-runtime-${gravitino-version}.jar`, `gravitino-azure-${gravitino-version}.jar` and `hadoop-azure-${hadoop-version}.jar` located at `${HADOOP_HOME}/share/hadoop/tools/lib/` to the Hadoop classpath.
+
+3. Run the following command to access the fileset:
+
+```shell
+${HADOOP_HOME}/bin/hadoop fs -ls gvfs://fileset/adls_catalog/adls_schema/adls_fileset
+${HADOOP_HOME}/bin/hadoop fs -put /path/to/local/file gvfs://fileset/adls_catalog/adls_schema/adls_fileset
+```
+
+### Using the GVFS Python client to access a fileset
+
+To access a fileset with Azure Blob Storage (ADLS) using the GVFS Python client, in addition to the [basic GVFS configurations](./how-to-use-gvfs.md#configuration-1), you need to add the following configurations:
+
+| Configuration item           | Description                            | Default value | Required | Since version    |
+|------------------------------|----------------------------------------|---------------|----------|------------------|
+| `azure_storage_account_name` | The account name of Azure Blob Storage | (none)        | Yes      | 0.8.0-incubating |
+| `azure_storage_account_key`  | The account key of Azure Blob Storage  | (none)        | Yes      | 0.8.0-incubating |
+
+:::note
+If the catalog has enabled [credential vending](security/credential-vending.md), the properties above can be omitted.
+:::
+
+Please install the `gravitino` package before running the following code:
+
+```bash
+pip install apache-gravitino==${GRAVITINO_VERSION}
+```
+
+```python
+from gravitino import gvfs
+options = {
+ "cache_size": 20,
+ "cache_expired_time": 3600,
+ "auth_type": "simple",
+ "azure_storage_account_name": "azure_account_name",
+ "azure_storage_account_key": "azure_account_key"
+}
+fs = gvfs.GravitinoVirtualFileSystem(server_uri="http://localhost:8090", metalake_name="test_metalake", options=options)
+fs.ls("gvfs://fileset/{adls_catalog}/{adls_schema}/{adls_fileset}/")
+```
+
+
+### Using fileset with pandas
+
+The following are examples of how to use the pandas library to access the ADLS fileset:
+
+```python
+import pandas as pd
+
+storage_options = {
+ "server_uri": "http://localhost:8090",
+ "metalake_name": "test",
+ "options": {
+ "azure_storage_account_name": "azure_account_name",
+ "azure_storage_account_key": "azure_account_key"
+ }
+}
+ds = pd.read_csv(f"gvfs://fileset/{catalog_name}/{schema_name}/{fileset_name}/people/part-00000-51d366e2-d5eb-448d-9109-32a96c8a14dc-c000.csv",
+                 storage_options=storage_options)
+ds.head()
+```
+
+For other use cases, please refer to the [Gravitino Virtual File System](./how-to-use-gvfs.md) document.
+
+## Fileset with credential vending
+
+Since 0.8.0-incubating, Gravitino supports credential vending for ADLS filesets. If the catalog has been [configured with credentials](./security/credential-vending.md), you can access ADLS filesets without providing authentication information like `azure-storage-account-name` and `azure-storage-account-key` in the properties.
+
+### How to create an ADLS Hadoop catalog with credential enabled
+
+Apart from the configuration method in [create-adls-hadoop-catalog](#configuration-for-a-adls-hadoop-catalog), the properties required by [adls-credential](./security/credential-vending.md#adls-credentials) should also be set to enable credential vending for ADLS filesets.
+
+### How to access ADLS fileset with credential
+
+If the catalog has been configured with credentials, you can access ADLS filesets without providing authentication information via the GVFS Java/Python client and Spark. Let's see how to access an ADLS fileset with credentials:
+
+GVFS Java client:
+
+```java
+Configuration conf = new Configuration();
+conf.set("fs.AbstractFileSystem.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.Gvfs");
+conf.set("fs.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem");
+conf.set("fs.gravitino.server.uri", "http://localhost:8090");
+conf.set("fs.gravitino.client.metalake", "test_metalake");
+// No need to set azure-storage-account-name and azure-storage-account-key
+Path filesetPath = new Path("gvfs://fileset/adls_test_catalog/test_schema/test_fileset/new_dir");
+FileSystem fs = filesetPath.getFileSystem(conf);
+fs.mkdirs(filesetPath);
+...
+```
+
+Spark:
+
+```python
+spark = (
+    SparkSession.builder
+    .appName("adls_fileset_test")
+    .config("spark.hadoop.fs.AbstractFileSystem.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.Gvfs")
+    .config("spark.hadoop.fs.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem")
+    .config("spark.hadoop.fs.gravitino.server.uri", "http://localhost:8090")
+    .config("spark.hadoop.fs.gravitino.client.metalake", "test")
+    # No need to set azure-storage-account-name and azure-storage-account-key
+    .config("spark.driver.memory", "2g")
+    .config("spark.driver.port", "2048")
+    .getOrCreate()
+)
+```
+
+Python client and Hadoop command are similar to the above examples.
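For the GVFS Python client, a minimal sketch (assuming the same `gvfs` API shown earlier) is that the Azure account options are simply omitted when credential vending is enabled:

```python
# Sketch: GVFS Python client options when the catalog vends credentials.
# The azure_storage_account_name / azure_storage_account_key entries from
# the earlier example are intentionally omitted.
options = {
    "cache_size": 20,
    "cache_expired_time": 3600,
    "auth_type": "simple",
}

# With a running Gravitino server, the filesystem would be created as before:
# from gravitino import gvfs
# fs = gvfs.GravitinoVirtualFileSystem(server_uri="http://localhost:8090",
#                                      metalake_name="test_metalake", options=options)
print(sorted(options))
```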
+
diff --git a/docs/hadoop-catalog-with-gcs.md b/docs/hadoop-catalog-with-gcs.md
new file mode 100644
index 0000000000..a3eb034b4f
--- /dev/null
+++ b/docs/hadoop-catalog-with-gcs.md
@@ -0,0 +1,500 @@
+---
+title: "Hadoop catalog with GCS"
+slug: /hadoop-catalog-with-gcs
+date: 2025-01-03
+keyword: Hadoop catalog GCS
+license: "This software is licensed under the Apache License version 2."
+---
+
+This document describes how to configure a Hadoop catalog with GCS.
+
+## Prerequisites
+
+To set up a Hadoop catalog with GCS, follow these steps:
+
+1. Download the
[`gravitino-gcp-bundle-${gravitino-version}.jar`](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-gcp-bundle)
file.
+2. Place the downloaded file into the Gravitino Hadoop catalog classpath at
`${GRAVITINO_HOME}/catalogs/hadoop/libs/`.
+3. Start the Gravitino server by running the following command:
+
+```bash
+$ ${GRAVITINO_HOME}/bin/gravitino-server.sh start
+```
+
+Once the server is up and running, you can proceed to configure the Hadoop catalog with GCS. The rest of this document uses `http://localhost:8090` as the Gravitino server URL; please replace it with your actual server URL.
+
+## Configurations for creating a Hadoop catalog with GCS
+
+### Configurations for a GCS Hadoop catalog
+
+Apart from the configurations mentioned in [Hadoop catalog configuration](./hadoop-catalog.md#catalog-properties), the following properties are required to configure a Hadoop catalog with GCS:
+
+| Configuration item | Description
[...]
+|-------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[...]
+| `filesystem-providers`        | The file system providers to add. Set it to `gcs` if it's a GCS fileset, or a comma-separated string that contains `gcs`, like `gcs,s3`, to support multiple kinds of filesets including `gcs`.
[...]
+| `default-filesystem-provider` | The default filesystem provider of this Hadoop catalog, used when users do not specify a scheme in the URI. The default value is `builtin-local`; for GCS, setting this to `gcs` lets you omit the `gs://` prefix in locations.
[...]
+| `gcs-service-account-file`    | The path of the GCS service account JSON file.
[...]
+| `credential-providers`        | The credential provider types, separated by commas; the possible value is `gcs-token`. The default authentication type uses the service account described above; setting this property enables credential vending from the Gravitino server, so clients no longer need to provide authentication information like the service account to access GCS via GVFS. Once it's set, more configuration items are needed to make it work, please see [gcs-credential-vending](se [...]
+
+
+### Configurations for a schema
+
+Refer to [Schema configurations](./hadoop-catalog.md#schema-properties) for
more details.
+
+### Configurations for a fileset
+
+Refer to [Fileset configurations](./hadoop-catalog.md#fileset-properties) for
more details.
+
+## Example of creating a Hadoop catalog with GCS
+
+This section will show you how to use the Hadoop catalog with GCS in Gravitino, including detailed examples.
+
+### Step 1: Create a Hadoop catalog with GCS
+
+First, you need to create a Hadoop catalog with GCS. The following example
shows how to create a Hadoop catalog with GCS:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+ "name": "test_catalog",
+ "type": "FILESET",
+ "comment": "This is a GCS fileset catalog",
+ "provider": "hadoop",
+ "properties": {
+ "location": "gs://bucket/root",
+ "gcs-service-account-file": "path_of_gcs_service_account_file",
+ "filesystem-providers": "gcs"
+ }
+}' http://localhost:8090/api/metalakes/metalake/catalogs
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+GravitinoClient gravitinoClient = GravitinoClient
+ .builder("http://localhost:8090")
+ .withMetalake("metalake")
+ .build();
+
+Map<String, String> gcsProperties = ImmutableMap.<String, String>builder()
+ .put("location", "gs://bucket/root")
+ .put("gcs-service-account-file", "path_of_gcs_service_account_file")
+ .put("filesystem-providers", "gcs")
+ .build();
+
+Catalog gcsCatalog = gravitinoClient.createCatalog("test_catalog",
+ Type.FILESET,
+ "hadoop", // provider, Gravitino only supports "hadoop" for now.
+ "This is a GCS fileset catalog",
+ gcsProperties);
+// ...
+
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="metalake")
+gcs_properties = {
+    "location": "gs://bucket/root",
+    "gcs-service-account-file": "path_of_gcs_service_account_file",
+    "filesystem-providers": "gcs"
+}
+
+gcs_catalog = gravitino_client.create_catalog(name="test_catalog",
+                                              type=Catalog.Type.FILESET,
+                                              provider="hadoop",
+                                              comment="This is a GCS fileset catalog",
+                                              properties=gcs_properties)
+```
+
+</TabItem>
+</Tabs>
+
+### Step 2: Create a schema
+
+Once you have created a Hadoop catalog with GCS, you can create a schema. The
following example shows how to create a schema:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+ "name": "test_schema",
+ "comment": "This is a GCS schema",
+ "properties": {
+ "location": "gs://bucket/root/schema"
+ }
+}' http://localhost:8090/api/metalakes/metalake/catalogs/test_catalog/schemas
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+Catalog catalog = gravitinoClient.loadCatalog("test_catalog");
+
+SupportsSchemas supportsSchemas = catalog.asSchemas();
+
+Map<String, String> schemaProperties = ImmutableMap.<String, String>builder()
+ .put("location", "gs://bucket/root/schema")
+ .build();
+Schema schema = supportsSchemas.createSchema("test_schema",
+ "This is a GCS schema",
+ schemaProperties
+);
+// ...
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="metalake")
+catalog: Catalog = gravitino_client.load_catalog(name="test_catalog")
+catalog.as_schemas().create_schema(name="test_schema",
+                                   comment="This is a GCS schema",
+                                   properties={"location": "gs://bucket/root/schema"})
+```
+
+</TabItem>
+</Tabs>
+
+
+### Step3: Create a fileset
+
+After creating a schema, you can create a fileset. The following example shows
how to create a fileset:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+ "name": "example_fileset",
+ "comment": "This is an example fileset",
+ "type": "MANAGED",
+ "storageLocation": "gs://bucket/root/schema/example_fileset",
+ "properties": {
+ "k1": "v1"
+ }
+}'
http://localhost:8090/api/metalakes/metalake/catalogs/test_catalog/schemas/test_schema/filesets
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+GravitinoClient gravitinoClient = GravitinoClient
+ .builder("http://localhost:8090")
+ .withMetalake("metalake")
+ .build();
+
+Catalog catalog = gravitinoClient.loadCatalog("test_catalog");
+FilesetCatalog filesetCatalog = catalog.asFilesetCatalog();
+
+Map<String, String> propertiesMap = ImmutableMap.<String, String>builder()
+ .put("k1", "v1")
+ .build();
+
+filesetCatalog.createFileset(
+ NameIdentifier.of("test_schema", "example_fileset"),
+ "This is an example fileset",
+ Fileset.Type.MANAGED,
+ "gs://bucket/root/schema/example_fileset",
+ propertiesMap);
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient =
GravitinoClient(uri="http://localhost:8090", metalake_name="metalake")
+
+catalog: Catalog = gravitino_client.load_catalog(name="test_catalog")
+catalog.as_fileset_catalog().create_fileset(ident=NameIdentifier.of("test_schema",
"example_fileset"),
+ type=Fileset.Type.MANAGED,
+ comment="This is an example
fileset",
+
storage_location="gs://bucket/root/schema/example_fileset",
+ properties={"k1": "v1"})
+```
+
+</TabItem>
+</Tabs>
+
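A fileset created this way is addressed through a virtual path of the form `gvfs://fileset/{catalog}/{schema}/{fileset}/{sub_path}`. The small sketch below builds such paths; the `gvfs_path` helper is ours for illustration, not part of the Gravitino API:

```python
def gvfs_path(catalog: str, schema: str, fileset: str, sub_path: str = "") -> str:
    """Build a Gravitino virtual file system (GVFS) path for a fileset."""
    base = f"gvfs://fileset/{catalog}/{schema}/{fileset}"
    return f"{base}/{sub_path.lstrip('/')}" if sub_path else base

print(gvfs_path("test_catalog", "test_schema", "example_fileset", "people"))
# gvfs://fileset/test_catalog/test_schema/example_fileset/people
```

The same virtual path works unchanged whatever storage backs the fileset, since the actual storage location is resolved by the Gravitino server.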
+## Accessing a fileset with GCS
+
+### Using the GVFS Java client to access the fileset
+
+To access a fileset backed by GCS using the GVFS Java client, add the following configurations on top of the [basic GVFS configurations](./how-to-use-gvfs.md#configuration-1):
+
+| Configuration item         | Description                                | Default value | Required | Since version    |
+|----------------------------|--------------------------------------------|---------------|----------|------------------|
+| `gcs-service-account-file` | The path of GCS service account JSON file. | (none)        | Yes      | 0.7.0-incubating |
+
+:::note
+If the catalog has enabled [credential
vending](security/credential-vending.md), the properties above can be omitted.
More details can be found in [Fileset with credential
vending](#fileset-with-credential-vending).
+:::
+
+```java
+Configuration conf = new Configuration();
+conf.set("fs.AbstractFileSystem.gvfs.impl",
"org.apache.gravitino.filesystem.hadoop.Gvfs");
+conf.set("fs.gvfs.impl",
"org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem");
+conf.set("fs.gravitino.server.uri", "http://localhost:8090");
+conf.set("fs.gravitino.client.metalake", "test_metalake");
+conf.set("gcs-service-account-file", "/path/your-service-account-file.json");
+Path filesetPath = new
Path("gvfs://fileset/test_catalog/test_schema/test_fileset/new_dir");
+FileSystem fs = filesetPath.getFileSystem(conf);
+fs.mkdirs(filesetPath);
+...
+```
+
+As with the Spark configurations below, you need to add the GCS (bundle) jars to the classpath according to your environment.
+If you want to use a custom Hadoop version, or your project already includes a Hadoop version, you can add the following dependencies to your `pom.xml`:
+
+```xml
+ <dependency>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-common</artifactId>
+ <version>${HADOOP_VERSION}</version>
+ </dependency>
+ <dependency>
+ <groupId>com.google.cloud.bigdataoss</groupId>
+ <artifactId>gcs-connector</artifactId>
+ <version>${GCS_CONNECTOR_VERSION}</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>filesystem-hadoop3-runtime</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>gravitino-gcp</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+```
+
+Or, if your project has no Hadoop environment, use the bundle jar, which includes the Hadoop environment:
+
+```xml
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>gravitino-gcp-bundle</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>filesystem-hadoop3-runtime</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+```
+
+### Using Spark to access the fileset
+
+The following code snippet shows how to use **PySpark 3.1.3 with a Hadoop environment (Hadoop 3.2.0)** to access the fileset:
+
+Before running the following code, you need to install required packages:
+
+```bash
+pip install pyspark==3.1.3
+pip install apache-gravitino==${GRAVITINO_VERSION}
+```
+Then you can run the following code:
+
+```python
+from pyspark.sql import SparkSession
+import os
+
+gravitino_url = "http://localhost:8090"
+metalake_name = "test"
+
+catalog_name = "your_gcs_catalog"
+schema_name = "your_gcs_schema"
+fileset_name = "your_gcs_fileset"
+
+os.environ["PYSPARK_SUBMIT_ARGS"] = "--jars
/path/to/gravitino-gcp-{gravitino-version}.jar,/path/to/gravitino-filesystem-hadoop3-runtime-{gravitino-version}.jar,/path/to/gcs-connector-hadoop3-2.2.22-shaded.jar
--master local[1] pyspark-shell"
+spark = SparkSession.builder \
+    .appName("gcs_fileset_test") \
+    .config("spark.hadoop.fs.AbstractFileSystem.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.Gvfs") \
+    .config("spark.hadoop.fs.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem") \
+    .config("spark.hadoop.fs.gravitino.server.uri", gravitino_url) \
+    .config("spark.hadoop.fs.gravitino.client.metalake", metalake_name) \
+    .config("spark.hadoop.gcs-service-account-file", "/path/to/gcs-service-account-file.json") \
+    .config("spark.driver.memory", "2g") \
+    .config("spark.driver.port", "2048") \
+    .getOrCreate()
+
+data = [("Alice", 25), ("Bob", 30), ("Cathy", 45)]
+columns = ["Name", "Age"]
+spark_df = spark.createDataFrame(data, schema=columns)
+gvfs_path =
f"gvfs://fileset/{catalog_name}/{schema_name}/{fileset_name}/people"
+
+spark_df.coalesce(1).write \
+    .mode("overwrite") \
+    .option("header", "true") \
+    .csv(gvfs_path)
+```
+
+If your Spark is running **without a Hadoop environment**, you can use the following code snippet to access the fileset:
+
+```python
+# Use the snippet above, replacing only the PYSPARK_SUBMIT_ARGS line with the following
+
+os.environ["PYSPARK_SUBMIT_ARGS"] = "--jars
/path/to/gravitino-gcp-bundle-{gravitino-version}.jar,/path/to/gravitino-filesystem-hadoop3-runtime-{gravitino-version}.jar,
--master local[1] pyspark-shell"
+```
+
+- [`gravitino-gcp-bundle-${gravitino-version}.jar`](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-gcp-bundle) is the Gravitino GCP jar with the Hadoop environment (3.3.1) and `gcs-connector`.
+- [`gravitino-gcp-${gravitino-version}.jar`](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-gcp) is a condensed version of the Gravitino GCP bundle jar without the Hadoop environment and [`gcs-connector`](https://github.com/GoogleCloudDataproc/hadoop-connectors/releases/download/v2.2.22/gcs-connector-hadoop3-2.2.22-shaded.jar).
+
+Please choose the correct jar according to your environment.
+
+:::note
+In some Spark versions, the driver needs a Hadoop environment, so adding the bundle jars with `--jars` may not work. If this is the case, add the jars to the Spark CLASSPATH directly.
+:::
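If you script this setup, the `--jars` value can be assembled from whichever jar list matches your environment. A minimal sketch; the `pyspark_submit_args` helper and the jar paths are illustrative, not part of any Gravitino API:

```python
def pyspark_submit_args(jar_paths, master="local[1]"):
    """Assemble a PYSPARK_SUBMIT_ARGS value like the ones used above."""
    return f"--jars {','.join(jar_paths)} --master {master} pyspark-shell"

# With a Hadoop environment: gcp jar + runtime jar + gcs-connector
print(pyspark_submit_args([
    "/path/to/gravitino-gcp-{gravitino-version}.jar",
    "/path/to/gravitino-filesystem-hadoop3-runtime-{gravitino-version}.jar",
    "/path/to/gcs-connector-hadoop3-2.2.22-shaded.jar",
]))
```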
+
+### Accessing a fileset using the Hadoop fs command
+
+The following are examples of how to use the `hadoop fs` command to access the
fileset in Hadoop 3.1.3.
+
+1. Add the following content to the `${HADOOP_HOME}/etc/hadoop/core-site.xml` file:
+
+```xml
+ <property>
+ <name>fs.AbstractFileSystem.gvfs.impl</name>
+ <value>org.apache.gravitino.filesystem.hadoop.Gvfs</value>
+ </property>
+
+ <property>
+ <name>fs.gvfs.impl</name>
+
<value>org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem</value>
+ </property>
+
+ <property>
+ <name>fs.gravitino.server.uri</name>
+ <value>http://localhost:8090</value>
+ </property>
+
+ <property>
+ <name>fs.gravitino.client.metalake</name>
+ <value>test</value>
+ </property>
+
+ <property>
+ <name>gcs-service-account-file</name>
+ <value>/path/your-service-account-file.json</value>
+ </property>
+```
+
+2. Add the necessary jars to the Hadoop classpath.
+
+For GCS, you need to add
`gravitino-filesystem-hadoop3-runtime-${gravitino-version}.jar`,
`gravitino-gcp-${gravitino-version}.jar` and
[`gcs-connector-hadoop3-2.2.22-shaded.jar`](https://github.com/GoogleCloudDataproc/hadoop-connectors/releases/download/v2.2.22/gcs-connector-hadoop3-2.2.22-shaded.jar)
to Hadoop classpath.
+
+3. Run the following command to access the fileset:
+
+```shell
+${HADOOP_HOME}/bin/hadoop fs -ls gvfs://fileset/gcs_catalog/gcs_schema/gcs_example
+${HADOOP_HOME}/bin/hadoop fs -put /path/to/local/file gvfs://fileset/gcs_catalog/gcs_schema/gcs_example
+```
+
+### Using the GVFS Python client to access a fileset
+
+To access a fileset backed by GCS using the GVFS Python client, add the following configurations on top of the [basic GVFS configurations](./how-to-use-gvfs.md#configuration-1):
+
+| Configuration item         | Description                                | Default value | Required | Since version    |
+|----------------------------|--------------------------------------------|---------------|----------|------------------|
+| `gcs_service_account_file` | The path of GCS service account JSON file. | (none)        | Yes      | 0.7.0-incubating |
+
+:::note
+If the catalog has enabled [credential
vending](security/credential-vending.md), the properties above can be omitted.
+:::
+
+Please install the `gravitino` package before running the following code:
+
+```bash
+pip install apache-gravitino==${GRAVITINO_VERSION}
+```
+
+```python
+from gravitino import gvfs
+options = {
+ "cache_size": 20,
+ "cache_expired_time": 3600,
+ "auth_type": "simple",
+ "gcs_service_account_file": "path_of_gcs_service_account_file.json",
+}
+fs = gvfs.GravitinoVirtualFileSystem(server_uri="http://localhost:8090",
metalake_name="test_metalake", options=options)
+fs.ls("gvfs://fileset/{catalog_name}/{schema_name}/{fileset_name}/")
+```
+
+### Using fileset with pandas
+
+The following are examples of how to use the pandas library to access the GCS fileset:
+
+```python
+import pandas as pd
+
+storage_options = {
+ "server_uri": "http://localhost:8090",
+ "metalake_name": "test",
+ "options": {
+ "gcs_service_account_file": "path_of_gcs_service_account_file.json",
+ }
+}
+ds = pd.read_csv(f"gvfs://fileset/{catalog_name}/{schema_name}/{fileset_name}/people/part-00000-51d366e2-d5eb-448d-9109-32a96c8a14dc-c000.csv",
+                 storage_options=storage_options)
+ds.head()
+```
+
+For other use cases, please refer to the [Gravitino Virtual File
System](./how-to-use-gvfs.md) document.
+
+## Fileset with credential vending
+
+Since 0.8.0-incubating, Gravitino supports credential vending for GCS fileset.
If the catalog has been [configured with
credential](./security/credential-vending.md), you can access GCS fileset
without providing authentication information like `gcs-service-account-file` in
the properties.
+
+### How to create a GCS Hadoop catalog with credential enabled
+
+In addition to the configuration method in [create-gcs-hadoop-catalog](#configurations-for-a-gcs-hadoop-catalog), the properties required by [gcs-credential](./security/credential-vending.md#gcs-credentials) should also be set to enable credential vending for GCS filesets.
+
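For example, the catalog-creation request shown earlier would additionally carry a `credential-providers` property. A sketch of the properties payload; we assume the `gcs-token` provider name from the credential-vending document:

```json
{
  "location": "gs://bucket/root",
  "gcs-service-account-file": "path_of_gcs_service_account_file",
  "filesystem-providers": "gcs",
  "credential-providers": "gcs-token"
}
```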
+### How to access GCS fileset with credential
+
+If the catalog has been configured with credential, you can access GCS fileset
without providing authentication information via GVFS Java/Python client and
Spark. Let's see how to access GCS fileset with credential:
+
+GVFS Java client:
+
+```java
+Configuration conf = new Configuration();
+conf.set("fs.AbstractFileSystem.gvfs.impl",
"org.apache.gravitino.filesystem.hadoop.Gvfs");
+conf.set("fs.gvfs.impl",
"org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem");
+conf.set("fs.gravitino.server.uri", "http://localhost:8090");
+conf.set("fs.gravitino.client.metalake", "test_metalake");
+// No need to set gcs-service-account-file
+Path filesetPath = new
Path("gvfs://fileset/gcs_test_catalog/test_schema/test_fileset/new_dir");
+FileSystem fs = filesetPath.getFileSystem(conf);
+fs.mkdirs(filesetPath);
+...
+```
+
+Spark:
+
+```python
+# No need to set gcs-service-account-file
+spark = SparkSession.builder \
+    .appName("gcs_fileset_test") \
+    .config("spark.hadoop.fs.AbstractFileSystem.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.Gvfs") \
+    .config("spark.hadoop.fs.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem") \
+    .config("spark.hadoop.fs.gravitino.server.uri", "http://localhost:8090") \
+    .config("spark.hadoop.fs.gravitino.client.metalake", "test") \
+    .config("spark.driver.memory", "2g") \
+    .config("spark.driver.port", "2048") \
+    .getOrCreate()
+```
+
+The Python client and the Hadoop command are similar to the above examples.
diff --git a/docs/hadoop-catalog-with-oss.md b/docs/hadoop-catalog-with-oss.md
new file mode 100644
index 0000000000..e63935c720
--- /dev/null
+++ b/docs/hadoop-catalog-with-oss.md
@@ -0,0 +1,538 @@
+---
+title: "Hadoop catalog with OSS"
+slug: /hadoop-catalog-with-oss
+date: 2025-01-03
+keyword: Hadoop catalog OSS
+license: "This software is licensed under the Apache License version 2."
+---
+
+This document explains how to configure a Hadoop catalog with Aliyun OSS
(Object Storage Service) in Gravitino.
+
+## Prerequisites
+
+To set up a Hadoop catalog with OSS, follow these steps:
+
+1. Download the
[`gravitino-aliyun-bundle-${gravitino-version}.jar`](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-aliyun-bundle)
file.
+2. Place the downloaded file into the Gravitino Hadoop catalog classpath at
`${GRAVITINO_HOME}/catalogs/hadoop/libs/`.
+3. Start the Gravitino server by running the following command:
+
+```bash
+$ ${GRAVITINO_HOME}/bin/gravitino-server.sh start
+```
+
+Once the server is up and running, you can proceed to configure the Hadoop catalog with OSS. The rest of this document uses `http://localhost:8090` as the Gravitino server URL; replace it with your actual server URL.
+
+## Configurations for creating a Hadoop catalog with OSS
+
+### Configuration for an OSS Hadoop catalog
+
+In addition to the basic configurations mentioned in
[Hadoop-catalog-catalog-configuration](./hadoop-catalog.md#catalog-properties),
the following properties are required to configure a Hadoop catalog with OSS:
+
+| Configuration item | Description
[...]
+|--------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[...]
+| `filesystem-providers` | The file system providers to add. Set it to
`oss` if it's a OSS fileset, or a comma separated string that contains `oss`
like `oss,gs,s3` to support multiple kinds of fileset including `oss`.
[...]
+| `default-filesystem-provider` | The name default filesystem providers of
this Hadoop catalog if users do not specify the scheme in the URI. Default
value is `builtin-local`, for OSS, if we set this value, we can omit the prefix
'oss://' in the location.
[...]
+| `oss-endpoint` | The endpoint of the Aliyun OSS.
[...]
+| `oss-access-key-id` | The access key of the Aliyun OSS.
[...]
+| `oss-secret-access-key` | The secret key of the Aliyun OSS.
[...]
+| `credential-providers` | The credential provider types, separated by
comma, possible value can be `oss-token`, `oss-secret-key`. As the default
authentication type is using AKSK as the above, this configuration can enable
credential vending provided by Gravitino server and client will no longer need
to provide authentication information like AKSK to access OSS by GVFS. Once
it's set, more configuration items are needed to make it works, please see
[oss-credential-vending](secur [...]
+
+
+### Configurations for a schema
+
+To create a schema, refer to [Schema
configurations](./hadoop-catalog.md#schema-properties).
+
+### Configurations for a fileset
+
+For instructions on how to create a fileset, refer to [Fileset
configurations](./hadoop-catalog.md#fileset-properties) for more details.
+
+## Example of creating Hadoop catalog/schema/fileset with OSS
+
+This section will show you how to use the Hadoop catalog with OSS in
Gravitino, including detailed examples.
+
+### Step1: Create a Hadoop catalog with OSS
+
+First, you need to create a Hadoop catalog for OSS. The following examples
demonstrate how to create a Hadoop catalog with OSS:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+ "name": "test_catalog",
+ "type": "FILESET",
+ "comment": "This is an OSS fileset catalog",
+ "provider": "hadoop",
+ "properties": {
+ "location": "oss://bucket/root",
+ "oss-access-key-id": "access_key",
+ "oss-secret-access-key": "secret_key",
+ "oss-endpoint": "http://oss-cn-hangzhou.aliyuncs.com",
+ "filesystem-providers": "oss"
+ }
+}' http://localhost:8090/api/metalakes/metalake/catalogs
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+GravitinoClient gravitinoClient = GravitinoClient
+ .builder("http://localhost:8090")
+ .withMetalake("metalake")
+ .build();
+
+Map<String, String> ossProperties = ImmutableMap.<String, String>builder()
+ .put("location", "oss://bucket/root")
+ .put("oss-access-key-id", "access_key")
+ .put("oss-secret-access-key", "secret_key")
+ .put("oss-endpoint", "http://oss-cn-hangzhou.aliyuncs.com")
+ .put("filesystem-providers", "oss")
+ .build();
+
+Catalog ossCatalog = gravitinoClient.createCatalog("test_catalog",
+ Type.FILESET,
+ "hadoop", // provider, Gravitino only supports "hadoop" for now.
+ "This is an OSS fileset catalog",
+ ossProperties);
+// ...
+
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient =
GravitinoClient(uri="http://localhost:8090", metalake_name="metalake")
+oss_properties = {
+    "location": "oss://bucket/root",
+    "oss-access-key-id": "access_key",
+    "oss-secret-access-key": "secret_key",
+    "oss-endpoint": "http://oss-cn-hangzhou.aliyuncs.com",
+    "filesystem-providers": "oss"
+}
+
+oss_catalog = gravitino_client.create_catalog(name="test_catalog",
+                                              type=Catalog.Type.FILESET,
+                                              provider="hadoop",
+                                              comment="This is an OSS fileset catalog",
+                                              properties=oss_properties)
+```
+
+</TabItem>
+</Tabs>
+
+### Step2: Create a schema
+
+Once the Hadoop catalog with OSS is created, you can create a schema inside
that catalog. Below are examples of how to do this:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+ "name": "test_schema",
+ "comment": "This is an OSS schema",
+ "properties": {
+ "location": "oss://bucket/root/schema"
+ }
+}' http://localhost:8090/api/metalakes/metalake/catalogs/test_catalog/schemas
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+Catalog catalog = gravitinoClient.loadCatalog("test_catalog");
+
+SupportsSchemas supportsSchemas = catalog.asSchemas();
+
+Map<String, String> schemaProperties = ImmutableMap.<String, String>builder()
+ .put("location", "oss://bucket/root/schema")
+ .build();
+Schema schema = supportsSchemas.createSchema("test_schema",
+ "This is an OSS schema",
+ schemaProperties
+);
+// ...
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient =
GravitinoClient(uri="http://localhost:8090", metalake_name="metalake")
+catalog: Catalog = gravitino_client.load_catalog(name="test_catalog")
+catalog.as_schemas().create_schema(name="test_schema",
+ comment="This is an OSS schema",
+ properties={"location":
"oss://bucket/root/schema"})
+```
+
+</TabItem>
+</Tabs>
+
+
+### Step3: Create a fileset
+
+Now that the schema is created, you can create a fileset inside it. Here’s how:
+
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+ "name": "example_fileset",
+ "comment": "This is an example fileset",
+ "type": "MANAGED",
+ "storageLocation": "oss://bucket/root/schema/example_fileset",
+ "properties": {
+ "k1": "v1"
+ }
+}'
http://localhost:8090/api/metalakes/metalake/catalogs/test_catalog/schemas/test_schema/filesets
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+GravitinoClient gravitinoClient = GravitinoClient
+ .builder("http://localhost:8090")
+ .withMetalake("metalake")
+ .build();
+
+Catalog catalog = gravitinoClient.loadCatalog("test_catalog");
+FilesetCatalog filesetCatalog = catalog.asFilesetCatalog();
+
+Map<String, String> propertiesMap = ImmutableMap.<String, String>builder()
+ .put("k1", "v1")
+ .build();
+
+filesetCatalog.createFileset(
+ NameIdentifier.of("test_schema", "example_fileset"),
+ "This is an example fileset",
+ Fileset.Type.MANAGED,
+ "oss://bucket/root/schema/example_fileset",
+ propertiesMap);
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient =
GravitinoClient(uri="http://localhost:8090", metalake_name="metalake")
+
+catalog: Catalog = gravitino_client.load_catalog(name="test_catalog")
+catalog.as_fileset_catalog().create_fileset(ident=NameIdentifier.of("test_schema",
"example_fileset"),
+ type=Fileset.Type.MANAGED,
+ comment="This is an example
fileset",
+
storage_location="oss://bucket/root/schema/example_fileset",
+ properties={"k1": "v1"})
+```
+
+</TabItem>
+</Tabs>
+
+## Accessing a fileset with OSS
+
+### Using the GVFS Java client to access the fileset
+
+To access a fileset backed by OSS using the GVFS Java client, add the following configurations on top of the [basic GVFS configurations](./how-to-use-gvfs.md#configuration-1):
+
+| Configuration item      | Description                       | Default value | Required | Since version    |
+|-------------------------|-----------------------------------|---------------|----------|------------------|
+| `oss-endpoint`          | The endpoint of the Aliyun OSS.   | (none)        | Yes      | 0.7.0-incubating |
+| `oss-access-key-id`     | The access key of the Aliyun OSS. | (none)        | Yes      | 0.7.0-incubating |
+| `oss-secret-access-key` | The secret key of the Aliyun OSS. | (none)        | Yes      | 0.7.0-incubating |
+
+:::note
+If the catalog has enabled [credential
vending](security/credential-vending.md), the properties above can be omitted.
More details can be found in [Fileset with credential
vending](#fileset-with-credential-vending).
+:::
+
+```java
+Configuration conf = new Configuration();
+conf.set("fs.AbstractFileSystem.gvfs.impl",
"org.apache.gravitino.filesystem.hadoop.Gvfs");
+conf.set("fs.gvfs.impl",
"org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem");
+conf.set("fs.gravitino.server.uri", "http://localhost:8090");
+conf.set("fs.gravitino.client.metalake", "test_metalake");
+conf.set("oss-endpoint", "http://oss-cn-hangzhou.aliyuncs.com");
+conf.set("oss-access-key-id", "access_key");
+conf.set("oss-secret-access-key", "secret_key");
+Path filesetPath = new
Path("gvfs://fileset/test_catalog/test_schema/test_fileset/new_dir");
+FileSystem fs = filesetPath.getFileSystem(conf);
+fs.mkdirs(filesetPath);
+...
+```
+
+As with the Spark configurations below, you need to add the OSS (bundle) jars to the classpath according to your environment.
+If you want to use a custom Hadoop version, or your project already includes a Hadoop version, you can add the following dependencies to your `pom.xml`:
+
+```xml
+ <dependency>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-common</artifactId>
+ <version>${HADOOP_VERSION}</version>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-aliyun</artifactId>
+ <version>${HADOOP_VERSION}</version>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>filesystem-hadoop3-runtime</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>gravitino-aliyun</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+```
+
+Or, if your project has no Hadoop environment, use the bundle jar, which includes the Hadoop environment:
+
+```xml
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>gravitino-aliyun-bundle</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>filesystem-hadoop3-runtime</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+```
+
+### Using Spark to access the fileset
+
+The following code snippet shows how to use **PySpark 3.1.3 with a Hadoop environment (Hadoop 3.2.0)** to access the fileset:
+
+Before running the following code, you need to install required packages:
+
+```bash
+pip install pyspark==3.1.3
+pip install apache-gravitino==${GRAVITINO_VERSION}
+```
+Then you can run the following code:
+
+```python
+from pyspark.sql import SparkSession
+import os
+
+gravitino_url = "http://localhost:8090"
+metalake_name = "test"
+
+catalog_name = "your_oss_catalog"
+schema_name = "your_oss_schema"
+fileset_name = "your_oss_fileset"
+
+os.environ["PYSPARK_SUBMIT_ARGS"] = "--jars
/path/to/gravitino-aliyun-{gravitino-version}.jar,/path/to/gravitino-filesystem-hadoop3-runtime-{gravitino-version}.jar,/path/to/aliyun-sdk-oss-2.8.3.jar,/path/to/hadoop-aliyun-3.2.0.jar,/path/to/jdom-1.1.jar
--master local[1] pyspark-shell"
+spark = SparkSession.builder \
+    .appName("oss_fileset_test") \
+    .config("spark.hadoop.fs.AbstractFileSystem.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.Gvfs") \
+    .config("spark.hadoop.fs.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem") \
+    .config("spark.hadoop.fs.gravitino.server.uri", gravitino_url) \
+    .config("spark.hadoop.fs.gravitino.client.metalake", metalake_name) \
+    .config("spark.hadoop.oss-access-key-id", os.environ["OSS_ACCESS_KEY_ID"]) \
+    .config("spark.hadoop.oss-secret-access-key", os.environ["OSS_SECRET_ACCESS_KEY"]) \
+    .config("spark.hadoop.oss-endpoint", "http://oss-cn-hangzhou.aliyuncs.com") \
+    .config("spark.driver.memory", "2g") \
+    .config("spark.driver.port", "2048") \
+    .getOrCreate()
+
+data = [("Alice", 25), ("Bob", 30), ("Cathy", 45)]
+columns = ["Name", "Age"]
+spark_df = spark.createDataFrame(data, schema=columns)
+gvfs_path =
f"gvfs://fileset/{catalog_name}/{schema_name}/{fileset_name}/people"
+
+spark_df.coalesce(1).write \
+    .mode("overwrite") \
+    .option("header", "true") \
+    .csv(gvfs_path)
+```
+
+If your Spark is running **without a Hadoop environment**, you can use the following code snippet to access the fileset:
+
+```python
+# Use the snippet above, replacing only the PYSPARK_SUBMIT_ARGS line with the following
+
+os.environ["PYSPARK_SUBMIT_ARGS"] = "--jars
/path/to/gravitino-aliyun-bundle-{gravitino-version}.jar,/path/to/gravitino-filesystem-hadoop3-runtime-{gravitino-version}.jar,
--master local[1] pyspark-shell"
+```
+
+- [`gravitino-aliyun-bundle-${gravitino-version}.jar`](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-aliyun-bundle) is the Gravitino Aliyun jar with the Hadoop environment (3.3.1) and the `hadoop-aliyun` jar.
+- [`gravitino-aliyun-${gravitino-version}.jar`](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-aliyun) is a condensed version of the Gravitino Aliyun bundle jar without the Hadoop environment and the `hadoop-aliyun` jar.
+- `hadoop-aliyun-3.2.0.jar` and `aliyun-sdk-oss-2.8.3.jar` can be found in the Hadoop distribution in the `${HADOOP_HOME}/share/hadoop/tools/lib` directory.
+
+Please choose the correct jar according to your environment.
+
+:::note
+In some Spark versions, the driver needs a Hadoop environment, so adding the bundle jars with `--jars` may not work. If this is the case, add the jars to the Spark CLASSPATH directly.
+:::
+
+### Accessing a fileset using the Hadoop fs command
+
+The following are examples of how to use the `hadoop fs` command to access the
fileset in Hadoop 3.1.3.
+
+1. Add the following content to the `${HADOOP_HOME}/etc/hadoop/core-site.xml` file:
+
+```xml
+ <property>
+ <name>fs.AbstractFileSystem.gvfs.impl</name>
+ <value>org.apache.gravitino.filesystem.hadoop.Gvfs</value>
+ </property>
+
+ <property>
+ <name>fs.gvfs.impl</name>
+
<value>org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem</value>
+ </property>
+
+ <property>
+ <name>fs.gravitino.server.uri</name>
+ <value>http://localhost:8090</value>
+ </property>
+
+ <property>
+ <name>fs.gravitino.client.metalake</name>
+ <value>test</value>
+ </property>
+
+ <property>
+ <name>oss-endpoint</name>
+ <value>http://oss-cn-hangzhou.aliyuncs.com</value>
+ </property>
+
+ <property>
+ <name>oss-access-key-id</name>
+ <value>access-key</value>
+ </property>
+
+ <property>
+ <name>oss-secret-access-key</name>
+ <value>secret-key</value>
+ </property>
+```
+
+2. Add the necessary jars to the Hadoop classpath.
+
+For OSS, you need to add
`gravitino-filesystem-hadoop3-runtime-${gravitino-version}.jar`,
`gravitino-aliyun-${gravitino-version}.jar` and
`hadoop-aliyun-${hadoop-version}.jar` located at
`${HADOOP_HOME}/share/hadoop/tools/lib/` to Hadoop classpath.
+
+3. Run the following command to access the fileset:
+
+```shell
+${HADOOP_HOME}/bin/hadoop fs -ls gvfs://fileset/oss_catalog/oss_schema/oss_fileset
+${HADOOP_HOME}/bin/hadoop fs -put /path/to/local/file gvfs://fileset/oss_catalog/oss_schema/oss_fileset
+```
+
+### Using the GVFS Python client to access a fileset
+
+To access a fileset backed by OSS using the GVFS Python client, add the following configurations on top of the [basic GVFS configurations](./how-to-use-gvfs.md#configuration-1):
+
+| Configuration item      | Description                       | Default value | Required | Since version    |
+|-------------------------|-----------------------------------|---------------|----------|------------------|
+| `oss_endpoint`          | The endpoint of the Aliyun OSS.   | (none)        | Yes      | 0.7.0-incubating |
+| `oss_access_key_id`     | The access key of the Aliyun OSS. | (none)        | Yes      | 0.7.0-incubating |
+| `oss_secret_access_key` | The secret key of the Aliyun OSS. | (none)        | Yes      | 0.7.0-incubating |
+
+:::note
+If the catalog has enabled [credential
vending](security/credential-vending.md), the properties above can be omitted.
+:::
+
+Please install the `gravitino` package before running the following code:
+
+```bash
+pip install apache-gravitino==${GRAVITINO_VERSION}
+```
+
+```python
+from gravitino import gvfs
+options = {
+ "cache_size": 20,
+ "cache_expired_time": 3600,
+ "auth_type": "simple",
+ "oss_endpoint": "http://oss-cn-hangzhou.aliyuncs.com",
+ "oss_access_key_id": "access_key",
+ "oss_secret_access_key": "secret_key"
+}
+fs = gvfs.GravitinoVirtualFileSystem(server_uri="http://localhost:8090",
metalake_name="test_metalake", options=options)
+
+fs.ls("gvfs://fileset/{catalog_name}/{schema_name}/{fileset_name}/")
+```
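Because a missing option usually only surfaces on first file access, you can fail fast by checking the option keys up front. A small sketch; `check_oss_options` is our helper, not part of the `gravitino` package:

```python
REQUIRED_OSS_OPTIONS = ("oss_endpoint", "oss_access_key_id", "oss_secret_access_key")

def check_oss_options(options):
    """Return the required OSS option keys missing from `options`."""
    return [key for key in REQUIRED_OSS_OPTIONS if not options.get(key)]

print(check_oss_options({"oss_endpoint": "http://oss-cn-hangzhou.aliyuncs.com"}))
# ['oss_access_key_id', 'oss_secret_access_key']
```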
+
+
+### Using fileset with pandas
+
+The following is an example of how to use the pandas library to access the OSS fileset:
+
+```python
+import pandas as pd
+
+storage_options = {
+ "server_uri": "http://localhost:8090",
+ "metalake_name": "test",
+ "options": {
+ "oss_access_key_id": "access_key",
+ "oss_secret_access_key": "secret_key",
+ "oss_endpoint": "http://oss-cn-hangzhou.aliyuncs.com"
+ }
+}
+ds = pd.read_csv(f"gvfs://fileset/{catalog_name}/{schema_name}/{fileset_name}/people/part-00000-51d366e2-d5eb-448d-9109-32a96c8a14dc-c000.csv",
+                 storage_options=storage_options)
+ds.head()
+```
+For other use cases, please refer to the [Gravitino Virtual File
System](./how-to-use-gvfs.md) document.
+
+## Fileset with credential vending
+
+Since 0.8.0-incubating, Gravitino supports credential vending for OSS filesets. If the catalog has been [configured with credentials](./security/credential-vending.md), you can access an OSS fileset without providing authentication information like `oss-access-key-id` and `oss-secret-access-key` in the properties.
+
+### How to create an OSS Hadoop catalog with credential enabled
+
+In addition to the configuration method in [create-oss-hadoop-catalog](#configuration-for-an-oss-hadoop-catalog), the properties required by [oss-credential](./security/credential-vending.md#oss-credentials) must also be set to enable credential vending for the OSS fileset.
+
+### How to access OSS fileset with credential
+
+If the catalog has been configured with credentials, you can access an OSS fileset without providing authentication information via the GVFS Java/Python client and Spark. The following examples show how:
+
+GVFS Java client:
+
+```java
+Configuration conf = new Configuration();
+conf.set("fs.AbstractFileSystem.gvfs.impl",
"org.apache.gravitino.filesystem.hadoop.Gvfs");
+conf.set("fs.gvfs.impl",
"org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem");
+conf.set("fs.gravitino.server.uri", "http://localhost:8090");
+conf.set("fs.gravitino.client.metalake", "test_metalake");
+// No need to set oss-access-key-id and oss-secret-access-key
+Path filesetPath = new
Path("gvfs://fileset/oss_test_catalog/test_schema/test_fileset/new_dir");
+FileSystem fs = filesetPath.getFileSystem(conf);
+fs.mkdirs(filesetPath);
+...
+```
+
+Spark:
+
+```python
+# No need to set oss-access-key-id and oss-secret-access-key
+spark = SparkSession.builder \
+    .appName("oss_fileset_test") \
+    .config("spark.hadoop.fs.AbstractFileSystem.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.Gvfs") \
+    .config("spark.hadoop.fs.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem") \
+    .config("spark.hadoop.fs.gravitino.server.uri", "http://localhost:8090") \
+    .config("spark.hadoop.fs.gravitino.client.metalake", "test") \
+    .config("spark.driver.memory", "2g") \
+    .config("spark.driver.port", "2048") \
+    .getOrCreate()
+```
+
+The GVFS Python client and the Hadoop command line are used similarly to the examples above; simply omit the OSS credential configurations.
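
For the GVFS Python client, a minimal sketch of the difference (option names as used earlier in this document; the endpoint and key values are placeholders):

```python
# With credential vending enabled on the catalog, the OSS credential
# options are simply omitted from the GVFS client options.
options_with_vending = {
    "cache_size": 20,
    "cache_expired_time": 3600,
    "auth_type": "simple",
}

# Without credential vending, the OSS credentials must be supplied explicitly.
options_without_vending = {
    **options_with_vending,
    "oss_endpoint": "http://oss-cn-hangzhou.aliyuncs.com",
    "oss_access_key_id": "access_key",
    "oss_secret_access_key": "secret_key",
}
```

In both cases the file system is constructed the same way, e.g. `gvfs.GravitinoVirtualFileSystem(server_uri="http://localhost:8090", metalake_name="test_metalake", options=options_with_vending)`.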
+
+
diff --git a/docs/hadoop-catalog-with-s3.md b/docs/hadoop-catalog-with-s3.md
new file mode 100644
index 0000000000..7d56f2b9ab
--- /dev/null
+++ b/docs/hadoop-catalog-with-s3.md
@@ -0,0 +1,541 @@
+---
+title: "Hadoop catalog with S3"
+slug: /hadoop-catalog-with-s3
+date: 2025-01-03
+keyword: Hadoop catalog S3
+license: "This software is licensed under the Apache License version 2."
+---
+
+This document explains how to configure a Hadoop catalog with S3 in Gravitino.
+
+## Prerequisites
+
+To create a Hadoop catalog with S3, follow these steps:
+
+1. Download the
[`gravitino-aws-bundle-${gravitino-version}.jar`](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-aws-bundle)
file.
+2. Place this file in the Gravitino Hadoop catalog classpath at
`${GRAVITINO_HOME}/catalogs/hadoop/libs/`.
+3. Start the Gravitino server using the following command:
+
+```bash
+$ ${GRAVITINO_HOME}/bin/gravitino-server.sh start
+```
+
+Once the server is up and running, you can proceed to configure the Hadoop catalog with S3. The rest of this document uses `http://localhost:8090` as the Gravitino server URL; replace it with your actual server URL.
+
+## Configurations for creating a Hadoop catalog with S3
+
+### Configurations for S3 Hadoop Catalog
+
+In addition to the basic configurations mentioned in [Hadoop catalog configurations](./hadoop-catalog.md#catalog-properties), the following properties are necessary to configure a Hadoop catalog with S3:
+
+| Configuration item | Description
[...]
+|--------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[...]
+| `filesystem-providers` | The file system providers to add. Set it to
`s3` if it's a S3 fileset, or a comma separated string that contains `s3` like
`gs,s3` to support multiple kinds of fileset including `s3`.
[...]
+| `default-filesystem-provider` | The name default filesystem providers of
this Hadoop catalog if users do not specify the scheme in the URI. Default
value is `builtin-local`, for S3, if we set this value, we can omit the prefix
's3a://' in the location.
[...]
+| `s3-endpoint` | The endpoint of the AWS S3. This
configuration is optional for S3 service, but required for other S3-compatible
storage services like MinIO.
[...]
+| `s3-access-key-id` | The access key of the AWS S3.
[...]
+| `s3-secret-access-key` | The secret key of the AWS S3.
[...]
+| `credential-providers` | The credential provider types, separated by
comma, possible value can be `s3-token`, `s3-secret-key`. As the default
authentication type is using AKSK as the above, this configuration can enable
credential vending provided by Gravitino server and client will no longer need
to provide authentication information like AKSK to access S3 by GVFS. Once it's
set, more configuration items are needed to make it works, please see
[s3-credential-vending](security/ [...]
+
+### Configurations for a schema
+
+To learn how to create a schema, refer to [Schema
configurations](./hadoop-catalog.md#schema-properties).
+
+### Configurations for a fileset
+
+For more details on creating a fileset, refer to [Fileset configurations](./hadoop-catalog.md#fileset-properties).
+
+
+## Using the Hadoop catalog with S3
+
+This section demonstrates how to use the Hadoop catalog with S3 in Gravitino,
with a complete example.
+
+### Step 1: Create a Hadoop catalog with S3
+
+First of all, you need to create a Hadoop catalog with S3. The following
example shows how to create a Hadoop catalog with S3:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+ "name": "test_catalog",
+ "type": "FILESET",
+ "comment": "This is a S3 fileset catalog",
+ "provider": "hadoop",
+ "properties": {
+ "location": "s3a://bucket/root",
+ "s3-access-key-id": "access_key",
+ "s3-secret-access-key": "secret_key",
+ "s3-endpoint": "http://s3.ap-northeast-1.amazonaws.com",
+ "filesystem-providers": "s3"
+ }
+}' http://localhost:8090/api/metalakes/metalake/catalogs
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+GravitinoClient gravitinoClient = GravitinoClient
+ .builder("http://localhost:8090")
+ .withMetalake("metalake")
+ .build();
+
+Map<String, String> s3Properties = ImmutableMap.<String, String>builder()
+ .put("location", "s3a://bucket/root")
+ .put("s3-access-key-id", "access_key")
+ .put("s3-secret-access-key", "secret_key")
+ .put("s3-endpoint", "http://s3.ap-northeast-1.amazonaws.com")
+ .put("filesystem-providers", "s3")
+ .build();
+
+Catalog s3Catalog = gravitinoClient.createCatalog("test_catalog",
+ Type.FILESET,
+ "hadoop", // provider, Gravitino only supports "hadoop" for now.
+ "This is a S3 fileset catalog",
+ s3Properties);
+// ...
+
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient =
GravitinoClient(uri="http://localhost:8090", metalake_name="metalake")
+s3_properties = {
+ "location": "s3a://bucket/root",
+ "s3-access-key-id": "access_key"
+ "s3-secret-access-key": "secret_key",
+ "s3-endpoint": "http://s3.ap-northeast-1.amazonaws.com",
+ "filesystem-providers": "s3"
+}
+
+s3_catalog = gravitino_client.create_catalog(name="test_catalog",
+ type=Catalog.Type.FILESET,
+ provider="hadoop",
+ comment="This is a S3 fileset
catalog",
+ properties=s3_properties)
+```
+
+</TabItem>
+</Tabs>
+
+:::note
+When using S3 with Hadoop, ensure that the location value starts with `s3a://` (not `s3://`) for AWS S3. For example, use `s3a://bucket/root`, as the `s3://` scheme is not supported by the `hadoop-aws` library.
+:::
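
If locations may come in with the unsupported `s3://` scheme, a small helper (illustrative only, not part of Gravitino) can normalize them before creating the catalog or fileset:

```python
def normalize_s3_location(location: str) -> str:
    """Rewrite s3:// URIs to the s3a:// scheme expected by hadoop-aws."""
    if location.startswith("s3://"):
        return "s3a://" + location[len("s3://"):]
    return location

print(normalize_s3_location("s3://bucket/root"))  # s3a://bucket/root
```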
+
+### Step 2: Create a schema
+
+Once your Hadoop catalog with S3 is created, you can create a schema under the
catalog. Here are examples of how to do that:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+ "name": "test_schema",
+ "comment": "This is a S3 schema",
+ "properties": {
+ "location": "s3a://bucket/root/schema"
+ }
+}' http://localhost:8090/api/metalakes/metalake/catalogs/test_catalog/schemas
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+Catalog catalog = gravitinoClient.loadCatalog("hive_catalog");
+
+SupportsSchemas supportsSchemas = catalog.asSchemas();
+
+Map<String, String> schemaProperties = ImmutableMap.<String, String>builder()
+ .put("location", "s3a://bucket/root/schema")
+ .build();
+Schema schema = supportsSchemas.createSchema("test_schema",
+ "This is a S3 schema",
+ schemaProperties
+);
+// ...
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient =
GravitinoClient(uri="http://localhost:8090", metalake_name="metalake")
+catalog: Catalog = gravitino_client.load_catalog(name="test_catalog")
+catalog.as_schemas().create_schema(name="test_schema",
+ comment="This is a S3 schema",
+ properties={"location":
"s3a://bucket/root/schema"})
+```
+
+</TabItem>
+</Tabs>
+
+### Step 3: Create a fileset
+
+After creating the schema, you can create a fileset. Here are examples for
creating a fileset:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+ "name": "example_fileset",
+ "comment": "This is an example fileset",
+ "type": "MANAGED",
+ "storageLocation": "s3a://bucket/root/schema/example_fileset",
+ "properties": {
+ "k1": "v1"
+ }
+}'
http://localhost:8090/api/metalakes/metalake/catalogs/test_catalog/schemas/test_schema/filesets
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+GravitinoClient gravitinoClient = GravitinoClient
+ .builder("http://localhost:8090")
+ .withMetalake("metalake")
+ .build();
+
+Catalog catalog = gravitinoClient.loadCatalog("test_catalog");
+FilesetCatalog filesetCatalog = catalog.asFilesetCatalog();
+
+Map<String, String> propertiesMap = ImmutableMap.<String, String>builder()
+ .put("k1", "v1")
+ .build();
+
+filesetCatalog.createFileset(
+ NameIdentifier.of("test_schema", "example_fileset"),
+ "This is an example fileset",
+ Fileset.Type.MANAGED,
+ "s3a://bucket/root/schema/example_fileset",
+  propertiesMap);
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient =
GravitinoClient(uri="http://localhost:8090", metalake_name="metalake")
+
+catalog: Catalog = gravitino_client.load_catalog(name="test_catalog")
+catalog.as_fileset_catalog().create_fileset(ident=NameIdentifier.of("test_schema", "example_fileset"),
+ type=Fileset.Type.MANAGED,
+ comment="This is an example
fileset",
+
storage_location="s3a://bucket/root/schema/example_fileset",
+ properties={"k1": "v1"})
+```
+
+</TabItem>
+</Tabs>
+
+## Accessing a fileset with S3
+
+### Using the GVFS Java client to access the fileset
+
+To access a fileset with S3 using the GVFS Java client, in addition to the [basic GVFS configurations](./how-to-use-gvfs.md#configuration-1), you need to add the following configurations:
+
+| Configuration item     | Description                                                                                                                                 | Default value | Required | Since version    |
+|------------------------|---------------------------------------------------------------------------------------------------------------------------------------------|---------------|----------|------------------|
+| `s3-endpoint`          | The endpoint of the AWS S3. This configuration is optional for S3 service, but required for other S3-compatible storage services like MinIO. | (none)        | No       | 0.7.0-incubating |
+| `s3-access-key-id`     | The access key of the AWS S3.                                                                                                                | (none)        | Yes      | 0.7.0-incubating |
+| `s3-secret-access-key` | The secret key of the AWS S3.                                                                                                                | (none)        | Yes      | 0.7.0-incubating |
+
+:::note
+- `s3-endpoint` is an optional configuration for AWS S3, however, it is
required for other S3-compatible storage services like MinIO.
+- If the catalog has enabled [credential
vending](security/credential-vending.md), the properties above can be omitted.
More details can be found in [Fileset with credential
vending](#fileset-with-credential-vending).
+:::
+
+```java
+Configuration conf = new Configuration();
+conf.set("fs.AbstractFileSystem.gvfs.impl",
"org.apache.gravitino.filesystem.hadoop.Gvfs");
+conf.set("fs.gvfs.impl",
"org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem");
+conf.set("fs.gravitino.server.uri", "http://localhost:8090");
+conf.set("fs.gravitino.client.metalake", "test_metalake");
+conf.set("s3-endpoint", "http://localhost:8090");
+conf.set("s3-access-key-id", "minio");
+conf.set("s3-secret-access-key", "minio123");
+
+Path filesetPath = new Path("gvfs://fileset/test_catalog/test_schema/test_fileset/new_dir");
+FileSystem fs = filesetPath.getFileSystem(conf);
+fs.mkdirs(filesetPath);
+...
+```
+
+As with the Spark setup below, you need to add the S3 (bundle) jars to the classpath according to your environment. For a Maven project, the dependencies are:
+
+```xml
+ <dependency>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-common</artifactId>
+ <version>${HADOOP_VERSION}</version>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-aws</artifactId>
+ <version>${HADOOP_VERSION}</version>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>filesystem-hadoop3-runtime</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>gravitino-aws</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+```
+
+Or, if there is no Hadoop environment, use the bundle jar, which includes the Hadoop dependencies:
+
+```xml
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>gravitino-aws-bundle</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.gravitino</groupId>
+ <artifactId>filesystem-hadoop3-runtime</artifactId>
+ <version>${GRAVITINO_VERSION}</version>
+ </dependency>
+```
+
+### Using Spark to access the fileset
+
+The following Python code demonstrates how to use **PySpark 3.1.3 with a Hadoop environment (Hadoop 3.2.0)** to access the fileset:
+
+Before running the following code, you need to install required packages:
+
+```bash
+pip install pyspark==3.1.3
+pip install apache-gravitino==${GRAVITINO_VERSION}
+```
+Then you can run the following code:
+
+```python
+from pyspark.sql import SparkSession
+import os
+
+gravitino_url = "http://localhost:8090"
+metalake_name = "test"
+
+catalog_name = "your_s3_catalog"
+schema_name = "your_s3_schema"
+fileset_name = "your_s3_fileset"
+
+os.environ["PYSPARK_SUBMIT_ARGS"] = "--jars
/path/to/gravitino-aws-${gravitino-version}.jar,/path/to/gravitino-filesystem-hadoop3-runtime-${gravitino-version}-SNAPSHOT.jar,/path/to/hadoop-aws-3.2.0.jar,/path/to/aws-java-sdk-bundle-1.11.375.jar
--master local[1] pyspark-shell"
+spark = SparkSession.builder \
+    .appName("s3_fileset_test") \
+    .config("spark.hadoop.fs.AbstractFileSystem.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.Gvfs") \
+    .config("spark.hadoop.fs.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem") \
+    .config("spark.hadoop.fs.gravitino.server.uri", "http://localhost:8090") \
+    .config("spark.hadoop.fs.gravitino.client.metalake", "test") \
+    .config("spark.hadoop.s3-access-key-id", os.environ["S3_ACCESS_KEY_ID"]) \
+    .config("spark.hadoop.s3-secret-access-key", os.environ["S3_SECRET_ACCESS_KEY"]) \
+    .config("spark.hadoop.s3-endpoint", "http://s3.ap-northeast-1.amazonaws.com") \
+    .config("spark.driver.memory", "2g") \
+    .config("spark.driver.port", "2048") \
+    .getOrCreate()
+
+data = [("Alice", 25), ("Bob", 30), ("Cathy", 45)]
+columns = ["Name", "Age"]
+spark_df = spark.createDataFrame(data, schema=columns)
+gvfs_path =
f"gvfs://fileset/{catalog_name}/{schema_name}/{fileset_name}/people"
+
+spark_df.coalesce(1).write \
+    .mode("overwrite") \
+    .option("header", "true") \
+    .csv(gvfs_path)
+```
+
+If your Spark is running **without a Hadoop environment**, you can use the following code snippet to access the fileset:
+
+```python
+# Replace the PYSPARK_SUBMIT_ARGS setting in the snippet above with the following; the other environment variables stay the same
+os.environ["PYSPARK_SUBMIT_ARGS"] = "--jars
/path/to/gravitino-aws-bundle-${gravitino-version}.jar,/path/to/gravitino-filesystem-hadoop3-runtime-${gravitino-version}-SNAPSHOT.jar
--master local[1] pyspark-shell"
+```
+
+- [`gravitino-aws-bundle-${gravitino-version}.jar`](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-aws-bundle) is the Gravitino AWS jar bundled with the Hadoop environment (3.3.1) and the `hadoop-aws` jar.
+- [`gravitino-aws-${gravitino-version}.jar`](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-aws) is a condensed version of the Gravitino AWS bundle jar, without the Hadoop environment and the `hadoop-aws` jar.
+- `hadoop-aws-3.2.0.jar` and `aws-java-sdk-bundle-1.11.375.jar` can be found
in the Hadoop distribution in the `${HADOOP_HOME}/share/hadoop/tools/lib`
directory.
+
+Please choose the correct jar according to your environment.
+
+:::note
+In some Spark versions, the driver needs a Hadoop environment, so adding the bundle jars with `--jars` may not work. If this is the case, you should add the jars to the Spark CLASSPATH directly.
+:::
+
+### Accessing a fileset using the Hadoop fs command
+
+The following are examples of how to use the `hadoop fs` command to access the
fileset in Hadoop 3.1.3.
+
+1. Add the following content to the `${HADOOP_HOME}/etc/hadoop/core-site.xml` file:
+
+```xml
+ <property>
+ <name>fs.AbstractFileSystem.gvfs.impl</name>
+ <value>org.apache.gravitino.filesystem.hadoop.Gvfs</value>
+ </property>
+
+ <property>
+ <name>fs.gvfs.impl</name>
+
<value>org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem</value>
+ </property>
+
+ <property>
+ <name>fs.gravitino.server.uri</name>
+ <value>http://localhost:8090</value>
+ </property>
+
+ <property>
+ <name>fs.gravitino.client.metalake</name>
+ <value>test</value>
+ </property>
+
+ <property>
+ <name>s3-endpoint</name>
+ <value>http://s3.ap-northeast-1.amazonaws.com</value>
+ </property>
+
+ <property>
+ <name>s3-access-key-id</name>
+ <value>access-key</value>
+ </property>
+
+ <property>
+ <name>s3-secret-access-key</name>
+ <value>secret-key</value>
+ </property>
+```
+
+2. Add the necessary jars to the Hadoop classpath.
+
+For S3, you need to add `gravitino-filesystem-hadoop3-runtime-${gravitino-version}.jar`, `gravitino-aws-${gravitino-version}.jar`, and `hadoop-aws-${hadoop-version}.jar` (the last is located at `${HADOOP_HOME}/share/hadoop/tools/lib/`) to the Hadoop classpath.
+
+3. Run the following command to access the fileset:
+
+```shell
+${HADOOP_HOME}/bin/hadoop fs -ls gvfs://fileset/s3_catalog/s3_schema/s3_fileset
+${HADOOP_HOME}/bin/hadoop fs -put /path/to/local/file gvfs://fileset/s3_catalog/s3_schema/s3_fileset
+```
+
+### Using the GVFS Python client to access a fileset
+
+To access a fileset with S3 using the GVFS Python client, in addition to the [basic GVFS configurations](./how-to-use-gvfs.md#configuration-1), you need to add the following configurations:
+
+| Configuration item     | Description                                                                                                                                 | Default value | Required | Since version    |
+|------------------------|---------------------------------------------------------------------------------------------------------------------------------------------|---------------|----------|------------------|
+| `s3_endpoint`          | The endpoint of the AWS S3. This configuration is optional for S3 service, but required for other S3-compatible storage services like MinIO. | (none)        | No       | 0.7.0-incubating |
+| `s3_access_key_id`     | The access key of the AWS S3.                                                                                                                | (none)        | Yes      | 0.7.0-incubating |
+| `s3_secret_access_key` | The secret key of the AWS S3.                                                                                                                | (none)        | Yes      | 0.7.0-incubating |
+
+:::note
+- `s3_endpoint` is an optional configuration for AWS S3, however, it is
required for other S3-compatible storage services like MinIO.
+- If the catalog has enabled [credential
vending](security/credential-vending.md), the properties above can be omitted.
+:::
+
+Please install the `apache-gravitino` package before running the following code:
+
+```bash
+pip install apache-gravitino==${GRAVITINO_VERSION}
+```
+
+```python
+from gravitino import gvfs
+options = {
+ "cache_size": 20,
+ "cache_expired_time": 3600,
+ "auth_type": "simple",
+ "s3_endpoint": "http://localhost:8090",
+ "s3_access_key_id": "minio",
+ "s3_secret_access_key": "minio123"
+}
+fs = gvfs.GravitinoVirtualFileSystem(server_uri="http://localhost:8090",
metalake_name="test_metalake", options=options)
+fs.ls("gvfs://fileset/{catalog_name}/{schema_name}/{fileset_name}/")
")
+```
+
+### Using fileset with pandas
+
+The following is an example of how to use the pandas library to access the S3 fileset:
+
+```python
+import pandas as pd
+
+storage_options = {
+ "server_uri": "http://localhost:8090",
+ "metalake_name": "test",
+ "options": {
+ "s3_access_key_id": "access_key",
+ "s3_secret_access_key": "secret_key",
+ "s3_endpoint": "http://s3.ap-northeast-1.amazonaws.com"
+ }
+}
+ds = pd.read_csv(f"gvfs://fileset/{catalog_name}/{schema_name}/{fileset_name}/people/part-00000-51d366e2-d5eb-448d-9109-32a96c8a14dc-c000.csv",
+                 storage_options=storage_options)
+ds.head()
+```
+
+For more use cases, please refer to the [Gravitino Virtual File
System](./how-to-use-gvfs.md) document.
+
+## Fileset with credential vending
+
+Since 0.8.0-incubating, Gravitino supports credential vending for S3 filesets. If the catalog has been [configured with credentials](./security/credential-vending.md), you can access an S3 fileset without providing authentication information like `s3-access-key-id` and `s3-secret-access-key` in the properties.
+
+### How to create an S3 Hadoop catalog with credential enabled
+
+In addition to the configuration method in [create-s3-hadoop-catalog](#configurations-for-s3-hadoop-catalog), the properties required by [s3-credential](./security/credential-vending.md#s3-credentials) must also be set to enable credential vending for the S3 fileset.
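
As a sketch, the catalog properties from Step 1 gain a `credential-providers` entry. `s3-token` is one of the provider types listed in the catalog configuration table above; the remaining values are placeholders, and additional credential properties from the credential-vending document may also be required:

```python
# Hypothetical property set for an S3 Hadoop catalog with credential vending.
# "s3-token" (or "s3-secret-key") comes from the provider types listed above;
# the other values are placeholders.
s3_properties = {
    "location": "s3a://bucket/root",
    "s3-access-key-id": "access_key",
    "s3-secret-access-key": "secret_key",
    "s3-endpoint": "http://s3.ap-northeast-1.amazonaws.com",
    "filesystem-providers": "s3",
    "credential-providers": "s3-token",
}
```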
+
+### How to access S3 fileset with credential
+
+If the catalog has been configured with credentials, you can access an S3 fileset without providing authentication information via the GVFS Java/Python client and Spark. The following examples show how:
+
+GVFS Java client:
+
+```java
+Configuration conf = new Configuration();
+conf.set("fs.AbstractFileSystem.gvfs.impl",
"org.apache.gravitino.filesystem.hadoop.Gvfs");
+conf.set("fs.gvfs.impl",
"org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem");
+conf.set("fs.gravitino.server.uri", "http://localhost:8090");
+conf.set("fs.gravitino.client.metalake", "test_metalake");
+// No need to set s3-access-key-id and s3-secret-access-key
+Path filesetPath = new
Path("gvfs://fileset/test_catalog/test_schema/test_fileset/new_dir");
+FileSystem fs = filesetPath.getFileSystem(conf);
+fs.mkdirs(filesetPath);
+...
+```
+
+Spark:
+
+```python
+# No need to set s3-access-key-id and s3-secret-access-key
+spark = SparkSession.builder \
+    .appName("s3_fileset_test") \
+    .config("spark.hadoop.fs.AbstractFileSystem.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.Gvfs") \
+    .config("spark.hadoop.fs.gvfs.impl", "org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem") \
+    .config("spark.hadoop.fs.gravitino.server.uri", "http://localhost:8090") \
+    .config("spark.hadoop.fs.gravitino.client.metalake", "test") \
+    .config("spark.driver.memory", "2g") \
+    .config("spark.driver.port", "2048") \
+    .getOrCreate()
+```
+
+The GVFS Python client and the Hadoop command line are used similarly to the examples above; simply omit the S3 credential configurations.
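
For instance, with pandas the inner `options` dictionary no longer needs the S3 credentials when vending is enabled (server URI and metalake name are placeholders, as in the earlier pandas example):

```python
# pandas storage_options for a catalog with credential vending enabled:
# the s3_* entries from the earlier example are omitted.
storage_options = {
    "server_uri": "http://localhost:8090",
    "metalake_name": "test",
    "options": {},
}
# e.g. pd.read_csv("gvfs://fileset/test_catalog/test_schema/test_fileset/file.csv",
#                  storage_options=storage_options)
```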
+
+
diff --git a/docs/hadoop-catalog.md b/docs/hadoop-catalog.md
index cbdae84689..4b951aedc6 100644
--- a/docs/hadoop-catalog.md
+++ b/docs/hadoop-catalog.md
@@ -9,9 +9,9 @@ license: "This software is licensed under the Apache License
version 2."
## Introduction
Hadoop catalog is a fileset catalog that uses a Hadoop Compatible File System (HCFS) to manage
-the storage location of the fileset. Currently, it supports local filesystem
and HDFS. For
-object storage like S3, GCS, Azure Blob Storage and OSS, you can put the
hadoop object store jar like
-`gravitino-aws-bundle-{gravitino-version}.jar` into the
`$GRAVITINO_HOME/catalogs/hadoop/libs` directory to enable the support.
+the storage location of the fileset. Currently, it supports the local filesystem and HDFS. Since 0.7.0-incubating, Gravitino supports [S3](hadoop-catalog-with-s3.md), [GCS](hadoop-catalog-with-gcs.md), [OSS](hadoop-catalog-with-oss.md) and [Azure Blob Storage](hadoop-catalog-with-adls.md) through the Hadoop catalog.
+
+The rest of this document uses HDFS or the local file system as an example to illustrate how to use the Hadoop catalog. For S3, GCS, OSS and Azure Blob Storage, the configuration is similar to HDFS; please refer to the corresponding document for more details.
Note that Gravitino uses Hadoop 3 dependencies to build Hadoop catalog.
Theoretically, it should be
compatible with both Hadoop 2.x and 3.x, since Gravitino doesn't leverage any
new features in
@@ -23,17 +23,19 @@ Hadoop 3. If there's any compatibility issue, please create
an [issue](https://g
Besides the [common catalog
properties](./gravitino-server-config.md#apache-gravitino-catalog-properties-configuration),
the Hadoop catalog has the following properties:
-| Property Name                  | Description                                                                                         | Default Value | Required | Since Version    |
-|--------------------------------|-----------------------------------------------------------------------------------------------------|---------------|----------|------------------|
-| `location`                     | The storage location managed by Hadoop catalog.                                                     | (none)        | No       | 0.5.0            |
-| `filesystem-conn-timeout-secs` | The timeout of getting the file system using Hadoop FileSystem client instance. Time unit: seconds. | 6             | No       | 0.8.0-incubating |
-| `credential-providers`         | The credential provider types, separated by comma.                                                  | (none)        | No       | 0.8.0-incubating |
+| Property Name                  | Description | Default Value   | Required | Since Version    |
+|--------------------------------|-------------|-----------------|----------|------------------|
+| `location`                     | The storage location managed by Hadoop catalog. | (none) | No | 0.5.0 |
+| `default-filesystem-provider`  | The default filesystem provider of this Hadoop catalog if users do not specify the scheme in the URI. Candidate values are 'builtin-local', 'builtin-hdfs', 's3', 'gcs', 'abs' and 'oss'. Default value is `builtin-local`. For S3, if we set this value to 's3', we can omit the prefix 's3a://' in the location. | `builtin-local` | No | 0.7.0-incubating |
+| `filesystem-providers`         | The file system providers to add. Users need to set this configuration to support cloud storage or custom HCFS. For instance, set it to `s3` or a comma-separated string that contains `s3` like `gs,s3` to support multiple kinds of fileset including `s3`. | (none) | Yes | 0.7.0-incubating |
+| `credential-providers`         | The credential provider types, separated by comma. | (none) | No | 0.8.0-incubating |
+| `filesystem-conn-timeout-secs` | The timeout of getting the file system using Hadoop FileSystem client instance. Time unit: seconds. | 6 | No | 0.8.0-incubating |
Please refer to [Credential vending](./security/credential-vending.md) for
more details about credential vending.
-Apart from the above properties, to access fileset like HDFS, S3, GCS, OSS or
custom fileset, you need to configure the following extra properties.
+### HDFS fileset
-#### HDFS fileset
+Apart from the above properties, to access an HDFS fileset, you need to configure the following extra properties.
| Property Name | Description
|
Default Value | Required |
Since Version |
|----------------------------------------------------|------------------------------------------------------------------------------------------------|---------------|-------------------------------------------------------------|---------------|
@@ -44,66 +46,13 @@ Apart from the above properties, to access fileset like
HDFS, S3, GCS, OSS or cu
| `authentication.kerberos.check-interval-sec` | The check interval of
Kerberos credential for Hadoop catalog. | 60
| No | 0.5.1
|
| `authentication.kerberos.keytab-fetch-timeout-sec` | The fetch timeout of
retrieving Kerberos keytab from `authentication.kerberos.keytab-uri`. | 60
| No | 0.5.1
|
-#### S3 fileset
-
-| Configuration item | Description
| Default value | Required | Since version
|
-|-------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------|---------------------------|------------------|
-| `filesystem-providers` | The file system providers to add. Set it to
`s3` if it's a S3 fileset, or a comma separated string that contains `s3` like
`gs,s3` to support multiple kinds of fileset including `s3`.
| (none) | Yes |
0.7.0-incubating |
-| `default-filesystem-provider` | The name default filesystem providers of
this Hadoop catalog if users do not specify the scheme in the URI. Default
value is `builtin-local`, for S3, if we set this value, we can omit the prefix
's3a://' in the location. | `builtin-local` | No |
0.7.0-incubating |
-| `s3-endpoint` | The endpoint of the AWS S3.
| (none) | Yes if it's a S3 fileset. |
0.7.0-incubating |
-| `s3-access-key-id` | The access key of the AWS S3.
| (none) | Yes if it's a S3 fileset. |
0.7.0-incubating |
-| `s3-secret-access-key` | The secret key of the AWS S3.
| (none) | Yes if it's a S3 fileset. |
0.7.0-incubating |
-
-Please refer to [S3
credentials](./security/credential-vending.md#s3-credentials) for credential
related configurations.
-
-At the same time, you need to place the corresponding bundle jar
[`gravitino-aws-bundle-${version}.jar`](https://repo1.maven.org/maven2/org/apache/gravitino/gravitino-aws-bundle/)
in the directory `${GRAVITINO_HOME}/catalogs/hadoop/libs`.
-
-#### GCS fileset
-
-| Configuration item | Description | Default value | Required | Since version |
-|-------------------------------|-------------|---------------|----------|---------------|
-| `filesystem-providers` | The file system providers to add. Set it to `gs` if it's a GCS fileset, a comma separated string that contains `gs` like `gs,s3` to support multiple kinds of fileset including `gs`. | (none) | Yes | 0.7.0-incubating |
-| `default-filesystem-provider` | The name default filesystem providers of this Hadoop catalog if users do not specify the scheme in the URI. Default value is `builtin-local`, for GCS, if we set this value, we can omit the prefix 'gs://' in the location. | `builtin-local` | No | 0.7.0-incubating |
-| `gcs-service-account-file` | The path of GCS service account JSON file. | (none) | Yes if it's a GCS fileset. | 0.7.0-incubating |
-
-Please refer to [GCS credentials](./security/credential-vending.md#gcs-credentials) for credential related configurations.
-
-In the meantime, you need to place the corresponding bundle jar [`gravitino-gcp-bundle-${version}.jar`](https://repo1.maven.org/maven2/org/apache/gravitino/gravitino-gcp-bundle/) in the directory `${GRAVITINO_HOME}/catalogs/hadoop/libs`.
-
-#### OSS fileset
-
-| Configuration item | Description | Default value | Required | Since version |
-|-------------------------------|-------------|---------------|----------|---------------|
-| `filesystem-providers` | The file system providers to add. Set it to `oss` if it's a OSS fileset, or a comma separated string that contains `oss` like `oss,gs,s3` to support multiple kinds of fileset including `oss`. | (none) | Yes | 0.7.0-incubating |
-| `default-filesystem-provider` | The name default filesystem providers of this Hadoop catalog if users do not specify the scheme in the URI. Default value is `builtin-local`, for OSS, if we set this value, we can omit the prefix 'oss://' in the location. | `builtin-local` | No | 0.7.0-incubating |
-| `oss-endpoint` | The endpoint of the Aliyun OSS. | (none) | Yes if it's a OSS fileset. | 0.7.0-incubating |
-| `oss-access-key-id` | The access key of the Aliyun OSS. | (none) | Yes if it's a OSS fileset. | 0.7.0-incubating |
-| `oss-secret-access-key` | The secret key of the Aliyun OSS. | (none) | Yes if it's a OSS fileset. | 0.7.0-incubating |
-
-Please refer to [OSS credentials](./security/credential-vending.md#oss-credentials) for credential related configurations.
-
-In the meantime, you need to place the corresponding bundle jar [`gravitino-aliyun-bundle-${version}.jar`](https://repo1.maven.org/maven2/org/apache/gravitino/gravitino-aliyun-bundle/) in the directory `${GRAVITINO_HOME}/catalogs/hadoop/libs`.
-
-
-#### Azure Blob Storage fileset
-
-| Configuration item | Description | Default value | Required | Since version |
-|-----------------------------------|-------------|---------------|----------|---------------|
-| `filesystem-providers` | The file system providers to add. Set it to `abs` if it's a Azure Blob Storage fileset, or a comma separated string that contains `abs` like `oss,abs,s3` to support multiple kinds of fileset including `abs`. | (none) | Yes | 0.8.0-incubating |
-| `default-filesystem-provider` | The name default filesystem providers of this Hadoop catalog if users do not specify the scheme in the URI. Default value is `builtin-local`, for Azure Blob Storage, if we set this value, we can omit the prefix 'abfss://' in the location. | `builtin-local` | No | 0.8.0-incubating |
-| `azure-storage-account-name` | The account name of Azure Blob Storage. | (none) | Yes if it's a Azure Blob Storage fileset. | 0.8.0-incubating |
-| `azure-storage-account-key` | The account key of Azure Blob Storage. | (none) | Yes if it's a Azure Blob Storage fileset. | 0.8.0-incubating |
-
-Please refer to [ADLS credentials](./security/credential-vending.md#adls-credentials) for credential related configurations.
-
-Similar to the above, you need to place the corresponding bundle jar [`gravitino-azure-bundle-${version}.jar`](https://repo1.maven.org/maven2/org/apache/gravitino/gravitino-azure-bundle/) in the directory `${GRAVITINO_HOME}/catalogs/hadoop/libs`.
-
-:::note
-- Gravitino contains builtin file system providers for local file system(`builtin-local`) and HDFS(`builtin-hdfs`), that is to say if `filesystem-providers` is not set, Gravitino will still support local file system and HDFS. Apart from that, you can set the `filesystem-providers` to support other file systems like S3, GCS, OSS or custom file system.
-- `default-filesystem-provider` is used to set the default file system provider for the Hadoop catalog. If the user does not specify the scheme in the URI, Gravitino will use the default file system provider to access the fileset. For example, if the default file system provider is set to `builtin-local`, the user can omit the prefix `file:///` in the location.
-:::
+### Hadoop catalog with Cloud Storage
+- For S3, please refer to [Hadoop-catalog-with-s3](./hadoop-catalog-with-s3.md) for more details.
+- For GCS, please refer to [Hadoop-catalog-with-gcs](./hadoop-catalog-with-gcs.md) for more details.
+- For OSS, please refer to [Hadoop-catalog-with-oss](./hadoop-catalog-with-oss.md) for more details.
+- For Azure Blob Storage, please refer to [Hadoop-catalog-with-adls](./hadoop-catalog-with-adls.md) for more details.
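For orientation before diving into those pages, the catalog configuration described in the (removed) tables above is just a flat property map. A minimal sketch, assuming the S3 property names from this document; the builder function and all values here are hypothetical placeholders, not a Gravitino API:

```python
# Illustrative only: assemble the property map for an S3-backed Hadoop catalog.
# The property names (filesystem-providers, s3-endpoint, ...) come from the
# tables above; this helper and every value in it are hypothetical.

REQUIRED_S3_PROPS = {"s3-endpoint", "s3-access-key-id", "s3-secret-access-key"}

def build_s3_catalog_properties(location, endpoint, access_key, secret_key):
    """Return the catalog-level properties an S3 fileset catalog expects."""
    return {
        "location": location,          # e.g. s3a://bucket/root
        "filesystem-providers": "s3",  # enable the S3 file system provider
        "s3-endpoint": endpoint,
        "s3-access-key-id": access_key,
        "s3-secret-access-key": secret_key,
    }

props = build_s3_catalog_properties(
    "s3a://bucket/root",
    "http://s3.ap-northeast-1.amazonaws.com",
    "access_key",
    "secret_key",
)
assert REQUIRED_S3_PROPS.issubset(props)
```

Such a map is what the create-catalog REST/Java/Python calls shown later in this commit pass as `properties`.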
-#### How to custom your own HCFS file system fileset?
+### How to custom your own HCFS file system fileset?
Developers and users can custom their own HCFS file system fileset by implementing the `FileSystemProvider` interface in the jar [gravitino-catalog-hadoop](https://repo1.maven.org/maven2/org/apache/gravitino/catalog-hadoop/). The `FileSystemProvider` interface is defined as follows:
diff --git a/docs/how-to-use-gvfs.md b/docs/how-to-use-gvfs.md
index aff3b74adf..cbbb67dd37 100644
--- a/docs/how-to-use-gvfs.md
+++ b/docs/how-to-use-gvfs.md
@@ -42,7 +42,9 @@ the path mapping and convert automatically.
### Prerequisites
-+ A Hadoop environment with HDFS or other Hadoop Compatible File System (HCFS) implementations like S3, GCS, etc. GVFS has been tested against Hadoop 3.3.1. It is recommended to use Hadoop 3.3.1 or later, but it should work with Hadoop 2.x. Please create an [issue](https://www.github.com/apache/gravitino/issues) if you find any compatibility issues.
+ - GVFS has been tested against Hadoop 3.3.1. It is recommended to use Hadoop 3.3.1 or later, but it should work with Hadoop 2.x. Please create an [issue](https://www.github.com/apache/gravitino/issues) if you find any compatibility issues.
### Configuration
@@ -64,55 +66,8 @@ the path mapping and convert automatically.
| `fs.gravitino.fileset.cache.evictionMillsAfterAccess` | The value of time that the cache expires after accessing in the Gravitino Virtual File System. The value is in `milliseconds`. | `3600000` | No | 0.5.0 |
-Apart from the above properties, to access fileset like S3, GCS, OSS and custom fileset, you need to configure the following extra properties.
-
-#### S3 fileset
-
-| Configuration item | Description | Default value | Required | Since version |
-|------------------------|-------------------------------|---------------|---------------------------|------------------|
-| `s3-endpoint` | The endpoint of the AWS S3. | (none) | Yes if it's a S3 fileset. | 0.7.0-incubating |
-| `s3-access-key-id` | The access key of the AWS S3. | (none) | Yes if it's a S3 fileset. | 0.7.0-incubating |
-| `s3-secret-access-key` | The secret key of the AWS S3. | (none) | Yes if it's a S3 fileset. | 0.7.0-incubating |
-
-At the same time, you need to add the corresponding bundle jar
-1. [`gravitino-aws-bundle-${version}.jar`](https://repo1.maven.org/maven2/org/apache/gravitino/gravitino-aws-bundle/) in the classpath if no hadoop environment is available, or
-2. [`gravitino-aws-${version}.jar`](https://repo1.maven.org/maven2/org/apache/gravitino/gravitino-aws/) and hadoop-aws jar and other necessary dependencies in the classpath.
-
-
-#### GCS fileset
-
-| Configuration item | Description | Default value | Required | Since version |
-|----------------------------|--------------------------------------------|---------------|----------------------------|------------------|
-| `gcs-service-account-file` | The path of GCS service account JSON file. | (none) | Yes if it's a GCS fileset. | 0.7.0-incubating |
-
-In the meantime, you need to add the corresponding bundle jar
-1. [`gravitino-gcp-bundle-${version}.jar`](https://repo1.maven.org/maven2/org/apache/gravitino/gravitino-gcp-bundle/) in the classpath if no hadoop environment is available, or
-2. [`gravitino-gcp-${version}.jar`](https://repo1.maven.org/maven2/org/apache/gravitino/gravitino-gcp/) and [gcs-connector jar](https://github.com/GoogleCloudDataproc/hadoop-connectors/releases) and other necessary dependencies in the classpath.
-
-
-#### OSS fileset
-
-| Configuration item | Description | Default value | Required | Since version |
-|-------------------------|-----------------------------------|---------------|----------------------------|------------------|
-| `oss-endpoint` | The endpoint of the Aliyun OSS. | (none) | Yes if it's a OSS fileset. | 0.7.0-incubating |
-| `oss-access-key-id` | The access key of the Aliyun OSS. | (none) | Yes if it's a OSS fileset. | 0.7.0-incubating |
-| `oss-secret-access-key` | The secret key of the Aliyun OSS. | (none) | Yes if it's a OSS fileset. | 0.7.0-incubating |
-
-In the meantime, you need to place the corresponding bundle jar
-1. [`gravitino-aliyun-bundle-${version}.jar`](https://repo1.maven.org/maven2/org/apache/gravitino/gravitino-aliyun-bundle/) in the classpath if no hadoop environment is available, or
-2. [`gravitino-aliyun-${version}.jar`](https://repo1.maven.org/maven2/org/apache/gravitino/gravitino-aliyun/) and hadoop-aliyun jar and other necessary dependencies in the classpath.
-
-#### Azure Blob Storage fileset
-
-| Configuration item | Description | Default value | Required | Since version |
-|------------------------------|-----------------------------------------|---------------|-------------------------------------------|------------------|
-| `azure-storage-account-name` | The account name of Azure Blob Storage. | (none) | Yes if it's a Azure Blob Storage fileset. | 0.8.0-incubating |
-| `azure-storage-account-key` | The account key of Azure Blob Storage. | (none) | Yes if it's a Azure Blob Storage fileset. | 0.8.0-incubating |
-
-Similar to the above, you need to place the corresponding bundle jar
-1. [`gravitino-azure-bundle-${version}.jar`](https://repo1.maven.org/maven2/org/apache/gravitino/gravitino-azure-bundle/) in the classpath if no hadoop environment is available, or
-2. [`gravitino-azure-${version}.jar`](https://repo1.maven.org/maven2/org/apache/gravitino/gravitino-azure/) and hadoop-azure jar and other necessary dependencies in the classpath.
+Apart from the above properties, to access filesets like S3, GCS, OSS and custom filesets, extra properties are needed; please see [S3 GVFS Java client configurations](./hadoop-catalog-with-s3.md#using-the-gvfs-java-client-to-access-the-fileset), [GCS GVFS Java client configurations](./hadoop-catalog-with-gcs.md#using-the-gvfs-java-client-to-access-the-fileset), [OSS GVFS Java client configurations](./hadoop-catalog-with-oss.md#using-the-gvfs-java-client-to-access-the-fileset) and [Azure Blob Storage GVFS Java client configurations](./hadoop-catalog-with-adls.md#using-the-gvfs-java-client-to-access-the-fileset) for [...]
#### Custom fileset
Since 0.7.0-incubating, users can define their own fileset type and configure the corresponding properties, for more, please refer to [Custom Fileset](./hadoop-catalog.md#how-to-custom-your-own-hcfs-file-system-fileset).
@@ -132,26 +87,10 @@ You can configure these properties in two ways:
conf.set("fs.gvfs.impl","org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem");
conf.set("fs.gravitino.server.uri","http://localhost:8090");
conf.set("fs.gravitino.client.metalake","test_metalake");
-
-  // Optional. It's only for S3 catalog. For GCS and OSS catalog, you should set the corresponding properties.
- conf.set("s3-endpoint", "http://localhost:9000");
- conf.set("s3-access-key-id", "minio");
- conf.set("s3-secret-access-key", "minio123");
-
Path filesetPath = new Path("gvfs://fileset/test_catalog/test_schema/test_fileset_1");
FileSystem fs = filesetPath.getFileSystem(conf);
```
-:::note
-If you want to access the S3, GCS, OSS or custom fileset through GVFS, apart from the above properties, you need to place the corresponding bundle jars in the Hadoop environment.
-For example, if you want to access the S3 fileset, you need to place
-1. The aws hadoop bundle jar [`gravitino-aws-bundle-${gravitino-version}.jar`](https://repo1.maven.org/maven2/org/apache/gravitino/gravitino-aws-bundle/)
-2. or [`gravitino-aws-${gravitino-version}.jar`](https://repo1.maven.org/maven2/org/apache/gravitino/gravitino-aws/), and hadoop-aws jar and other necessary dependencies
-
-to the classpath, it typically locates in `${HADOOP_HOME}/share/hadoop/common/lib/`).
-
-:::
-
2. Configure the properties in the `core-site.xml` file of the Hadoop environment:
```xml
@@ -174,20 +113,6 @@ to the classpath, it typically locates in `${HADOOP_HOME}/share/hadoop/common/li
<name>fs.gravitino.client.metalake</name>
<value>test_metalake</value>
</property>
-
-  <!-- Optional. It's only for S3 catalog. For GCs and OSS catalog, you should set the corresponding properties. -->
- <property>
- <name>s3-endpoint</name>
- <value>http://localhost:9000</value>
- </property>
- <property>
- <name>s3-access-key-id</name>
- <value>minio</value>
- </property>
- <property>
- <name>s3-secret-access-key</name>
- <value>minio123</value>
- </property>
```
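The `core-site.xml` route carries the same name/value pairs as the programmatic `conf.set(...)` route. As a rough sketch (the rendering helper below is mine, not part of Gravitino; the property names come from the examples above), the required entries can be generated like this:

```python
# Render GVFS client settings as core-site.xml <property> blocks.
# The property names come from the examples above; this renderer is a
# hypothetical convenience, not a Gravitino or Hadoop API.

def render_core_site_properties(settings):
    blocks = []
    for name, value in settings.items():
        blocks.append(
            "<property>\n"
            f"  <name>{name}</name>\n"
            f"  <value>{value}</value>\n"
            "</property>"
        )
    return "\n".join(blocks)

xml = render_core_site_properties({
    "fs.gvfs.impl":
        "org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem",
    "fs.gravitino.server.uri": "http://localhost:8090",
    "fs.gravitino.client.metalake": "test_metalake",
})
assert "<name>fs.gravitino.server.uri</name>" in xml
```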
### Usage examples
@@ -223,12 +148,6 @@ cp gravitino-filesystem-hadoop3-runtime-{version}.jar ${HADOOP_HOME}/share/hadoo
# You need to ensure that the Kerberos has permission on the HDFS directory.
kinit -kt your_kerberos.keytab [email protected]
-
-# 4. Copy other dependencies to the Hadoop environment if you want to access the S3 fileset via GVFS
-cp bundles/aws-bundle/build/libs/gravitino-aws-bundle-{version}.jar ${HADOOP_HOME}/share/hadoop/common/lib/
-cp clients/filesystem-hadoop3-runtime/build/libs/gravitino-filesystem-hadoop3-runtime-{version}-SNAPSHOT.jar ${HADOOP_HOME}/share/hadoop/common/lib/
-cp ${HADOOP_HOME}/share/hadoop/tools/lib/* ${HADOOP_HOME}/share/hadoop/common/lib/
-
# 4. Try to list the fileset
./${HADOOP_HOME}/bin/hadoop dfs -ls gvfs://fileset/test_catalog/test_schema/test_fileset_1
```
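Every virtual path in these examples follows the `gvfs://fileset/<catalog>/<schema>/<fileset>/<sub-path>` layout that GVFS maps to physical locations. A small illustrative sketch of that mapping (the parser below is mine; the GVFS client performs this conversion internally):

```python
# Split a GVFS virtual path into its catalog/schema/fileset coordinates.
# Illustrative helper only; the GVFS client does this mapping itself.

def parse_gvfs_path(path):
    prefix = "gvfs://fileset/"
    if not path.startswith(prefix):
        raise ValueError(f"not a gvfs fileset path: {path}")
    parts = path[len(prefix):].split("/")
    if len(parts) < 3 or not all(parts[:3]):
        raise ValueError("expected catalog/schema/fileset after the prefix")
    catalog, schema, fileset = parts[:3]
    sub_path = "/".join(parts[3:])  # path inside the fileset; may be empty
    return catalog, schema, fileset, sub_path

assert parse_gvfs_path("gvfs://fileset/test_catalog/test_schema/test_fileset_1") == (
    "test_catalog", "test_schema", "test_fileset_1", "")
```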
@@ -239,36 +158,6 @@ You can also perform operations on the files or directories managed by fileset t
Make sure that your code is using the correct Hadoop environment, and that your environment has the `gravitino-filesystem-hadoop3-runtime-{version}.jar` dependency.
-```xml
-
-<dependency>
- <groupId>org.apache.gravitino</groupId>
- <artifactId>filesystem-hadoop3-runtime</artifactId>
- <version>{gravitino-version}</version>
-</dependency>
-
-<!-- Use the following one if there is not hadoop environment -->
-<dependency>
- <groupId>org.apache.gravitino</groupId>
- <artifactId>gravitino-aws-bundle</artifactId>
- <version>{gravitino-version}</version>
-</dependency>
-
-<!-- Use the following one if there already have hadoop environment -->
-<dependency>
- <groupId>org.apache.gravitino</groupId>
- <artifactId>gravitino-aws</artifactId>
- <version>{gravitino-version}</version>
-</dependency>
-
-<dependency>
- <groupId>org.apache.hadoop</groupId>
- <artifactId>hadoop-aws</artifactId>
- <version>{hadoop-version}</version>
-</dependency>
-
-```
-
For example:
```java
@@ -321,7 +210,6 @@ fs.getFileStatus(filesetPath);
rdd.foreach(println)
```
-
#### Via Tensorflow
For Tensorflow to support GVFS, you need to recompile the [tensorflow-io](https://github.com/tensorflow/io) module.
@@ -468,61 +356,14 @@ to recompile the native libraries like `libhdfs` and others, and completely repl
| `oauth2_scope` | The auth scope for the Gravitino client when using `oauth2` auth type with the Gravitino Virtual File System. | (none) | Yes if you use `oauth2` auth type | 0.7.0-incubating |
| `credential_expiration_ratio` | The ratio of expiration time for credential from Gravitino. This is used in the cases where Gravitino Hadoop catalogs have enabled credential vending. If the expiration time of a credential fetched from Gravitino is 1 hour, the GVFS client will try to refresh the credential after 1 * 0.5 = 0.5 hour. | 0.5 | No | 0.8.0-incubating |
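The refresh arithmetic for `credential_expiration_ratio` can be sketched in a couple of lines (the helper name is hypothetical; the GVFS client applies this scaling internally):

```python
# Refresh delay for a vended credential: the credential lifetime scaled by
# the configured credential_expiration_ratio (default 0.5 per the table
# above). Helper name is hypothetical, for illustration only.

def refresh_after_seconds(lifetime_seconds, ratio=0.5):
    return lifetime_seconds * ratio

# A credential valid for one hour is refreshed after half an hour by default.
assert refresh_after_seconds(3600) == 1800.0
```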
+#### Configurations for S3, GCS, OSS and Azure Blob storage fileset
-#### Extra configuration for S3, GCS, OSS fileset
-
-The following properties are required if you want to access the S3 fileset via the GVFS python client:
-
-| Configuration item | Description | Default value | Required | Since version |
-|------------------------|-------------------------------|---------------|---------------------------|------------------|
-| `s3_endpoint` | The endpoint of the AWS S3. | (none) | Yes if it's a S3 fileset. | 0.7.0-incubating |
-| `s3_access_key_id` | The access key of the AWS S3. | (none) | Yes if it's a S3 fileset. | 0.7.0-incubating |
-| `s3_secret_access_key` | The secret key of the AWS S3. | (none) | Yes if it's a S3 fileset. | 0.7.0-incubating |
-
-The following properties are required if you want to access the GCS fileset via the GVFS python client:
-
-| Configuration item | Description | Default value | Required | Since version |
-|----------------------------|--------------------------------------------|---------------|----------------------------|------------------|
-| `gcs_service_account_file` | The path of GCS service account JSON file. | (none) | Yes if it's a GCS fileset. | 0.7.0-incubating |
-
-The following properties are required if you want to access the OSS fileset via the GVFS python client:
-
-| Configuration item | Description | Default value | Required | Since version |
-|-------------------------|-----------------------------------|---------------|----------------------------|------------------|
-| `oss_endpoint` | The endpoint of the Aliyun OSS. | (none) | Yes if it's a OSS fileset. | 0.7.0-incubating |
-| `oss_access_key_id` | The access key of the Aliyun OSS. | (none) | Yes if it's a OSS fileset. | 0.7.0-incubating |
-| `oss_secret_access_key` | The secret key of the Aliyun OSS. | (none) | Yes if it's a OSS fileset. | 0.7.0-incubating |
-
-For Azure Blob Storage fileset, you need to configure the following properties:
-
-| Configuration item | Description | Default value | Required | Since version |
-|--------------------|----------------------------------------|---------------|-------------------------------------------|------------------|
-| `abs_account_name` | The account name of Azure Blob Storage | (none) | Yes if it's a Azure Blob Storage fileset. | 0.8.0-incubating |
-| `abs_account_key` | The account key of Azure Blob Storage | (none) | Yes if it's a Azure Blob Storage fileset. | 0.8.0-incubating |
-
-
-You can configure these properties when obtaining the `Gravitino Virtual FileSystem` in Python like this:
-
-```python
-from gravitino import gvfs
-options = {
- "cache_size": 20,
- "cache_expired_time": 3600,
- "auth_type": "simple",
-    # Optional, the following properties are required if you want to access the S3 fileset via GVFS python client, for GCS and OSS fileset, you should set the corresponding properties.
-    "s3_endpoint": "http://localhost:9000",
-    "s3_access_key_id": "minio",
-    "s3_secret_access_key": "minio123"
-}
-fs = gvfs.GravitinoVirtualFileSystem(server_uri="http://localhost:8090", metalake_name="test_metalake", options=options)
-```
+Please see the cloud-storage-specific configurations: [GCS GVFS Python client configurations](./hadoop-catalog-with-gcs.md#using-the-gvfs-python-client-to-access-a-fileset), [S3 GVFS Python client configurations](./hadoop-catalog-with-s3.md#using-the-gvfs-python-client-to-access-a-fileset), [OSS GVFS Python client configurations](./hadoop-catalog-with-oss.md#using-the-gvfs-python-client-to-access-a-fileset) and [Azure Blob Storage GVFS Python client configurations](./hadoop-catalog-with-adls.md#u [...]
:::note
-
Gravitino python client does not support [customized file systems](hadoop-catalog.md#how-to-custom-your-own-hcfs-file-system-fileset) defined by users due to the limit of `fsspec` library.
:::
-
### Usage examples
1. Make sure to obtain the Gravitino library.
diff --git a/docs/manage-fileset-metadata-using-gravitino.md b/docs/manage-fileset-metadata-using-gravitino.md
index 9d96287b56..0ff84c8346 100644
--- a/docs/manage-fileset-metadata-using-gravitino.md
+++ b/docs/manage-fileset-metadata-using-gravitino.md
@@ -15,7 +15,9 @@ filesets to manage non-tabular data like training datasets and other raw data.
Typically, a fileset is mapped to a directory on a file system like HDFS, S3, ADLS, GCS, etc.
With the fileset managed by Gravitino, the non-tabular data can be managed as assets together with
-tabular data in Gravitino in a unified way.
+tabular data in Gravitino in a unified way. The following operations will use HDFS as an example; for other
+HCFS like S3, OSS, GCS, etc., please refer to the corresponding operations [hadoop-with-s3](./hadoop-catalog-with-s3.md), [hadoop-with-oss](./hadoop-catalog-with-oss.md), [hadoop-with-gcs](./hadoop-catalog-with-gcs.md) and
+[hadoop-with-adls](./hadoop-catalog-with-adls.md).
After a fileset is created, users can easily access, manage the files/directories through the fileset's identifier, without needing to know the physical path of the managed dataset. Also, with
@@ -53,24 +55,6 @@ curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
}
}' http://localhost:8090/api/metalakes/metalake/catalogs
-# create a S3 catalog
-curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
--H "Content-Type: application/json" -d '{
- "name": "catalog",
- "type": "FILESET",
- "comment": "comment",
- "provider": "hadoop",
- "properties": {
- "location": "s3a://bucket/root",
- "s3-access-key-id": "access_key",
- "s3-secret-access-key": "secret_key",
- "s3-endpoint": "http://s3.ap-northeast-1.amazonaws.com",
- "filesystem-providers": "s3"
- }
-}' http://localhost:8090/api/metalakes/metalake/catalogs
-
-# For others HCFS like GCS, OSS, etc., the properties should be set accordingly. please refer to
-# The following link about the catalog properties.
```
</TabItem>
@@ -93,25 +77,8 @@ Catalog catalog = gravitinoClient.createCatalog("catalog",
"hadoop", // provider, Gravitino only supports "hadoop" for now.
"This is a Hadoop fileset catalog",
properties);
-
-// create a S3 catalog
-s3Properties = ImmutableMap.<String, String>builder()
- .put("location", "s3a://bucket/root")
- .put("s3-access-key-id", "access_key")
- .put("s3-secret-access-key", "secret_key")
- .put("s3-endpoint", "http://s3.ap-northeast-1.amazonaws.com")
- .put("filesystem-providers", "s3")
- .build();
-
-Catalog s3Catalog = gravitinoClient.createCatalog("catalog",
- Type.FILESET,
- "hadoop", // provider, Gravitino only supports "hadoop" for now.
- "This is a S3 fileset catalog",
- s3Properties);
// ...
-// For others HCFS like GCS, OSS, etc., the properties should be set accordingly. please refer to
-// The following link about the catalog properties.
```
</TabItem>
@@ -124,23 +91,6 @@ catalog = gravitino_client.create_catalog(name="catalog",
                                          provider="hadoop",
                                          comment="This is a Hadoop fileset catalog",
                                          properties={"location": "/tmp/test1"})
-
-# create a S3 catalog
-s3_properties = {
- "location": "s3a://bucket/root",
- "s3-access-key-id": "access_key"
- "s3-secret-access-key": "secret_key",
- "s3-endpoint": "http://s3.ap-northeast-1.amazonaws.com"
-}
-
-s3_catalog = gravitino_client.create_catalog(name="catalog",
- type=Catalog.Type.FILESET,
- provider="hadoop",
-                                             comment="This is a S3 fileset catalog",
- properties=s3_properties)
-
-# For others HCFS like GCS, OSS, etc., the properties should be set accordingly. please refer to
-# The following link about the catalog properties.
```
</TabItem>
@@ -371,11 +321,8 @@ The `storageLocation` is the physical location of the fileset. Users can specify
when creating a fileset, or follow the rules of the catalog/schema location if not specified.
The value of `storageLocation` depends on the configuration settings of the catalog:
-- If this is a S3 fileset catalog, the `storageLocation` should be in the format of `s3a://bucket-name/path/to/fileset`.
-- If this is an OSS fileset catalog, the `storageLocation` should be in the format of `oss://bucket-name/path/to/fileset`.
- If this is a local fileset catalog, the `storageLocation` should be in the format of `file:///path/to/fileset`.
- If this is a HDFS fileset catalog, the `storageLocation` should be in the format of `hdfs://namenode:port/path/to/fileset`.
-- If this is a GCS fileset catalog, the `storageLocation` should be in the format of `gs://bucket-name/path/to/fileset`.
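The location formats listed above differ only in their URI scheme. As a quick illustration (the helper below uses only the Python standard library and is not a Gravitino API):

```python
# Pull the scheme out of a fileset storageLocation with the stdlib.
# Illustrative only; Gravitino validates locations on its own.
from urllib.parse import urlparse

def storage_scheme(location):
    return urlparse(location).scheme

assert storage_scheme("hdfs://namenode:9000/path/to/fileset") == "hdfs"
assert storage_scheme("file:///path/to/fileset") == "file"
```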
For a `MANAGED` fileset, the storage location is: