yuqi1129 commented on code in PR #9878:
URL: https://github.com/apache/gravitino/pull/9878#discussion_r2799464683


##########
docs/jdbc-clickhouse-catalog.md:
##########
@@ -0,0 +1,299 @@
+---
+title: "ClickHouse catalog"
+slug: /jdbc-clickhouse-catalog
+keywords:
+- jdbc
+- clickhouse
+- metadata
+license: "This software is licensed under the Apache License version 2.0."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Introduction
+
+Apache Gravitino can manage ClickHouse metadata through a JDBC catalog. This 
document describes the capabilities and limitations of the ClickHouse catalog, 
as well as the supported operations and properties for catalogs, schemas, and 
tables.
+
+:::caution
+Gravitino writes system markers into table comments (for example `(From 
Gravitino, DO NOT EDIT: ...)`). Do not change or remove them.
+:::
+
+## Catalog
+
+### Catalog capabilities
+
+| Item            | Description                                                
             |
+|-----------------|-------------------------------------------------------------------------|
+| Scope           | One catalog maps to one ClickHouse instance                
             |
+| Metadata/DDL    | Supports JDBC-based metadata management and DDL            
             |
+| Column defaults | Supports column default values                             
             |
+| Drivers         | Requires user-provided ClickHouse JDBC driver in 
`catalogs-contrib/catalog-jdbc-clickhouse/libs` |
+
+### Catalog properties
+
+You can pass any JDBC pool property that Gravitino does not define by adding 
the `gravitino.bypass.` prefix (for example `gravitino.bypass.maxWaitMillis`). 
See [commons-dbcp 
configuration](https://commons.apache.org/proper/commons-dbcp/configuration.html)
 for details.
+
+When using the JDBC catalog you must provide `jdbc-url`, `jdbc-driver`, 
`jdbc-user`, and `jdbc-password`. Common catalog properties are listed 
[here](./gravitino-server-config.md#apache-gravitino-catalog-properties-configuration);
 ClickHouse adds no extra catalog-scoped keys.
+
+| Configuration item      | Description                                        
                         | Default value | Required | Since Version |
+|-------------------------|-----------------------------------------------------------------------------|---------------|----------|---------------|
+| `jdbc-url`              | JDBC URL, for example 
`jdbc:clickhouse://localhost:8123`                    | (none)        | Yes     
 | 1.2.0         |
+| `jdbc-driver`           | JDBC driver class, for example 
`com.clickhouse.jdbc.ClickHouseDriver`       | (none)        | Yes      | 1.2.0 
        |
+| `jdbc-user`             | JDBC user name                                     
                         | (none)        | Yes      | 1.2.0         |
+| `jdbc-password`         | JDBC password                                      
                         | (none)        | Yes      | 1.2.0         |
+| `jdbc.pool.min-size`    | Minimum pool size                                  
                         | `2`           | No       | 1.2.0         |
+| `jdbc.pool.max-size`    | Maximum pool size                                  
                         | `10`          | No       | 1.2.0         |
+| `jdbc.pool.max-wait-ms` | Max wait time for a connection                     
                         | `30000`       | No       | 1.2.0         |
+
+### Create a ClickHouse catalog
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+  "name": "ck",
+  "type": "RELATIONAL",
+  "comment": "ClickHouse catalog",
+  "provider": "jdbc-clickhouse",
+  "properties": {
+    "jdbc-url": "jdbc:clickhouse://localhost:8123",
+    "jdbc-driver": "com.clickhouse.jdbc.ClickHouseDriver",
+    "jdbc-user": "default",
+    "jdbc-password": "passw0rd"
+  }
+}' http://localhost:8090/api/metalakes/metalake/catalogs
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+GravitinoClient client = GravitinoClient.builder("http://localhost:8090";)
+    .withMetalake("metalake")
+    .build();
+
+Map<String, String> ckProps = ImmutableMap.<String, String>builder()
+    .put("jdbc-url", "jdbc:clickhouse://localhost:8123")
+    .put("jdbc-driver", "com.clickhouse.jdbc.ClickHouseDriver")
+    .put("jdbc-user", "default")
+    .put("jdbc-password", "passw0rd")
+    .build();
+
+Catalog catalog =
+    client.createCatalog("ck", Catalog.Type.RELATIONAL, "jdbc-clickhouse", 
"ClickHouse catalog", ckProps);
+```
+
+</TabItem>
+</Tabs>
+
+See [Manage Relational Metadata Using 
Gravitino](./manage-relational-metadata-using-gravitino.md#catalog-operations) 
for other catalog operations.
+
+## Schema
+
+### Schema capabilities
+
+| Item         | Description                                                   
                               |
+|--------------|----------------------------------------------------------------------------------------------|
+| Mapping      | Gravitino schema maps to a ClickHouse database                
                               |
+| Operations   | Create / drop (ClickHouse supports cascade drop)              
                               |
+| Comments     | Schema comments supported                                     
                               |
+| Cluster mode | Optional `ON CLUSTER` for creation when `cluster-name` is 
provided                           |
+
+### Schema properties
+
+| Property Name  | Description                                                 
                                    | Default Value | Required | Immutable | 
Since version |
+|----------------|-------------------------------------------------------------------------------------------------|---------------|----------|-----------|---------------|
+| `on-cluster`   | Use `ON CLUSTER` when creating the database                 
                                    | `false`       | No       | No        | 
1.2.0         |
+| `cluster-name` | Cluster name used with `ON CLUSTER` (must align with 
table-level cluster settings)             | (none)        | No       | No       
 | 1.2.0         |
+
+### Create a schema
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+  "name": "sales",
+  "comment": "Sales database",
+  "properties": {
+    "on-cluster": "true",
+    "cluster-name": "ck_cluster"
+  }
+}' http://localhost:8090/api/metalakes/metalake/catalogs/ck/schemas
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+Catalog catalog = client.loadCatalog("ck");
+Schema schema = catalog.asTableCatalog()
+    .createSchema("sales", "Sales database",
+        ImmutableMap.of("on-cluster", "true", "cluster-name", "ck_cluster"));
+```
+
+</TabItem>
+</Tabs>
+
+See [Manage Relational Metadata Using 
Gravitino](./manage-relational-metadata-using-gravitino.md#schema-operations) 
for more schema operations.
+
+## Table
+
+### Table capabilities
+
+| Area                | Details                                                
                                                                                
                                                                                
 |
+|---------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| Mapping             | Gravitino table maps to a ClickHouse table             
                                                                                
                                                                                
|
+| Engines             | Local engines: MergeTree family (`MergeTree` default, 
`ReplacingMergeTree`, `SummingMergeTree`, `AggregatingMergeTree`, 
`CollapsingMergeTree`, `VersionedCollapsingMergeTree`, `GraphiteMergeTree`), 
Tiny/Stripe/Log, Memory, File, Null, Set, Join, View, Buffer, KeeperMap, etc. 
Distributed engine supports cluster mode with remote database/table and 
sharding key. |
+| Ordering/Partition  | MergeTree-family requires exactly one `ORDER BY` 
column; only single-column identity `PARTITION BY` is supported on MergeTree 
engines. Other engines reject `ORDER BY`/`PARTITION BY`.                        
           |
+| Indexes             | Primary key; data-skipping indexes 
`DATA_SKIPPING_MINMAX` and `DATA_SKIPPING_BLOOM_FILTER` (fixed granularities).  
                                                                                
                    |
+| Distribution        | Gravitino enforces `Distributions.NONE`; no custom 
distribution strategies.                                                        
                                                                                
   |
+| Column defaults     | Supported.                                             
                                                                                
                                                                                
|
+| Unsupported         | Engine change after creation; removing table 
properties; auto-increment columns.                                             
                                                                                
          |
+
+### Table column types
+
+| Gravitino Type      | ClickHouse Type                        |
+|---------------------|----------------------------------------|
+| `Byte`              | `Int8`                                 |
+| `Unsigned Byte`     | `UInt8`                                |
+| `Short`             | `Int16`                                |
+| `Unsigned Short`    | `UInt16`                               |
+| `Integer`           | `Int32`                                |
+| `Unsigned Integer`  | `UInt32`                               |
+| `Long`              | `Int64`                                |
+| `Unsigned Long`     | `UInt64`                               |
+| `Float`             | `Float32`                              |
+| `Double`            | `Float64`                              |
+| `Decimal(p,s)`      | `Decimal(p,s)`                         |
+| `String`/`VarChar`  | `String`                               |
+| `FixedChar(n)`      | `FixedString(n)`                       |
+| `Date`              | `Date`                                 |
+| `Timestamp[(p)]`    | `DateTime` (precision defaults to `0`) |
+| `BOOLEAN`           | `Bool`                                 |
+| `UUID`              | `UUID`                                 |
+
+Other ClickHouse types are exposed as [External 
Type](./manage-relational-metadata-using-gravitino.md#external-type).
+
+### Table properties
+
+:::note
+- `settings.*` keys are passed to the ClickHouse `SETTINGS` clause verbatim.  
+- The `engine` value is immutable after creation.
+:::
+
+| Property Name              | Description                                     
                                                         | Default Value | 
Required | Reserved | Immutable | Since version |
+|----------------------------|----------------------------------------------------------------------------------------------------------|---------------|----------|----------|-----------|---------------|
+| `engine`                   | Table engine (for example `MergeTree`, 
`ReplacingMergeTree`, `Distributed`, `Memory`, etc.)              | `MergeTree` 
  | No       | No       | Yes       | 1.2.0         |
+| `cluster-name`             | Cluster name used with `ON CLUSTER` and 
Distributed engine                                               | (none)       
 | No\*     | No       | No        | 1.2.0         |
+| `on-cluster`               | Use `ON CLUSTER` when creating the table        
                                                         | (none)        | No   
    | No       | No        | 1.2.0         |
+| `cluster-remote-database`  | Remote database for `Distributed` engine        
                                                         | (none)        | 
No\*\*   | No       | No        | 1.2.0         |
+| `cluster-remote-table`     | Remote table for `Distributed` engine           
                                                         | (none)        | 
No\*\*   | No       | No        | 1.2.0         |
+| `cluster-sharding-key`     | Sharding key for `Distributed` engine 
(expression allowed; referenced columns must be non-null integral) | (none)     
   | No\*\*   | No       | No        | 1.2.0         |
+| `settings.<name>`          | ClickHouse engine setting forwarded as 
`SETTINGS <name>=<value>`                                         | (none)      
  | No       | No       | No        | 1.2.0         |
+
+\* Required when `on-cluster=true` or `engine=Distributed`.  
+\*\* Required when `engine=Distributed`.
+
+### Table indexes
+
+- `PRIMARY_KEY`
+- Data-skipping indexes:
+  - `DATA_SKIPPING_MINMAX` (`GRANULARITY` fixed to 1)
+  - `DATA_SKIPPING_BLOOM_FILTER` (`GRANULARITY` fixed to 3)
+
+### Partitioning, sorting, and distribution
+
+- `ORDER BY`: required for MergeTree-family engines (single column); rejected 
for engines that do not support it.
+- `PARTITION BY`: single-column identity only, and only for MergeTree-family 
engines.
+- Distribution: fixed to `Distributions.NONE` (no custom distribution).
+

Review Comment:
   added



##########
docs/jdbc-clickhouse-catalog.md:
##########
@@ -0,0 +1,299 @@
+---
+title: "ClickHouse catalog"
+slug: /jdbc-clickhouse-catalog
+keywords:
+- jdbc
+- clickhouse
+- metadata
+license: "This software is licensed under the Apache License version 2.0."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Introduction
+
+Apache Gravitino can manage ClickHouse metadata through a JDBC catalog. This 
document describes the capabilities and limitations of the ClickHouse catalog, 
as well as the supported operations and properties for catalogs, schemas, and 
tables.
+
+:::caution
+Gravitino writes system markers into table comments (for example `(From 
Gravitino, DO NOT EDIT: ...)`). Do not change or remove them.
+:::
+
+## Catalog
+
+### Catalog capabilities
+
+| Item            | Description                                                
             |
+|-----------------|-------------------------------------------------------------------------|
+| Scope           | One catalog maps to one ClickHouse instance                
             |
+| Metadata/DDL    | Supports JDBC-based metadata management and DDL            
             |
+| Column defaults | Supports column default values                             
             |
+| Drivers         | Requires user-provided ClickHouse JDBC driver in 
`catalogs-contrib/catalog-jdbc-clickhouse/libs` |
+
+### Catalog properties
+
+You can pass any JDBC pool property that Gravitino does not define by adding 
the `gravitino.bypass.` prefix (for example `gravitino.bypass.maxWaitMillis`). 
See [commons-dbcp 
configuration](https://commons.apache.org/proper/commons-dbcp/configuration.html)
 for details.
+
+When using the JDBC catalog you must provide `jdbc-url`, `jdbc-driver`, 
`jdbc-user`, and `jdbc-password`. Common catalog properties are listed 
[here](./gravitino-server-config.md#apache-gravitino-catalog-properties-configuration);
 ClickHouse adds no extra catalog-scoped keys.
+
+| Configuration item      | Description                                        
                         | Default value | Required | Since Version |
+|-------------------------|-----------------------------------------------------------------------------|---------------|----------|---------------|
+| `jdbc-url`              | JDBC URL, for example 
`jdbc:clickhouse://localhost:8123`                    | (none)        | Yes     
 | 1.2.0         |
+| `jdbc-driver`           | JDBC driver class, for example 
`com.clickhouse.jdbc.ClickHouseDriver`       | (none)        | Yes      | 1.2.0 
        |
+| `jdbc-user`             | JDBC user name                                     
                         | (none)        | Yes      | 1.2.0         |
+| `jdbc-password`         | JDBC password                                      
                         | (none)        | Yes      | 1.2.0         |
+| `jdbc.pool.min-size`    | Minimum pool size                                  
                         | `2`           | No       | 1.2.0         |
+| `jdbc.pool.max-size`    | Maximum pool size                                  
                         | `10`          | No       | 1.2.0         |
+| `jdbc.pool.max-wait-ms` | Max wait time for a connection                     
                         | `30000`       | No       | 1.2.0         |
+
+### Create a ClickHouse catalog
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+  "name": "ck",
+  "type": "RELATIONAL",
+  "comment": "ClickHouse catalog",
+  "provider": "jdbc-clickhouse",
+  "properties": {
+    "jdbc-url": "jdbc:clickhouse://localhost:8123",
+    "jdbc-driver": "com.clickhouse.jdbc.ClickHouseDriver",
+    "jdbc-user": "default",
+    "jdbc-password": "passw0rd"
+  }
+}' http://localhost:8090/api/metalakes/metalake/catalogs
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+GravitinoClient client = GravitinoClient.builder("http://localhost:8090";)
+    .withMetalake("metalake")
+    .build();
+
+Map<String, String> ckProps = ImmutableMap.<String, String>builder()
+    .put("jdbc-url", "jdbc:clickhouse://localhost:8123")
+    .put("jdbc-driver", "com.clickhouse.jdbc.ClickHouseDriver")
+    .put("jdbc-user", "default")
+    .put("jdbc-password", "passw0rd")
+    .build();
+
+Catalog catalog =
+    client.createCatalog("ck", Catalog.Type.RELATIONAL, "jdbc-clickhouse", 
"ClickHouse catalog", ckProps);
+```
+
+</TabItem>
+</Tabs>
+
+See [Manage Relational Metadata Using 
Gravitino](./manage-relational-metadata-using-gravitino.md#catalog-operations) 
for other catalog operations.
+
+## Schema
+
+### Schema capabilities
+
+| Item         | Description                                                   
                               |
+|--------------|----------------------------------------------------------------------------------------------|
+| Mapping      | Gravitino schema maps to a ClickHouse database                
                               |
+| Operations   | Create / drop (ClickHouse supports cascade drop)              
                               |
+| Comments     | Schema comments supported                                     
                               |
+| Cluster mode | Optional `ON CLUSTER` for creation when `cluster-name` is 
provided                           |
+
+### Schema properties
+
+| Property Name  | Description                                                 
                                    | Default Value | Required | Immutable | 
Since version |
+|----------------|-------------------------------------------------------------------------------------------------|---------------|----------|-----------|---------------|
+| `on-cluster`   | Use `ON CLUSTER` when creating the database                 
                                    | `false`       | No       | No        | 
1.2.0         |
+| `cluster-name` | Cluster name used with `ON CLUSTER` (must align with 
table-level cluster settings)             | (none)        | No       | No       
 | 1.2.0         |
+
+### Create a schema
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+  "name": "sales",
+  "comment": "Sales database",
+  "properties": {
+    "on-cluster": "true",
+    "cluster-name": "ck_cluster"
+  }
+}' http://localhost:8090/api/metalakes/metalake/catalogs/ck/schemas
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+Catalog catalog = client.loadCatalog("ck");
+Schema schema = catalog.asTableCatalog()
+    .createSchema("sales", "Sales database",
+        ImmutableMap.of("on-cluster", "true", "cluster-name", "ck_cluster"));
+```
+
+</TabItem>
+</Tabs>
+
+See [Manage Relational Metadata Using 
Gravitino](./manage-relational-metadata-using-gravitino.md#schema-operations) 
for more schema operations.
+
+## Table
+
+### Table capabilities
+
+| Area                | Details                                                
                                                                                
                                                                                
 |
+|---------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| Mapping             | Gravitino table maps to a ClickHouse table             
                                                                                
                                                                                
|
+| Engines             | Local engines: MergeTree family (`MergeTree` default, 
`ReplacingMergeTree`, `SummingMergeTree`, `AggregatingMergeTree`, 
`CollapsingMergeTree`, `VersionedCollapsingMergeTree`, `GraphiteMergeTree`), 
Tiny/Stripe/Log, Memory, File, Null, Set, Join, View, Buffer, KeeperMap, etc. 
Distributed engine supports cluster mode with remote database/table and 
sharding key. |
+| Ordering/Partition  | MergeTree-family requires exactly one `ORDER BY` 
column; only single-column identity `PARTITION BY` is supported on MergeTree 
engines. Other engines reject `ORDER BY`/`PARTITION BY`.                        
           |
+| Indexes             | Primary key; data-skipping indexes 
`DATA_SKIPPING_MINMAX` and `DATA_SKIPPING_BLOOM_FILTER` (fixed granularities).  
                                                                                
                    |
+| Distribution        | Gravitino enforces `Distributions.NONE`; no custom 
distribution strategies.                                                        
                                                                                
   |
+| Column defaults     | Supported.                                             
                                                                                
                                                                                
|
+| Unsupported         | Engine change after creation; removing table 
properties; auto-increment columns.                                             
                                                                                
          |
+
+### Table column types
+
+| Gravitino Type      | ClickHouse Type                        |
+|---------------------|----------------------------------------|
+| `Byte`              | `Int8`                                 |
+| `Unsigned Byte`     | `UInt8`                                |
+| `Short`             | `Int16`                                |
+| `Unsigned Short`    | `UInt16`                               |
+| `Integer`           | `Int32`                                |
+| `Unsigned Integer`  | `UInt32`                               |
+| `Long`              | `Int64`                                |
+| `Unsigned Long`     | `UInt64`                               |
+| `Float`             | `Float32`                              |
+| `Double`            | `Float64`                              |
+| `Decimal(p,s)`      | `Decimal(p,s)`                         |
+| `String`/`VarChar`  | `String`                               |
+| `FixedChar(n)`      | `FixedString(n)`                       |
+| `Date`              | `Date`                                 |
+| `Timestamp[(p)]`    | `DateTime` (precision defaults to `0`) |
+| `BOOLEAN`           | `Bool`                                 |
+| `UUID`              | `UUID`                                 |
+
+Other ClickHouse types are exposed as [External 
Type](./manage-relational-metadata-using-gravitino.md#external-type).
+
+### Table properties
+
+:::note
+- `settings.*` keys are passed to the ClickHouse `SETTINGS` clause verbatim.  
+- The `engine` value is immutable after creation.
+:::
+
+| Property Name              | Description                                     
                                                         | Default Value | 
Required | Reserved | Immutable | Since version |
+|----------------------------|----------------------------------------------------------------------------------------------------------|---------------|----------|----------|-----------|---------------|
+| `engine`                   | Table engine (for example `MergeTree`, 
`ReplacingMergeTree`, `Distributed`, `Memory`, etc.)              | `MergeTree` 
  | No       | No       | Yes       | 1.2.0         |
+| `cluster-name`             | Cluster name used with `ON CLUSTER` and 
Distributed engine                                               | (none)       
 | No\*     | No       | No        | 1.2.0         |
+| `on-cluster`               | Use `ON CLUSTER` when creating the table        
                                                         | (none)        | No   
    | No       | No        | 1.2.0         |
+| `cluster-remote-database`  | Remote database for `Distributed` engine        
                                                         | (none)        | 
No\*\*   | No       | No        | 1.2.0         |
+| `cluster-remote-table`     | Remote table for `Distributed` engine           
                                                         | (none)        | 
No\*\*   | No       | No        | 1.2.0         |
+| `cluster-sharding-key`     | Sharding key for `Distributed` engine 
(expression allowed; referenced columns must be non-null integral) | (none)     
   | No\*\*   | No       | No        | 1.2.0         |
+| `settings.<name>`          | ClickHouse engine setting forwarded as 
`SETTINGS <name>=<value>`                                         | (none)      
  | No       | No       | No        | 1.2.0         |
+
+\* Required when `on-cluster=true` or `engine=Distributed`.  
+\*\* Required when `engine=Distributed`.
+
+### Table indexes
+
+- `PRIMARY_KEY`
+- Data-skipping indexes:
+  - `DATA_SKIPPING_MINMAX` (`GRANULARITY` fixed to 1)
+  - `DATA_SKIPPING_BLOOM_FILTER` (`GRANULARITY` fixed to 3)
+
+### Partitioning, sorting, and distribution
+
+- `ORDER BY`: required for MergeTree-family engines (single column); rejected 
for engines that do not support it.
+- `PARTITION BY`: single-column identity only, and only for MergeTree-family 
engines.
+- Distribution: fixed to `Distributions.NONE` (no custom distribution).
+
+### Create a table
+
+The following example creates a `MergeTree` table with `ORDER BY`, 
partitioning, indexes, comments, and properties including `ON CLUSTER`. Note 
that the `engine` property is required for MergeTree-family tables, and that 
the cluster properties must align with the schema-level cluster settings if 
`on-cluster=true`.
+

Review Comment:
   added



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to