This is an automated email from the ASF dual-hosted git repository.
morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new aa512100576 [fix] missing kafka catalog en (#3345)
aa512100576 is described below
commit aa5121005764fa9cdd78336b95c43e764a9fb52d
Author: Mingyu Chen (Rayner) <[email protected]>
AuthorDate: Sat Feb 7 17:59:10 2026 +0800
[fix] missing kafka catalog en (#3345)
## Versions
- [x] dev
- [x] 4.x
- [x] 3.x
- [ ] 2.1
## Languages
- [x] Chinese
- [x] English
## Docs Checklist
- [ ] Checked by AI
- [ ] Test Cases Built
---
docs/lakehouse/catalogs/bigquery-catalog.md | 3 -
docs/lakehouse/catalogs/delta-lake-catalog.md | 7 +-
docs/lakehouse/catalogs/kafka-catalog.md | 358 +++++++++++++++++++++
docs/lakehouse/catalogs/kudu-catalog.md | 3 -
.../current/lakehouse/catalogs/bigquery-catalog.md | 3 -
.../lakehouse/catalogs/delta-lake-catalog.md | 7 +-
.../current/lakehouse/catalogs/kafka-catalog.md | 3 -
.../current/lakehouse/catalogs/kudu-catalog.md | 3 -
.../lakehouse/catalogs/bigquery-catalog.md | 3 -
.../lakehouse/catalogs/delta-lake-catalog.md | 214 ++++++------
.../lakehouse/catalogs/kafka-catalog.md | 3 -
.../version-3.x/lakehouse/catalogs/kudu-catalog.md | 3 -
.../lakehouse/catalogs/bigquery-catalog.md | 3 -
.../lakehouse/catalogs/delta-lake-catalog.md | 214 ++++++------
.../lakehouse/catalogs/kafka-catalog.md | 3 -
.../version-4.x/lakehouse/catalogs/kudu-catalog.md | 3 -
.../lakehouse/catalogs/bigquery-catalog.md | 3 -
.../lakehouse/catalogs/delta-lake-catalog.md | 214 ++++++------
.../lakehouse/catalogs/kafka-catalog.md | 358 +++++++++++++++++++++
.../version-3.x/lakehouse/catalogs/kudu-catalog.md | 3 -
.../lakehouse/catalogs/bigquery-catalog.md | 3 -
.../lakehouse/catalogs/delta-lake-catalog.md | 214 ++++++------
.../lakehouse/catalogs/kafka-catalog.md | 358 +++++++++++++++++++++
.../version-4.x/lakehouse/catalogs/kudu-catalog.md | 3 -
24 files changed, 1562 insertions(+), 427 deletions(-)
diff --git a/docs/lakehouse/catalogs/bigquery-catalog.md
b/docs/lakehouse/catalogs/bigquery-catalog.md
index 9422f89fe3d..e96453a3a4e 100644
--- a/docs/lakehouse/catalogs/bigquery-catalog.md
+++ b/docs/lakehouse/catalogs/bigquery-catalog.md
@@ -12,9 +12,6 @@ BigQuery Catalog uses the Trino BigQuery Connector to access
BigQuery tables thr
:::note
- This feature is experimental and supported since version 3.0.1.
-:::
-
-:::note
- This feature does not depend on a Trino cluster environment and only uses
the Trino compatibility plugin.
:::
diff --git a/docs/lakehouse/catalogs/delta-lake-catalog.md
b/docs/lakehouse/catalogs/delta-lake-catalog.md
index 7c3f267e214..569505d2229 100644
--- a/docs/lakehouse/catalogs/delta-lake-catalog.md
+++ b/docs/lakehouse/catalogs/delta-lake-catalog.md
@@ -11,11 +11,8 @@
Delta Lake Catalog uses the [Trino
Connector](https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide/)
compatibility framework with Trino Delta Lake Connector to access Delta Lake
tables.
:::note
-This is an experimental feature, supported since version 3.0.1.
-:::
-
-:::note
-This feature does not depend on a Trino cluster environment, it only uses the
Trino compatibility plugin.
+- This is an experimental feature, supported since version 3.0.1.
+- This feature does not depend on a Trino cluster environment; it only uses
the Trino compatibility plugin.
:::
### Use Cases
diff --git a/docs/lakehouse/catalogs/kafka-catalog.md
b/docs/lakehouse/catalogs/kafka-catalog.md
index e69de29bb2d..0c71183434e 100644
--- a/docs/lakehouse/catalogs/kafka-catalog.md
+++ b/docs/lakehouse/catalogs/kafka-catalog.md
@@ -0,0 +1,358 @@
+---
+{
+ "title": "Kafka Catalog",
+ "language": "en",
+ "description": "Apache Doris Kafka Catalog guide: Connect to Kafka data
streams through Trino Connector framework to query and integrate Kafka Topic
data. Supports Schema Registry, multiple data formats for quick Kafka and Doris
data integration."
+}
+---
+
+## Overview
+
+Kafka Catalog uses the Trino Kafka Connector through the [Trino
Connector](https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide/)
compatibility framework to access Kafka Topic data.
+
+:::note
+- This is an experimental feature, supported since version 3.0.1.
+- This feature does not depend on a Trino cluster environment; it only uses
Trino-compatible plugins.
+:::
+
+### Use Cases
+
+| Scenario | Support Status |
+| -------- | -------------- |
+| Data Integration | Read Kafka Topic data and write to Doris internal tables |
+| Data Write-back | Not supported |
+
+### Version Compatibility
+
+- **Doris Version**: 3.0.1 and above
+- **Trino Connector Version**: 435
+- **Kafka Version**: For supported versions, please refer to [Trino
Documentation](https://trino.io/docs/435/connector/kafka.html)
+
+## Quick Start
+
+### Step 1: Prepare Connector Plugin
+
+You can obtain the Kafka Connector plugin using one of the following methods:
+
+**Method 1: Use Pre-compiled Package (Recommended)**
+
+Download and extract the pre-compiled plugin package from
[here](https://github.com/apache/doris-thirdparty/releases/tag/trino-435-20240724).
+
+**Method 2: Manual Compilation**
+
+If you need custom compilation, follow these steps (requires JDK 17):
+
+```shell
+git clone https://github.com/apache/doris-thirdparty.git
+cd doris-thirdparty
+git checkout trino-435
+cd plugin/trino-kafka
+mvn clean package -Dmaven.test.skip=true
+```
+
+After compilation, you will get the `trino-kafka-435/` directory under
`trino/plugin/trino-kafka/target/`.
+
+### Step 2: Deploy Plugin
+
+1. Place the `trino-kafka-435/` directory in the `connectors/` directory of
all FE and BE deployment paths (create the directory manually if it doesn't
exist):
+
+ ```text
+ ├── bin
+ ├── conf
+ ├── plugins
+ │ ├── connectors
+ │ ├── trino-kafka-435
+ ...
+ ```
+
+ > You can also customize the plugin path by modifying the
`trino_connector_plugin_dir` configuration in `fe.conf`. For example:
`trino_connector_plugin_dir=/path/to/connectors/`
+
+2. Restart all FE and BE nodes to ensure the connector is properly loaded.
+
+### Step 3: Create Catalog
+
+**Basic Configuration**
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>,<broker2>:<port2>',
+ 'trino.kafka.table-names' = 'test_db.topic_name',
+ 'trino.kafka.hide-internal-columns' = 'false'
+);
+```
+
+**Using Configuration File**
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>,<broker2>:<port2>',
+ 'trino.kafka.config.resources' = '/path/to/kafka-client.properties',
+ 'trino.kafka.hide-internal-columns' = 'false'
+);
+```
+
+**Configure Default Schema**
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>,<broker2>:<port2>',
+ 'trino.kafka.default-schema' = 'default_db',
+ 'trino.kafka.hide-internal-columns' = 'false'
+);
+```
+
+### Step 4: Query Data
+
+After creating the catalog, you can query Kafka Topic data using one of three
methods:
+
+```sql
+-- Method 1: Switch to catalog and query
+SWITCH kafka;
+USE kafka_schema;
+SELECT * FROM topic_name LIMIT 10;
+
+-- Method 2: Use two-level path
+USE kafka.kafka_schema;
+SELECT * FROM topic_name LIMIT 10;
+
+-- Method 3: Use fully qualified name
+SELECT * FROM kafka.kafka_schema.topic_name LIMIT 10;
+```
+
+## Schema Registry Integration
+
+Kafka Catalog supports automatic schema retrieval through Confluent Schema
Registry, eliminating the need to manually define table structures.
+
+### Configure Schema Registry
+
+**Basic Authentication**
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>',
+ 'trino.kafka.table-description-supplier' = 'CONFLUENT',
+ 'trino.kafka.confluent-schema-registry-url' =
'http://<schema-registry-host>:<schema-registry-port>',
+ 'trino.kafka.confluent-schema-registry-auth-type' = 'BASIC_AUTH',
+ 'trino.kafka.confluent-schema-registry.basic-auth.username' = 'admin',
+ 'trino.kafka.confluent-schema-registry.basic-auth.password' = 'admin123',
+ 'trino.kafka.hide-internal-columns' = 'false'
+);
+```
+
+**Complete Configuration Example**
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>',
+ 'trino.kafka.default-schema' = 'nrdp',
+ 'trino.kafka.table-description-supplier' = 'CONFLUENT',
+ 'trino.kafka.confluent-schema-registry-url' =
'http://<schema-registry-host>:<schema-registry-port>',
+ 'trino.kafka.confluent-schema-registry-auth-type' = 'BASIC_AUTH',
+ 'trino.kafka.confluent-schema-registry.basic-auth.username' = 'admin',
+ 'trino.kafka.confluent-schema-registry.basic-auth.password' = 'admin123',
+ 'trino.kafka.config.resources' = '/path/to/kafka-client.properties',
+ 'trino.kafka.confluent-schema-registry-subject-mapping' =
'nrdp.topic1:NRDP.topic1',
+ 'trino.kafka.hide-internal-columns' = 'false'
+);
+```
+
+### Schema Registry Parameters
+
+| Parameter Name | Required | Default | Description |
+| -------------- | -------- | ------- | ----------- |
+| `trino.kafka.table-description-supplier` | No | - | Set to `CONFLUENT` to
enable Schema Registry support |
+| `trino.kafka.confluent-schema-registry-url` | Yes (when `table-description-supplier` is `CONFLUENT`) | - | Schema Registry service address |
+| `trino.kafka.confluent-schema-registry-auth-type` | No | NONE |
Authentication type: NONE, BASIC_AUTH, BEARER |
+| `trino.kafka.confluent-schema-registry.basic-auth.username` | No | - | Basic
Auth username |
+| `trino.kafka.confluent-schema-registry.basic-auth.password` | No | - | Basic
Auth password |
+| `trino.kafka.confluent-schema-registry-subject-mapping` | No | - | Subject
name mapping, format: `<db1>.<tbl1>:<topic_name1>,<db2>.<tbl2>:<topic_name2>` |
+
+:::tip
+When using Schema Registry, Doris will automatically retrieve Topic schema
information from Schema Registry, eliminating the need to manually create table
structures.
+:::
+
+### Subject Mapping
+
+In some cases, the Subject name registered in Schema Registry may not match
the Topic name in Kafka, preventing data queries. In such cases, you need to
manually specify the mapping relationship through
`confluent-schema-registry-subject-mapping`.
+
+```sql
+-- Map schema.topic to SCHEMA.topic Subject in Schema Registry
+'trino.kafka.confluent-schema-registry-subject-mapping' =
'<db1>.<tbl1>:<topic_name1>'
+```
+
+Where `db1` and `tbl1` are the actual Database and Table names seen in Doris,
and `topic_name1` is the actual Topic name in Kafka (case-sensitive).
+
+Multiple mappings can be separated by commas:
+
+```sql
+'trino.kafka.confluent-schema-registry-subject-mapping' =
'<db1>.<tbl1>:<topic_name1>,<db2>.<tbl2>:<topic_name2>'
+```
+
+## Configuration
+
+### Catalog Configuration Parameters
+
+The basic syntax for creating a Kafka Catalog is as follows:
+
+```sql
+CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
+ 'type' = 'trino-connector', -- Required, fixed value
+ 'trino.connector.name' = 'kafka', -- Required, fixed value
+ {TrinoProperties}, -- Trino Connector related properties
+ {CommonProperties} -- Common properties
+);
+```
+
+#### TrinoProperties Parameters
+
+TrinoProperties are used to configure Trino Kafka Connector-specific
properties, which are prefixed with `trino.`. Common parameters include:
+
+| Parameter Name | Required | Default | Description |
+| -------------- | -------- | ------- | ----------- |
+| `trino.kafka.nodes` | Yes | - | Kafka Broker node address list, format:
`host1:port1,host2:port2` |
+| `trino.kafka.table-names` | No | - | List of Topics to map, format:
`schema.topic1,schema.topic2` |
+| `trino.kafka.default-schema` | No | default | Default schema name |
+| `trino.kafka.hide-internal-columns` | No | true | Whether to hide Kafka
internal columns (such as `_partition_id`, `_partition_offset`, etc.) |
+| `trino.kafka.config.resources` | No | - | Kafka client configuration file
path |
+| `trino.kafka.table-description-supplier` | No | - | Table structure
provider, set to `CONFLUENT` to use Schema Registry |
+| `trino.kafka.confluent-schema-registry-url` | No | - | Schema Registry
service address |
+
+For more Kafka Connector configuration parameters, please refer to [Trino
Official Documentation](https://trino.io/docs/435/connector/kafka.html).
+
+#### CommonProperties Parameters
+
+CommonProperties are used to configure general catalog properties, such as
metadata refresh policies and permission control. For detailed information,
please refer to the "Common Properties" section in [Catalog
Overview](../catalog-overview.md).
+
+### Kafka Client Configuration
+
+When you need to configure advanced Kafka client parameters (such as security
authentication, SSL, etc.), you can specify them through a configuration file.
Create a configuration file (e.g., `kafka-client.properties`):
+
+```properties
+# ============================================
+# Kerberos/SASL Authentication Configuration
+# ============================================
+sasl.mechanism=GSSAPI
+sasl.kerberos.service.name=kafka
+
+# JAAS Configuration - Using keytab method
+sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required \
+ useKeyTab=true \
+ storeKey=true \
+ useTicketCache=false \
+ serviceName="kafka" \
+ keyTab="/opt/trino/security/keytabs/kafka.keytab" \
+ principal="[email protected]";
+
+# ============================================
+# Avro Deserializer Configuration
+# ============================================
+key.deserializer=io.confluent.kafka.serializers.KafkaAvroDeserializer
+value.deserializer=io.confluent.kafka.serializers.KafkaAvroDeserializer
+```
+
+Then specify the configuration file when creating the catalog:
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>',
+ 'trino.kafka.config.resources' = '/path/to/kafka-client.properties'
+);
+```
+
+## Data Type Mapping
+
+When using Kafka Catalog, data types are mapped according to the following
rules:
+
+| Kafka/Avro Type | Trino Type | Doris Type | Notes |
+| --------------- | ---------- | ---------- | ----- |
+| boolean | boolean | boolean | |
+| int | integer | int | |
+| long | bigint | bigint | |
+| float | real | float | |
+| double | double | double | |
+| bytes | varbinary | string | Use `HEX(col)` function to query |
+| string | varchar | string | |
+| array | array | array | |
+| map | map | map | |
+| record | row | struct | Complex nested structure |
+| enum | varchar | string | |
+| fixed | varbinary | string | |
+| null | - | - | |
+
+:::tip
+- For `bytes` type, use the `HEX()` function to display in hexadecimal format.
+- The data types supported by Kafka Catalog depend on the serialization format
used (JSON, Avro, Protobuf, etc.) and Schema Registry configuration.
+:::
+
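+As a small illustration of the `bytes` note above (a minimal sketch; the column name `raw_payload` and the schema/topic names are placeholders), such a column can be displayed in hexadecimal form:
+
+```sql
+-- Show a bytes/varbinary column as a hex string
+SELECT HEX(raw_payload) AS raw_payload_hex
+FROM kafka.kafka_schema.topic_name
+LIMIT 10;
+```
+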
+## Kafka Internal Columns
+
+Kafka Connector provides some internal columns to access metadata information
of Kafka messages:
+
+| Column Name | Type | Description |
+| ----------- | ---- | ----------- |
+| `_partition_id` | bigint | Partition ID where the message is located |
+| `_partition_offset` | bigint | Message offset within the partition |
+| `_message_timestamp` | timestamp | Message timestamp |
+| `_key` | varchar | Message key |
+| `_key_corrupt` | boolean | Whether the key is corrupted |
+| `_key_length` | bigint | Key byte length |
+| `_message` | varchar | Raw message content |
+| `_message_corrupt` | boolean | Whether the message is corrupted |
+| `_message_length` | bigint | Message byte length |
+| `_headers` | map | Message header information |
+
+By default, these internal columns are hidden. To query them, set the following property when creating the catalog:
+
+```sql
+'trino.kafka.hide-internal-columns' = 'false'
+```
+
+Query example:
+
+```sql
+SELECT
+ _partition_id,
+ _partition_offset,
+ _message_timestamp,
+ *
+FROM kafka.schema.topic_name
+LIMIT 10;
+```
+
+## Limitations
+
+1. **Read-only Access**: Kafka Catalog only supports reading data; write
operations (INSERT, UPDATE, DELETE) are not supported.
+
+2. **Table Names Configuration**: When not using Schema Registry, you need to
explicitly specify the list of Topics to access through the
`trino.kafka.table-names` parameter.
+
+3. **Schema Definition**:
+ - When using Schema Registry, schema information is automatically retrieved
from Schema Registry.
+ - When not using Schema Registry, you need to manually create table
definitions or use Trino's Topic description files.
+
+4. **Data Format**: Supported data formats depend on the serialization method
used by the Topic (JSON, Avro, Protobuf, etc.). For details, please refer to
[Trino Official Documentation](https://trino.io/docs/435/connector/kafka.html).
+
+5. **Performance Considerations**:
+ - Kafka Catalog reads Kafka data in real-time; querying large amounts of
data may affect performance.
+ - It is recommended to use the `LIMIT` clause or time filter conditions to
limit the amount of data scanned.
+
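+As a hedged sketch of that recommendation (assuming internal columns are exposed via `trino.kafka.hide-internal-columns` = `'false'`; schema and topic names are placeholders), a time filter plus `LIMIT` keeps the scan small:
+
+```sql
+-- Only scan recent messages and cap the number of returned rows
+SELECT _partition_id, _partition_offset, _message
+FROM kafka.kafka_schema.topic_name
+WHERE _message_timestamp >= '2024-01-01 00:00:00'
+LIMIT 100;
+```
+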
+## Feature Debugging
+
+You can refer to [this demo environment](https://github.com/morningman/demo-env/tree/main/kafka) to quickly build a Kafka environment for feature verification.
+
+## References
+
+- [Trino Kafka Connector Official
Documentation](https://trino.io/docs/435/connector/kafka.html)
+- [Trino Connector Development
Guide](https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide/)
+- [Confluent Schema Registry
Documentation](https://docs.confluent.io/platform/current/schema-registry/index.html)
\ No newline at end of file
diff --git a/docs/lakehouse/catalogs/kudu-catalog.md
b/docs/lakehouse/catalogs/kudu-catalog.md
index 57dc1b19e90..de31ed9c638 100644
--- a/docs/lakehouse/catalogs/kudu-catalog.md
+++ b/docs/lakehouse/catalogs/kudu-catalog.md
@@ -12,9 +12,6 @@ Kudu Catalog accesses Kudu tables through the [Trino
Connector](https://doris.ap
:::note
- This is an experimental feature, supported since version 3.0.1.
-:::
-
-:::note
- This feature does not depend on a Trino cluster environment and only uses
the Trino compatibility plugin.
:::
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/bigquery-catalog.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/bigquery-catalog.md
index 534d9e09cce..594d1cfb8fc 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/bigquery-catalog.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/bigquery-catalog.md
@@ -12,9 +12,6 @@ BigQuery Catalog 通过 [Trino
Connector](https://doris.apache.org/zh-CN/communi
:::note
- 该功能为实验功能,自 3.0.1 版本开始支持。
-:::
-
-:::note
- 该功能不依赖 Trino 集群环境,仅使用 Trino 兼容插件。
:::
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/delta-lake-catalog.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/delta-lake-catalog.md
index 539783f5ea0..bc591d1afb2 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/delta-lake-catalog.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/delta-lake-catalog.md
@@ -11,11 +11,8 @@
Delta Lake Catalog 通过 [Trino
Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide/)
兼容框架,使用 Trino Delta Lake Connector 来访问 Delta Lake 表。
:::note
-该功能为实验功能,自 3.0.1 版本开始支持。
-:::
-
-:::note
-该功能不依赖 Trino 集群环境,仅使用 Trino 兼容插件。
+- 该功能为实验功能,自 3.0.1 版本开始支持。
+- 该功能不依赖 Trino 集群环境,仅使用 Trino 兼容插件。
:::
### 适用场景
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/kafka-catalog.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/kafka-catalog.md
index 1d2ad002e57..aeb7d3f36b5 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/kafka-catalog.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/kafka-catalog.md
@@ -12,9 +12,6 @@ Kafka Catalog 通过 [Trino
Connector](https://doris.apache.org/zh-CN/community/
:::note
- 该功能为实验功能,自 3.0.1 版本开始支持。
-:::
-
-:::note
- 该功能不依赖 Trino 集群环境,仅使用 Trino 兼容插件。
:::
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/kudu-catalog.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/kudu-catalog.md
index 71eb6bac592..76e005ac014 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/kudu-catalog.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/kudu-catalog.md
@@ -12,9 +12,6 @@ Kudu Catalog 通过 [Trino
Connector](https://doris.apache.org/zh-CN/community/h
:::note
- 该功能为实验功能,自 3.0.1 版本开始支持。
-:::
-
-:::note
- 该功能不依赖 Trino 集群环境,仅使用 Trino 兼容插件。
:::
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/bigquery-catalog.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/bigquery-catalog.md
index 534d9e09cce..594d1cfb8fc 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/bigquery-catalog.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/bigquery-catalog.md
@@ -12,9 +12,6 @@ BigQuery Catalog 通过 [Trino
Connector](https://doris.apache.org/zh-CN/communi
:::note
- 该功能为实验功能,自 3.0.1 版本开始支持。
-:::
-
-:::note
- 该功能不依赖 Trino 集群环境,仅使用 Trino 兼容插件。
:::
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/delta-lake-catalog.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/delta-lake-catalog.md
index 078d52ac330..bc591d1afb2 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/delta-lake-catalog.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/delta-lake-catalog.md
@@ -2,140 +2,168 @@
{
"title": "Delta Lake Catalog",
"language": "zh-CN",
- "description": "Delta Lake Catalog 通过 Trino Connector 兼容框架,使用 Delta Lake
Connector 来访问 Delta Lake 表。"
+ "description": "Apache Doris Delta Lake Catalog 使用指南:通过 Trino Connector
框架连接 Delta Lake 数据湖,实现 Delta Lake 表数据的查询和集成。支持 Hive Metastore、多种数据类型映射,快速完成
Delta Lake 与 Doris 的数据集成。"
}
---
-Delta Lake Catalog 通过 [Trino
Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide/)
兼容框架,使用 Delta Lake Connector 来访问 Delta Lake 表。
+## 概述
+
+Delta Lake Catalog 通过 [Trino
Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide/)
兼容框架,使用 Trino Delta Lake Connector 来访问 Delta Lake 表。
:::note
-该功能为实验功能,自 3.0.1 版本开始支持。
+- 该功能为实验功能,自 3.0.1 版本开始支持。
+- 该功能不依赖 Trino 集群环境,仅使用 Trino 兼容插件。
:::
-## 适用场景
-
-| 场景 | 说明 |
-| ---- | ------------------------------ |
-| 数据集成 | 读取 Detla Lake 数据并写入到 Doris 内表。 |
-| 数据写回 | 不支持。 |
+### 适用场景
-## 环境准备
+| 场景 | 支持情况 |
+| -------- | -------------------------------------- |
+| 数据集成 | 读取 Delta Lake 数据并写入到 Doris 内表 |
+| 数据写回 | 不支持 |
-### 编译 Delta Lake Connector 插件
+### 版本兼容性
-> 需要 JDK 17 版本。
+- **Doris 版本**:3.0.1 及以上
+- **Trino Connector 版本**:435
+- **Delta Lake 版本**:具体支持的版本请参考 [Trino
文档](https://trino.io/docs/435/connector/delta-lake.html)
-```shell
-$ git clone https://github.com/apache/doris-thirdparty.git
-$ cd doris-thirdparty
-$ git checkout trino-435
-$ cd plugin/trino-delta-lake
-$ mvn clean install -DskipTest
-$ cd ../../lib/trino-hdfs
-$ mvn clean install -DskipTest
-```
+## 快速开始
-完成编译后,会在 `trino/plugin/trino-delta-lake/target/` 下得到 `trino-delta-lake-435`
目录,在 `trino/lib/trino-hdfs/target/` 下得到 `hdfs` 目录
+### 步骤 1:准备 Connector 插件
-也可以直接下载预编译的
[trino-delta-lake-435-20240724.tar.gz](https://github.com/apache/Doris-thirdparty/releases/download/trino-435-20240724/trino-delta-lake-435-20240724.tar.gz)
及
[hdfs.tar.gz](https://github.com/apache/doris-thirdparty/releases/download/trino-435-20240724/trino-hdfs-435-20240724.tar.gz)
并解压。
+你可以选择以下两种方式之一来获取 Delta Lake Connector 插件:
-### 部署 Delta Lake Connector
-
-将 `trino-delta-lake-435/` 目录放到所有 FE 和 BE 部署路径的 `connectors/`
目录下(如果没有,可以手动创建),将 `hdfs.tar.gz` 解压到 `trino-delta-lake-435/` 目录下。
-
-```text
-├── bin
-├── conf
-├── connectors
-│ ├── trino-delta-lake-435
-│ │ ├── hdfs
-...
-```
+**方式一:使用预编译包(推荐)**
-部署完成后,建议重启 FE、BE 节点以确保 Connector 可以被正确加载。
+直接下载预编译的
[trino-delta-lake-435-20240724.tar.gz](https://github.com/apache/Doris-thirdparty/releases/download/trino-435-20240724/trino-delta-lake-435-20240724.tar.gz)
及
[hdfs.tar.gz](https://github.com/apache/doris-thirdparty/releases/download/trino-435-20240724/trino-hdfs-435-20240724.tar.gz)
并解压。
-## 配置 Catalog
+**方式二:手动编译**
-### 语法
+如果需要自定义编译,按照以下步骤操作(需要 JDK 17):
-```sql
-CREATE CATALOG [IF NOT EXISTS] catalog_name
-PROPERTIES (
- 'type' = 'trino-connector', -- required
- 'trino.connector.name' = 'delta_lake', -- required
- {TrinoProperties},
- {CommonProperties}
-);
+```shell
+git clone https://github.com/apache/doris-thirdparty.git
+cd doris-thirdparty
+git checkout trino-435
+cd plugin/trino-delta-lake
+mvn clean install -DskipTests
+cd ../../lib/trino-hdfs
+mvn clean install -DskipTests
```
-* `{TrinoProperties}`
-
- TrinoProperties 部分用于填写将传递给 Trino Connector 的属性,这些属性以`trino.`为前缀。理论上,Trino
支持的属性这里都支持,更多有关 Delta Lake 的信息可以参考 [Trino
文档](https://trino.io/docs/435/connector/delta-lake.html)。
-
-* `[CommonProperties]`
-
- CommonProperties 部分用于填写通用属性。请参阅[ 数据目录概述 ](../catalog-overview.md)中【通用属性】部分。
+完成编译后,会在 `trino/plugin/trino-delta-lake/target/` 下得到 `trino-delta-lake-435`
目录,在 `trino/lib/trino-hdfs/target/` 下得到 `hdfs` 目录。
-### 支持的 Delta Lake 版本
+### 步骤 2:部署插件
-更多有关 Delta Lake 的信息可以参考 [Trino
文档](https://trino.io/docs/435/connector/delta-lake.html)。
+1. 将 `trino-delta-lake-435/` 目录放到所有 FE 和 BE 部署路径的 `connectors/`
目录下(如果没有该目录,请手动创建):
-### 支持的元数据服务
+ ```text
+ ├── bin
+ ├── conf
+ ├── plugins
+ │ ├── connectors
+ │ ├── trino-delta-lake-435
+ │ ├── hdfs
+ ...
+ ```
-更多有关 Delta Lake 的信息可以参考 [Trino
文档](https://trino.io/docs/435/connector/delta-lake.html)。
+ > 也可以通过修改 `fe.conf` 的 `trino_connector_plugin_dir`
配置自定义插件路径。如:`trino_connector_plugin_dir=/path/to/connectors/`
-### 支持的存储系统
+2. 重启所有 FE 和 BE 节点,以确保 Connector 被正确加载。
-更多有关 Delta Lake 的信息可以参考 [Trino
文档](https://trino.io/docs/435/connector/delta-lake.html)。
+### 步骤 3:创建 Catalog
-## 列类型映射
-
-| Delta Lake Type | Trino Type | Doris Type | Comment |
-| --------------- | --------------------------- | ------------- | ------- |
-| boolean | boolean | boolean | |
-| int | int | int | |
-| byte | tinyint | tinyint | |
-| short | smallint | smallint | |
-| long | bigint | bigint | |
-| float | real | float | |
-| double | double | double | |
-| decimal(P, S) | decimal(P, S) | decimal(P, S) | |
-| string | varchar | string | |
-| bianry | varbinary | string | |
-| date | date | date | |
-| timestamp\_ntz | timestamp(N) | datetime(N) | |
-| timestamp | timestamp with time zone(N) | datetime(N) | |
-| array | array | array | |
-| map | map | map | |
-| struct | row | struct | |
-
-## 基础示例
+**基础配置**
```sql
-CREATE CATALOG delta_lake_hms properties (
- 'type' = 'trino-connector',
+CREATE CATALOG delta_lake_catalog PROPERTIES (
+ 'type' = 'trino-connector',
'trino.connector.name' = 'delta_lake',
'trino.hive.metastore' = 'thrift',
- 'trino.hive.metastore.uri'= 'thrift://ip:port',
-
'trino.hive.config.resources'='/path/to/core-site.xml,/path/to/hdfs-site.xml'
+ 'trino.hive.metastore.uri' = 'thrift://ip:port',
+ 'trino.hive.config.resources' =
'/path/to/core-site.xml,/path/to/hdfs-site.xml'
);
```
-## 查询操作
+**配置说明**
+
+- `trino.hive.metastore`:元数据服务类型,支持 `thrift`(Hive Metastore)等
+- `trino.hive.metastore.uri`:Hive Metastore 服务地址
+- `trino.hive.config.resources`:Hadoop 配置文件路径,多个文件用逗号分隔
+
+更多配置选项请参考下方「配置说明」部分或 [Trino
官方文档](https://trino.io/docs/435/connector/delta-lake.html)。
-配置好 Catalog 后,可以通过以下方式查询 Catalog 中的表数据:
+### 步骤 4:查询数据
+
+创建 Catalog 后,可以通过以下三种方式查询 Delta Lake 表数据:
```sql
--- 1. switch to catalog, use database and query
-SWITCH delta_lake_ctl;
+-- 方式 1:切换到 Catalog 后查询
+SWITCH delta_lake_catalog;
USE delta_lake_db;
SELECT * FROM delta_lake_tbl LIMIT 10;
--- 2. use dalta lake database directly
-USE delta_lake_ctl.delta_lake_db;
+-- 方式 2:使用两级路径
+USE delta_lake_catalog.delta_lake_db;
SELECT * FROM delta_lake_tbl LIMIT 10;
--- 3. use full qualified name to query
-SELECT * FROM delta_lake_ctl.delta_lake_db.delta_lake_tbl LIMIT 10;
+-- 方式 3:使用全限定名
+SELECT * FROM delta_lake_catalog.delta_lake_db.delta_lake_tbl LIMIT 10;
```
+## 配置说明
+
+### Catalog 配置参数
+
+创建 Delta Lake Catalog 的基本语法如下:
+
+```sql
+CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
+ 'type' = 'trino-connector', -- 必填,固定值
+ 'trino.connector.name' = 'delta_lake', -- 必填,固定值
+ {TrinoProperties}, -- Trino Connector 相关属性
+ {CommonProperties} -- 通用属性
+);
+```
+
+#### TrinoProperties 参数
+
+TrinoProperties 用于配置 Trino Delta Lake Connector 的专有属性,这些属性以 `trino.`
为前缀。常用参数包括:
+
+| 参数名称 | 必填 | 默认值 | 说明
|
+| --------------------------------- | ---- | ------ |
--------------------------------------------- |
+| `trino.hive.metastore` | 是 | - | 元数据服务类型,如 `thrift`
|
+| `trino.hive.metastore.uri` | 是 | - | Hive Metastore 服务地址
|
+| `trino.hive.config.resources` | 否 | - | Hadoop 配置文件路径,多个文件用逗号分隔
|
+| `trino.delta.hide-non-delta-tables` | 否 | false | 是否隐藏非 Delta Lake 表
|
+
+更多 Delta Lake Connector 配置参数请参考 [Trino
官方文档](https://trino.io/docs/435/connector/delta-lake.html)。
+
+#### CommonProperties 参数
+
+CommonProperties 用于配置 Catalog
的通用属性,例如元数据刷新策略、权限控制等。详细说明请参阅[数据目录概述](../catalog-overview.md)中「通用属性」部分。
+
+## 数据类型映射
+
+在使用 Delta Lake Catalog 时,数据类型会按照以下规则进行映射:
+
+| Delta Lake Type | Trino Type | Doris Type | 说明 |
+| --------------- | --------------------------- | ------------- | ---- |
+| boolean | boolean | boolean | |
+| int | int | int | |
+| byte | tinyint | tinyint | |
+| short | smallint | smallint | |
+| long | bigint | bigint | |
+| float | real | float | |
+| double | double | double | |
+| decimal(P, S) | decimal(P, S) | decimal(P, S) | |
+| string | varchar | string | |
+| binary | varbinary | string | |
+| date | date | date | |
+| timestamp\_ntz | timestamp(N) | datetime(N) | |
+| timestamp | timestamp with time zone(N) | datetime(N) | |
+| array | array | array | |
+| map | map | map | |
+| struct | row | struct | |
+
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/kafka-catalog.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/kafka-catalog.md
index 1d2ad002e57..aeb7d3f36b5 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/kafka-catalog.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/kafka-catalog.md
@@ -12,9 +12,6 @@ Kafka Catalog 通过 [Trino
Connector](https://doris.apache.org/zh-CN/community/
:::note
- 该功能为实验功能,自 3.0.1 版本开始支持。
-:::
-
-:::note
- 该功能不依赖 Trino 集群环境,仅使用 Trino 兼容插件。
:::
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/kudu-catalog.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/kudu-catalog.md
index 71eb6bac592..76e005ac014 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/kudu-catalog.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/kudu-catalog.md
@@ -12,9 +12,6 @@ Kudu Catalog 通过 [Trino
Connector](https://doris.apache.org/zh-CN/community/h
:::note
- 该功能为实验功能,自 3.0.1 版本开始支持。
-:::
-
-:::note
- 该功能不依赖 Trino 集群环境,仅使用 Trino 兼容插件。
:::
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/bigquery-catalog.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/bigquery-catalog.md
index 534d9e09cce..594d1cfb8fc 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/bigquery-catalog.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/bigquery-catalog.md
@@ -12,9 +12,6 @@ BigQuery Catalog 通过 [Trino
Connector](https://doris.apache.org/zh-CN/communi
:::note
- 该功能为实验功能,自 3.0.1 版本开始支持。
-:::
-
-:::note
- 该功能不依赖 Trino 集群环境,仅使用 Trino 兼容插件。
:::
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/delta-lake-catalog.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/delta-lake-catalog.md
index 078d52ac330..bc591d1afb2 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/delta-lake-catalog.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/delta-lake-catalog.md
@@ -2,140 +2,168 @@
{
"title": "Delta Lake Catalog",
"language": "zh-CN",
- "description": "Delta Lake Catalog 通过 Trino Connector 兼容框架,使用 Delta Lake
Connector 来访问 Delta Lake 表。"
+ "description": "Apache Doris Delta Lake Catalog 使用指南:通过 Trino Connector
框架连接 Delta Lake 数据湖,实现 Delta Lake 表数据的查询和集成。支持 Hive Metastore、多种数据类型映射,快速完成
Delta Lake 与 Doris 的数据集成。"
}
---
-Delta Lake Catalog 通过 [Trino
Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide/)
兼容框架,使用 Delta Lake Connector 来访问 Delta Lake 表。
+## 概述
+
+Delta Lake Catalog 通过 [Trino
Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide/)
兼容框架,使用 Trino Delta Lake Connector 来访问 Delta Lake 表。
:::note
-该功能为实验功能,自 3.0.1 版本开始支持。
+- 该功能为实验功能,自 3.0.1 版本开始支持。
+- 该功能不依赖 Trino 集群环境,仅使用 Trino 兼容插件。
:::
-## 适用场景
-
-| 场景 | 说明 |
-| ---- | ------------------------------ |
-| 数据集成 | 读取 Detla Lake 数据并写入到 Doris 内表。 |
-| 数据写回 | 不支持。 |
+### 适用场景
-## 环境准备
+| 场景 | 支持情况 |
+| -------- | -------------------------------------- |
+| 数据集成 | 读取 Delta Lake 数据并写入到 Doris 内表 |
+| 数据写回 | 不支持 |
-### 编译 Delta Lake Connector 插件
+### 版本兼容性
-> 需要 JDK 17 版本。
+- **Doris 版本**:3.0.1 及以上
+- **Trino Connector 版本**:435
+- **Delta Lake 版本**:具体支持的版本请参考 [Trino
文档](https://trino.io/docs/435/connector/delta-lake.html)
-```shell
-$ git clone https://github.com/apache/doris-thirdparty.git
-$ cd doris-thirdparty
-$ git checkout trino-435
-$ cd plugin/trino-delta-lake
-$ mvn clean install -DskipTest
-$ cd ../../lib/trino-hdfs
-$ mvn clean install -DskipTest
-```
+## 快速开始
-完成编译后,会在 `trino/plugin/trino-delta-lake/target/` 下得到 `trino-delta-lake-435`
目录,在 `trino/lib/trino-hdfs/target/` 下得到 `hdfs` 目录
+### 步骤 1:准备 Connector 插件
-也可以直接下载预编译的
[trino-delta-lake-435-20240724.tar.gz](https://github.com/apache/Doris-thirdparty/releases/download/trino-435-20240724/trino-delta-lake-435-20240724.tar.gz)
及
[hdfs.tar.gz](https://github.com/apache/doris-thirdparty/releases/download/trino-435-20240724/trino-hdfs-435-20240724.tar.gz)
并解压。
+你可以选择以下两种方式之一来获取 Delta Lake Connector 插件:
-### 部署 Delta Lake Connector
-
-将 `trino-delta-lake-435/` 目录放到所有 FE 和 BE 部署路径的 `connectors/`
目录下(如果没有,可以手动创建),将 `hdfs.tar.gz` 解压到 `trino-delta-lake-435/` 目录下。
-
-```text
-├── bin
-├── conf
-├── connectors
-│ ├── trino-delta-lake-435
-│ │ ├── hdfs
-...
-```
+**方式一:使用预编译包(推荐)**
-部署完成后,建议重启 FE、BE 节点以确保 Connector 可以被正确加载。
+直接下载预编译的
[trino-delta-lake-435-20240724.tar.gz](https://github.com/apache/Doris-thirdparty/releases/download/trino-435-20240724/trino-delta-lake-435-20240724.tar.gz)
及
[hdfs.tar.gz](https://github.com/apache/doris-thirdparty/releases/download/trino-435-20240724/trino-hdfs-435-20240724.tar.gz)
并解压。
-## 配置 Catalog
+**方式二:手动编译**
-### 语法
+如果需要自定义编译,按照以下步骤操作(需要 JDK 17):
-```sql
-CREATE CATALOG [IF NOT EXISTS] catalog_name
-PROPERTIES (
- 'type' = 'trino-connector', -- required
- 'trino.connector.name' = 'delta_lake', -- required
- {TrinoProperties},
- {CommonProperties}
-);
+```shell
+git clone https://github.com/apache/doris-thirdparty.git
+cd doris-thirdparty
+git checkout trino-435
+cd plugin/trino-delta-lake
+mvn clean install -DskipTests
+cd ../../lib/trino-hdfs
+mvn clean install -DskipTests
```
-* `{TrinoProperties}`
-
- TrinoProperties 部分用于填写将传递给 Trino Connector 的属性,这些属性以`trino.`为前缀。理论上,Trino
支持的属性这里都支持,更多有关 Delta Lake 的信息可以参考 [Trino
文档](https://trino.io/docs/435/connector/delta-lake.html)。
-
-* `[CommonProperties]`
-
- CommonProperties 部分用于填写通用属性。请参阅[ 数据目录概述 ](../catalog-overview.md)中【通用属性】部分。
+完成编译后,会在 `trino/plugin/trino-delta-lake/target/` 下得到 `trino-delta-lake-435`
目录,在 `trino/lib/trino-hdfs/target/` 下得到 `hdfs` 目录。
-### 支持的 Delta Lake 版本
+### 步骤 2:部署插件
-更多有关 Delta Lake 的信息可以参考 [Trino
文档](https://trino.io/docs/435/connector/delta-lake.html)。
+1. 将 `trino-delta-lake-435/` 目录放到所有 FE 和 BE 部署路径的 `connectors/`
目录下(如果没有该目录,请手动创建):
-### 支持的元数据服务
+ ```text
+ ├── bin
+ ├── conf
+ ├── plugins
+ │ ├── connectors
+ │ ├── trino-delta-lake-435
+ │ ├── hdfs
+ ...
+ ```
-更多有关 Delta Lake 的信息可以参考 [Trino
文档](https://trino.io/docs/435/connector/delta-lake.html)。
+ > 也可以通过修改 `fe.conf` 的 `trino_connector_plugin_dir`
配置自定义插件路径。如:`trino_connector_plugin_dir=/path/to/connectors/`
-### 支持的存储系统
+2. 重启所有 FE 和 BE 节点,以确保 Connector 被正确加载。
-更多有关 Delta Lake 的信息可以参考 [Trino
文档](https://trino.io/docs/435/connector/delta-lake.html)。
+### 步骤 3:创建 Catalog
-## 列类型映射
-
-| Delta Lake Type | Trino Type | Doris Type | Comment |
-| --------------- | --------------------------- | ------------- | ------- |
-| boolean | boolean | boolean | |
-| int | int | int | |
-| byte | tinyint | tinyint | |
-| short | smallint | smallint | |
-| long | bigint | bigint | |
-| float | real | float | |
-| double | double | double | |
-| decimal(P, S) | decimal(P, S) | decimal(P, S) | |
-| string | varchar | string | |
-| bianry | varbinary | string | |
-| date | date | date | |
-| timestamp\_ntz | timestamp(N) | datetime(N) | |
-| timestamp | timestamp with time zone(N) | datetime(N) | |
-| array | array | array | |
-| map | map | map | |
-| struct | row | struct | |
-
-## 基础示例
+**基础配置**
```sql
-CREATE CATALOG delta_lake_hms properties (
- 'type' = 'trino-connector',
+CREATE CATALOG delta_lake_catalog PROPERTIES (
+ 'type' = 'trino-connector',
'trino.connector.name' = 'delta_lake',
'trino.hive.metastore' = 'thrift',
- 'trino.hive.metastore.uri'= 'thrift://ip:port',
-
'trino.hive.config.resources'='/path/to/core-site.xml,/path/to/hdfs-site.xml'
+ 'trino.hive.metastore.uri' = 'thrift://ip:port',
+ 'trino.hive.config.resources' =
'/path/to/core-site.xml,/path/to/hdfs-site.xml'
);
```
-## 查询操作
+**配置说明**
+
+- `trino.hive.metastore`:元数据服务类型,支持 `thrift`(Hive Metastore)等
+- `trino.hive.metastore.uri`:Hive Metastore 服务地址
+- `trino.hive.config.resources`:Hadoop 配置文件路径,多个文件用逗号分隔
+
+更多配置选项请参考下方「配置说明」部分或 [Trino
官方文档](https://trino.io/docs/435/connector/delta-lake.html)。
-配置好 Catalog 后,可以通过以下方式查询 Catalog 中的表数据:
+### 步骤 4:查询数据
+
+创建 Catalog 后,可以通过以下三种方式查询 Delta Lake 表数据:
```sql
--- 1. switch to catalog, use database and query
-SWITCH delta_lake_ctl;
+-- 方式 1:切换到 Catalog 后查询
+SWITCH delta_lake_catalog;
USE delta_lake_db;
SELECT * FROM delta_lake_tbl LIMIT 10;
--- 2. use dalta lake database directly
-USE delta_lake_ctl.delta_lake_db;
+-- 方式 2:使用两级路径
+USE delta_lake_catalog.delta_lake_db;
SELECT * FROM delta_lake_tbl LIMIT 10;
--- 3. use full qualified name to query
-SELECT * FROM delta_lake_ctl.delta_lake_db.delta_lake_tbl LIMIT 10;
+-- 方式 3:使用全限定名
+SELECT * FROM delta_lake_catalog.delta_lake_db.delta_lake_tbl LIMIT 10;
```
+## 配置说明
+
+### Catalog 配置参数
+
+创建 Delta Lake Catalog 的基本语法如下:
+
+```sql
+CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
+ 'type' = 'trino-connector', -- 必填,固定值
+ 'trino.connector.name' = 'delta_lake', -- 必填,固定值
+ {TrinoProperties}, -- Trino Connector 相关属性
+ {CommonProperties} -- 通用属性
+);
+```
+
+#### TrinoProperties 参数
+
+TrinoProperties 用于配置 Trino Delta Lake Connector 的专有属性,这些属性以 `trino.`
为前缀。常用参数包括:
+
+| 参数名称 | 必填 | 默认值 | 说明
|
+| --------------------------------- | ---- | ------ |
--------------------------------------------- |
+| `trino.hive.metastore` | 是 | - | 元数据服务类型,如 `thrift`
|
+| `trino.hive.metastore.uri` | 是 | - | Hive Metastore 服务地址
|
+| `trino.hive.config.resources` | 否 | - | Hadoop 配置文件路径,多个文件用逗号分隔
|
+| `trino.delta.hide-non-delta-tables` | 否 | false | 是否隐藏非 Delta Lake 表
|
+
+更多 Delta Lake Connector 配置参数请参考 [Trino
官方文档](https://trino.io/docs/435/connector/delta-lake.html)。
+
+#### CommonProperties 参数
+
+CommonProperties 用于配置 Catalog
的通用属性,例如元数据刷新策略、权限控制等。详细说明请参阅[数据目录概述](../catalog-overview.md)中「通用属性」部分。
+
+## 数据类型映射
+
+在使用 Delta Lake Catalog 时,数据类型会按照以下规则进行映射:
+
+| Delta Lake Type | Trino Type | Doris Type | 说明 |
+| --------------- | --------------------------- | ------------- | ---- |
+| boolean | boolean | boolean | |
+| int | int | int | |
+| byte | tinyint | tinyint | |
+| short | smallint | smallint | |
+| long | bigint | bigint | |
+| float | real | float | |
+| double | double | double | |
+| decimal(P, S) | decimal(P, S) | decimal(P, S) | |
+| string | varchar | string | |
+| binary | varbinary | string | |
+| date | date | date | |
+| timestamp\_ntz | timestamp(N) | datetime(N) | |
+| timestamp | timestamp with time zone(N) | datetime(N) | |
+| array | array | array | |
+| map | map | map | |
+| struct | row | struct | |
+
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/kafka-catalog.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/kafka-catalog.md
index 1d2ad002e57..aeb7d3f36b5 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/kafka-catalog.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/kafka-catalog.md
@@ -12,9 +12,6 @@ Kafka Catalog 通过 [Trino
Connector](https://doris.apache.org/zh-CN/community/
:::note
- 该功能为实验功能,自 3.0.1 版本开始支持。
-:::
-
-:::note
- 该功能不依赖 Trino 集群环境,仅使用 Trino 兼容插件。
:::
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/kudu-catalog.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/kudu-catalog.md
index 71eb6bac592..76e005ac014 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/kudu-catalog.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/kudu-catalog.md
@@ -12,9 +12,6 @@ Kudu Catalog 通过 [Trino
Connector](https://doris.apache.org/zh-CN/community/h
:::note
- 该功能为实验功能,自 3.0.1 版本开始支持。
-:::
-
-:::note
- 该功能不依赖 Trino 集群环境,仅使用 Trino 兼容插件。
:::
diff --git a/versioned_docs/version-3.x/lakehouse/catalogs/bigquery-catalog.md
b/versioned_docs/version-3.x/lakehouse/catalogs/bigquery-catalog.md
index 9422f89fe3d..e96453a3a4e 100644
--- a/versioned_docs/version-3.x/lakehouse/catalogs/bigquery-catalog.md
+++ b/versioned_docs/version-3.x/lakehouse/catalogs/bigquery-catalog.md
@@ -12,9 +12,6 @@ BigQuery Catalog uses the Trino BigQuery Connector to access
BigQuery tables thr
:::note
- This feature is experimental and supported since version 3.0.1.
-:::
-
-:::note
- This feature does not depend on a Trino cluster environment and only uses
the Trino compatibility plugin.
:::
diff --git
a/versioned_docs/version-3.x/lakehouse/catalogs/delta-lake-catalog.md
b/versioned_docs/version-3.x/lakehouse/catalogs/delta-lake-catalog.md
index 9133807f9da..569505d2229 100644
--- a/versioned_docs/version-3.x/lakehouse/catalogs/delta-lake-catalog.md
+++ b/versioned_docs/version-3.x/lakehouse/catalogs/delta-lake-catalog.md
@@ -2,139 +2,167 @@
{
"title": "Delta Lake Catalog",
"language": "en",
- "description": "Delta Lake Catalog uses the Trino Connector compatibility
framework to access Delta Lake tables through the Delta Lake Connector."
+ "description": "Apache Doris Delta Lake Catalog User Guide: Connect to
Delta Lake data lake through Trino Connector framework to query and integrate
Delta Lake table data. Supports Hive Metastore, multiple data type mappings,
and quick integration between Delta Lake and Doris."
}
---
-Delta Lake Catalog uses the [Trino
Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide/)
compatibility framework to access Delta Lake tables through the Delta Lake
Connector.
+## Overview
+
+Delta Lake Catalog uses the [Trino
Connector](https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide/)
compatibility framework with Trino Delta Lake Connector to access Delta Lake
tables.
:::note
-This feature is experimental and has been supported since version 3.0.1.
+- This is an experimental feature, supported since version 3.0.1.
+- This feature does not depend on a Trino cluster environment; it only uses
the Trino compatibility plugin.
:::
-## Application Scenarios
-
-| Scenario | Description |
-| -------------- | ------------------------------------ |
-| Data Integration | Read Delta Lake data and write it into Doris internal
tables. |
-| Data Writeback | Not supported. |
+### Use Cases
-## Environment Preparation
+| Scenario | Support Status |
+| -------- | -------------- |
+| Data Integration | Read Delta Lake data and write to Doris internal tables |
+| Data Write-back | Not supported |
-### Compile the Delta Lake Connector Plugin
+### Version Compatibility
-> JDK 17 is required.
+- **Doris Version**: 3.0.1 and above
+- **Trino Connector Version**: 435
+- **Delta Lake Version**: For supported versions, please refer to [Trino
Documentation](https://trino.io/docs/435/connector/delta-lake.html)
-```shell
-$ git clone https://github.com/apache/doris-thirdparty.git
-$ cd doris-thirdparty
-$ git checkout trino-435
-$ cd plugin/trino-delta-lake
-$ mvn clean install -DskipTest
-$ cd ../../lib/trino-hdfs
-$ mvn clean install -DskipTest
-```
+## Quick Start
-After compiling, you will find the `trino-delta-lake-435` directory under
`trino/plugin/trino-delta-lake/target/` and the `hdfs` directory under
`trino/lib/trino-hdfs/target/`.
+### Step 1: Prepare Connector Plugin
-You can also directly download the precompiled
[trino-delta-lake-435-20240724.tar.gz](https://github.com/apache/Doris-thirdparty/releases/download/trino-435-20240724/trino-delta-lake-435-20240724.tar.gz)
and
[hdfs.tar.gz](https://github.com/apache/doris-thirdparty/releases/download/trino-435-20240724/trino-hdfs-435-20240724.tar.gz),
then extract them.
+You can obtain the Delta Lake Connector plugin using one of the following
methods:
-### Deploy the Delta Lake Connector
-
-Place the `trino-delta-lake-435/` directory in the `connectors/` directory of
all FE and BE deployment paths(If it does not exist, you can create it
manually) and extract `hdfs.tar.gz` into the `trino-delta-lake-435/` directory.
-
-```text
-├── bin
-├── conf
-├── connectors
-│ ├── trino-delta-lake-435
-│ │ ├── hdfs
-...
-```
+**Method 1: Use Pre-compiled Package (Recommended)**
-After deployment, it is recommended to restart the FE and BE nodes to ensure
the Connector is loaded correctly.
+Download the pre-compiled
[trino-delta-lake-435-20240724.tar.gz](https://github.com/apache/Doris-thirdparty/releases/download/trino-435-20240724/trino-delta-lake-435-20240724.tar.gz)
and
[hdfs.tar.gz](https://github.com/apache/doris-thirdparty/releases/download/trino-435-20240724/trino-hdfs-435-20240724.tar.gz)
and extract them.
-## Configuring Catalog
+**Method 2: Manual Compilation**
-### Syntax
+If you need custom compilation, follow these steps (requires JDK 17):
-```sql
-CREATE CATALOG [IF NOT EXISTS] catalog_name
-PROPERTIES (
- 'type' = 'trino-connector', -- required
- 'trino.connector.name' = 'delta_lake', -- required
- {TrinoProperties},
- {CommonProperties}
-);
+```shell
+git clone https://github.com/apache/doris-thirdparty.git
+cd doris-thirdparty
+git checkout trino-435
+cd plugin/trino-delta-lake
+mvn clean install -DskipTests
+cd ../../lib/trino-hdfs
+mvn clean install -DskipTests
```
-* `{TrinoProperties}`
-
- The TrinoProperties section is used to specify properties that will be
passed to the Trino Connector. These properties use the `trino.` prefix. In
theory, all properties supported by Trino are also supported here. For more
information about Delta Lake, refer to the [Trino
documentation](https://trino.io/docs/435/connector/delta-lake.html).
-
-* `[CommonProperties]`
-
- The CommonProperties section is used to specify general properties. Please
refer to the [Catalog Overview](../catalog-overview.md) under the "Common
Properties" section.
-
-### Supported Delta Lake Versions
+After compilation, you'll get the `trino-delta-lake-435` directory under
`trino/plugin/trino-delta-lake/target/`, and the `hdfs` directory under
`trino/lib/trino-hdfs/target/`.
-For more information about Delta Lake, refer to the [Trino
documentation](https://trino.io/docs/435/connector/delta-lake.html).
+### Step 2: Deploy Plugin
-### Supported Metadata Services
+1. Place the `trino-delta-lake-435/` directory under the `connectors/`
directory of all FE and BE deployment paths (create the directory manually if
it doesn't exist):
-For more information about Delta Lake, refer to the [Trino
documentation](https://trino.io/docs/435/connector/delta-lake.html).
+ ```text
+ ├── bin
+ ├── conf
+ ├── plugins
+ │ ├── connectors
+ │ ├── trino-delta-lake-435
+ │ ├── hdfs
+ ...
+ ```
-### Supported Storage Systems
+ > You can also customize the plugin path by modifying the
`trino_connector_plugin_dir` configuration in `fe.conf`. For example:
`trino_connector_plugin_dir=/path/to/connectors/`
-For more information about Delta Lake, refer to the [Trino
documentation](https://trino.io/docs/435/connector/delta-lake.html).
+2. Restart all FE and BE nodes to ensure the connector is loaded correctly.
-## Column Type Mapping
+### Step 3: Create Catalog
-| Delta Lake Type | Trino Type | Doris Type | Comment |
-| --------------- | --------------------------- | ------------- | ------- |
-| boolean | boolean | boolean | |
-| int | int | int | |
-| byte | tinyint | tinyint | |
-| short | smallint | smallint | |
-| long | bigint | bigint | |
-| float | real | float | |
-| double | double | double | |
-| decimal(P, S) | decimal(P, S) | decimal(P, S) | |
-| string | varchar | string | |
-| bianry | varbinary | string | |
-| date | date | date | |
-| timestamp\_ntz | timestamp(N) | datetime(N) | |
-| timestamp | timestamp with time zone(N) | datetime(N) | |
-| array | array | array | |
-| map | map | map | |
-| struct | row | struct | |
-
-## Examples
+**Basic Configuration**
```sql
-CREATE CATALOG delta_lake_hms properties (
- 'type' = 'trino-connector',
+CREATE CATALOG delta_lake_catalog PROPERTIES (
+ 'type' = 'trino-connector',
'trino.connector.name' = 'delta_lake',
'trino.hive.metastore' = 'thrift',
- 'trino.hive.metastore.uri'= 'thrift://ip:port',
-
'trino.hive.config.resources'='/path/to/core-site.xml,/path/to/hdfs-site.xml'
+ 'trino.hive.metastore.uri' = 'thrift://ip:port',
+ 'trino.hive.config.resources' =
'/path/to/core-site.xml,/path/to/hdfs-site.xml'
);
```
-## Query Operations
+**Configuration Description**
+
+- `trino.hive.metastore`: Metadata service type, supports `thrift` (Hive
Metastore), etc.
+- `trino.hive.metastore.uri`: Hive Metastore service address
+- `trino.hive.config.resources`: Hadoop configuration file path, multiple
files separated by commas
+
+For more configuration options, please refer to the "Configuration
Description" section below or [Trino Official
Documentation](https://trino.io/docs/435/connector/delta-lake.html).
-After configuring the Catalog, you can query the table data in the Catalog
using the following methods:
+### Step 4: Query Data
+
+After creating the Catalog, you can query Delta Lake table data using one of
the following three methods:
```sql
--- 1. Switch to the catalog, use the database, and query
-SWITCH delta_lake_ctl;
+-- Method 1: Switch to Catalog then query
+SWITCH delta_lake_catalog;
USE delta_lake_db;
SELECT * FROM delta_lake_tbl LIMIT 10;
--- 2. Use the Delta Lake database directly
-USE delta_lake_ctl.delta_lake_db;
+-- Method 2: Use two-level path
+USE delta_lake_catalog.delta_lake_db;
SELECT * FROM delta_lake_tbl LIMIT 10;
--- 3. Use the fully qualified name to query
-SELECT * FROM delta_lake_ctl.delta_lake_db.delta_lake_tbl LIMIT 10;
+-- Method 3: Use fully qualified name
+SELECT * FROM delta_lake_catalog.delta_lake_db.delta_lake_tbl LIMIT 10;
+```
+
+## Configuration Description
+
+### Catalog Configuration Parameters
+
+The basic syntax for creating a Delta Lake Catalog is as follows:
+
+```sql
+CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
+ 'type' = 'trino-connector', -- Required, fixed value
+ 'trino.connector.name' = 'delta_lake', -- Required, fixed value
+ {TrinoProperties}, -- Trino Connector related
properties
+ {CommonProperties} -- Common properties
+);
```
+
+#### TrinoProperties Parameters
+
+TrinoProperties are used to configure Trino Delta Lake Connector-specific properties, which are prefixed with `trino.`. Common parameters include:
+
+| Parameter Name | Required | Default Value | Description |
+| -------------- | -------- | ------------- | ----------- |
+| `trino.hive.metastore` | Yes | - | Metadata service type, such as `thrift` |
+| `trino.hive.metastore.uri` | Yes | - | Hive Metastore service address |
+| `trino.hive.config.resources` | No | - | Hadoop configuration file path,
multiple files separated by commas |
+| `trino.delta.hide-non-delta-tables` | No | false | Whether to hide non-Delta
Lake tables |
+
+For more Delta Lake Connector configuration parameters, please refer to [Trino
Official Documentation](https://trino.io/docs/435/connector/delta-lake.html).
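+
+For example, the parameters from the table above can be combined in a single statement (a sketch only; the metastore URI and file paths are placeholders taken from the earlier example):
+
+```sql
+CREATE CATALOG delta_lake_catalog PROPERTIES (
+    'type' = 'trino-connector',
+    'trino.connector.name' = 'delta_lake',
+    'trino.hive.metastore' = 'thrift',
+    'trino.hive.metastore.uri' = 'thrift://ip:port',
+    'trino.hive.config.resources' = '/path/to/core-site.xml,/path/to/hdfs-site.xml',
+    'trino.delta.hide-non-delta-tables' = 'true'
+);
+```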
+
+#### CommonProperties Parameters
+
+CommonProperties are used to configure common Catalog properties, such as
metadata refresh policies, permission control, etc. For detailed information,
please refer to the "Common Properties" section in [Catalog
Overview](../catalog-overview.md).
+
+## Data Type Mapping
+
+When using Delta Lake Catalog, data types are mapped according to the
following rules:
+
+| Delta Lake Type | Trino Type | Doris Type | Notes |
+| --------------- | ---------- | ---------- | ----- |
+| boolean | boolean | boolean | |
+| int | int | int | |
+| byte | tinyint | tinyint | |
+| short | smallint | smallint | |
+| long | bigint | bigint | |
+| float | real | float | |
+| double | double | double | |
+| decimal(P, S) | decimal(P, S) | decimal(P, S) | |
+| string | varchar | string | |
+| binary | varbinary | string | |
+| date | date | date | |
+| timestamp_ntz | timestamp(N) | datetime(N) | |
+| timestamp | timestamp with time zone(N) | datetime(N) | |
+| array | array | array | |
+| map | map | map | |
+| struct | row | struct | |
\ No newline at end of file
diff --git a/versioned_docs/version-3.x/lakehouse/catalogs/kafka-catalog.md
b/versioned_docs/version-3.x/lakehouse/catalogs/kafka-catalog.md
index e69de29bb2d..0c71183434e 100644
--- a/versioned_docs/version-3.x/lakehouse/catalogs/kafka-catalog.md
+++ b/versioned_docs/version-3.x/lakehouse/catalogs/kafka-catalog.md
@@ -0,0 +1,358 @@
+---
+{
+ "title": "Kafka Catalog",
+ "language": "en",
+ "description": "Apache Doris Kafka Catalog guide: Connect to Kafka data
streams through Trino Connector framework to query and integrate Kafka Topic
data. Supports Schema Registry, multiple data formats for quick Kafka and Doris
data integration."
+}
+---
+
+## Overview
+
+Kafka Catalog uses the Trino Kafka Connector through the [Trino
Connector](https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide/)
compatibility framework to access Kafka Topic data.
+
+:::note
+- This is an experimental feature, supported since version 3.0.1.
+- This feature does not depend on a Trino cluster environment; it only uses
Trino-compatible plugins.
+:::
+
+### Use Cases
+
+| Scenario | Support Status |
+| -------- | -------------- |
+| Data Integration | Read Kafka Topic data and write to Doris internal tables |
+| Data Write-back | Not supported |
+
+### Version Compatibility
+
+- **Doris Version**: 3.0.1 and above
+- **Trino Connector Version**: 435
+- **Kafka Version**: For supported versions, please refer to [Trino
Documentation](https://trino.io/docs/435/connector/kafka.html)
+
+## Quick Start
+
+### Step 1: Prepare Connector Plugin
+
+You can obtain the Kafka Connector plugin using one of the following methods:
+
+**Method 1: Use Pre-compiled Package (Recommended)**
+
+Download and extract the pre-compiled plugin package from
[here](https://github.com/apache/doris-thirdparty/releases/tag/trino-435-20240724).
+
+**Method 2: Manual Compilation**
+
+If you need custom compilation, follow these steps (requires JDK 17):
+
+```shell
+git clone https://github.com/apache/doris-thirdparty.git
+cd doris-thirdparty
+git checkout trino-435
+cd plugin/trino-kafka
+mvn clean package -Dmaven.test.skip=true
+```
+
+After compilation, you will get the `trino-kafka-435/` directory under
`trino/plugin/trino-kafka/target/`.
+
+### Step 2: Deploy Plugin
+
+1. Place the `trino-kafka-435/` directory in the `connectors/` directory of
all FE and BE deployment paths (create the directory manually if it doesn't
exist):
+
+ ```text
+ ├── bin
+ ├── conf
+ ├── plugins
+ │ ├── connectors
+ │ ├── trino-kafka-435
+ ...
+ ```
+
+ > You can also customize the plugin path by modifying the
`trino_connector_plugin_dir` configuration in `fe.conf`. For example:
`trino_connector_plugin_dir=/path/to/connectors/`
+
+2. Restart all FE and BE nodes to ensure the connector is properly loaded.
+
+### Step 3: Create Catalog
+
+**Basic Configuration**
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>,<broker2>:<port2>',
+ 'trino.kafka.table-names' = 'test_db.topic_name',
+ 'trino.kafka.hide-internal-columns' = 'false'
+);
+```
+
+**Using Configuration File**
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>,<broker2>:<port2>',
+ 'trino.kafka.config.resources' = '/path/to/kafka-client.properties',
+ 'trino.kafka.hide-internal-columns' = 'false'
+);
+```
+
+**Configure Default Schema**
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>,<broker2>:<port2>',
+ 'trino.kafka.default-schema' = 'default_db',
+ 'trino.kafka.hide-internal-columns' = 'false'
+);
+```
+
+### Step 4: Query Data
+
+After creating the catalog, you can query Kafka Topic data using one of three
methods:
+
+```sql
+-- Method 1: Switch to catalog and query
+SWITCH kafka;
+USE kafka_schema;
+SELECT * FROM topic_name LIMIT 10;
+
+-- Method 2: Use two-level path
+USE kafka.kafka_schema;
+SELECT * FROM topic_name LIMIT 10;
+
+-- Method 3: Use fully qualified name
+SELECT * FROM kafka.kafka_schema.topic_name LIMIT 10;
+```
+
+## Schema Registry Integration
+
+Kafka Catalog supports automatic schema retrieval through Confluent Schema
Registry, eliminating the need to manually define table structures.
+
+### Configure Schema Registry
+
+**Basic Authentication**
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>',
+ 'trino.kafka.table-description-supplier' = 'CONFLUENT',
+ 'trino.kafka.confluent-schema-registry-url' =
'http://<schema-registry-host>:<schema-registry-port>',
+ 'trino.kafka.confluent-schema-registry-auth-type' = 'BASIC_AUTH',
+ 'trino.kafka.confluent-schema-registry.basic-auth.username' = 'admin',
+ 'trino.kafka.confluent-schema-registry.basic-auth.password' = 'admin123',
+ 'trino.kafka.hide-internal-columns' = 'false'
+);
+```
+
+**Complete Configuration Example**
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>',
+ 'trino.kafka.default-schema' = 'nrdp',
+ 'trino.kafka.table-description-supplier' = 'CONFLUENT',
+ 'trino.kafka.confluent-schema-registry-url' =
'http://<schema-registry-host>:<schema-registry-port>',
+ 'trino.kafka.confluent-schema-registry-auth-type' = 'BASIC_AUTH',
+ 'trino.kafka.confluent-schema-registry.basic-auth.username' = 'admin',
+ 'trino.kafka.confluent-schema-registry.basic-auth.password' = 'admin123',
+ 'trino.kafka.config.resources' = '/path/to/kafka-client.properties',
+ 'trino.kafka.confluent-schema-registry-subject-mapping' =
'nrdp.topic1:NRDP.topic1',
+ 'trino.kafka.hide-internal-columns' = 'false'
+);
+```
+
+### Schema Registry Parameters
+
+| Parameter Name | Required | Default | Description |
+| -------------- | -------- | ------- | ----------- |
+| `trino.kafka.table-description-supplier` | No | - | Set to `CONFLUENT` to
enable Schema Registry support |
+| `trino.kafka.confluent-schema-registry-url` | Yes (when
`table-description-supplier` is `CONFLUENT`) | - | Schema Registry service
address |
+| `trino.kafka.confluent-schema-registry-auth-type` | No | NONE |
Authentication type: NONE, BASIC_AUTH, BEARER |
+| `trino.kafka.confluent-schema-registry.basic-auth.username` | No | - | Basic
Auth username |
+| `trino.kafka.confluent-schema-registry.basic-auth.password` | No | - | Basic
Auth password |
+| `trino.kafka.confluent-schema-registry-subject-mapping` | No | - | Subject
name mapping, format: `<db1>.<tbl1>:<topic_name1>,<db2>.<tbl2>:<topic_name2>` |
+
+:::tip
+When using Schema Registry, Doris will automatically retrieve Topic schema
information from Schema Registry, eliminating the need to manually create table
structures.
+:::
+
+### Subject Mapping
+
+In some cases, the Subject name registered in Schema Registry may not match
the Topic name in Kafka, preventing data queries. In such cases, you need to
manually specify the mapping relationship through
`confluent-schema-registry-subject-mapping`.
+
+```sql
+-- Map schema.topic to SCHEMA.topic Subject in Schema Registry
+'trino.kafka.confluent-schema-registry-subject-mapping' =
'<db1>.<tbl1>:<topic_name1>'
+```
+
+Where `db1` and `tbl1` are the actual Database and Table names seen in Doris,
and `topic_name1` is the actual Topic name in Kafka (case-sensitive).
+
+Multiple mappings can be separated by commas:
+
+```sql
+'trino.kafka.confluent-schema-registry-subject-mapping' =
'<db1>.<tbl1>:<topic_name1>,<db2>.<tbl2>:<topic_name2>'
+```
+
+## Configuration
+
+### Catalog Configuration Parameters
+
+The basic syntax for creating a Kafka Catalog is as follows:
+
+```sql
+CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
+ 'type' = 'trino-connector', -- Required, fixed value
+ 'trino.connector.name' = 'kafka', -- Required, fixed value
+ {TrinoProperties}, -- Trino Connector related properties
+ {CommonProperties} -- Common properties
+);
+```
+
+#### TrinoProperties Parameters
+
+TrinoProperties are used to configure Trino Kafka Connector-specific
properties, which are prefixed with `trino.`. Common parameters include:
+
+| Parameter Name | Required | Default | Description |
+| -------------- | -------- | ------- | ----------- |
+| `trino.kafka.nodes` | Yes | - | Kafka Broker node address list, format:
`host1:port1,host2:port2` |
+| `trino.kafka.table-names` | No | - | List of Topics to map, format:
`schema.topic1,schema.topic2` |
+| `trino.kafka.default-schema` | No | default | Default schema name |
+| `trino.kafka.hide-internal-columns` | No | true | Whether to hide Kafka
internal columns (such as `_partition_id`, `_partition_offset`, etc.) |
+| `trino.kafka.config.resources` | No | - | Kafka client configuration file
path |
+| `trino.kafka.table-description-supplier` | No | - | Table structure
provider, set to `CONFLUENT` to use Schema Registry |
+| `trino.kafka.confluent-schema-registry-url` | No | - | Schema Registry
service address |
+
+For more Kafka Connector configuration parameters, please refer to [Trino
Official Documentation](https://trino.io/docs/435/connector/kafka.html).
+
+#### CommonProperties Parameters
+
+CommonProperties are used to configure general catalog properties, such as
metadata refresh policies and permission control. For detailed information,
please refer to the "Common Properties" section in [Catalog
Overview](../catalog-overview.md).
+
+### Kafka Client Configuration
+
+When you need to configure advanced Kafka client parameters (such as security
authentication, SSL, etc.), you can specify them through a configuration file.
Create a configuration file (e.g., `kafka-client.properties`):
+
+```properties
+# ============================================
+# Kerberos/SASL Authentication Configuration
+# ============================================
+sasl.mechanism=GSSAPI
+sasl.kerberos.service.name=kafka
+
+# JAAS Configuration - Using keytab method
+sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required \
+ useKeyTab=true \
+ storeKey=true \
+ useTicketCache=false \
+ serviceName="kafka" \
+ keyTab="/opt/trino/security/keytabs/kafka.keytab" \
+ principal="[email protected]";
+
+# ============================================
+# Avro Deserializer Configuration
+# ============================================
+key.deserializer=io.confluent.kafka.serializers.KafkaAvroDeserializer
+value.deserializer=io.confluent.kafka.serializers.KafkaAvroDeserializer
+```
+
+Then specify the configuration file when creating the catalog:
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>',
+ 'trino.kafka.config.resources' = '/path/to/kafka-client.properties'
+);
+```
+
+## Data Type Mapping
+
+When using Kafka Catalog, data types are mapped according to the following
rules:
+
+| Kafka/Avro Type | Trino Type | Doris Type | Notes |
+| --------------- | ---------- | ---------- | ----- |
+| boolean | boolean | boolean | |
+| int | integer | int | |
+| long | bigint | bigint | |
+| float | real | float | |
+| double | double | double | |
+| bytes | varbinary | string | Use `HEX(col)` function to query |
+| string | varchar | string | |
+| array | array | array | |
+| map | map | map | |
+| record | row | struct | Complex nested structure |
+| enum | varchar | string | |
+| fixed | varbinary | string | |
+| null | - | - | |
+
+:::tip
+- For the `bytes` type, use the `HEX()` function to display values in
hexadecimal format (see the example after this tip).
+- The data types supported by Kafka Catalog depend on the serialization format
used (JSON, Avro, Protobuf, etc.) and Schema Registry configuration.
+:::
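+
+For instance, a minimal sketch of reading a `bytes` field through `HEX()`;
`payload` is a hypothetical column name, and the schema and topic names are
placeholders:
+
+```sql
+-- payload is a hypothetical bytes/varbinary column; HEX() renders it as hex text
+SELECT HEX(payload) AS payload_hex
+FROM kafka.kafka_schema.topic_name
+LIMIT 10;
+```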
+
+## Kafka Internal Columns
+
+Kafka Connector provides some internal columns to access metadata information
of Kafka messages:
+
+| Column Name | Type | Description |
+| ----------- | ---- | ----------- |
+| `_partition_id` | bigint | Partition ID where the message is located |
+| `_partition_offset` | bigint | Message offset within the partition |
+| `_message_timestamp` | timestamp | Message timestamp |
+| `_key` | varchar | Message key |
+| `_key_corrupt` | boolean | Whether the key is corrupted |
+| `_key_length` | bigint | Key byte length |
+| `_message` | varchar | Raw message content |
+| `_message_corrupt` | boolean | Whether the message is corrupted |
+| `_message_length` | bigint | Message byte length |
+| `_headers` | map | Message header information |
+
+By default, these internal columns are hidden. If you need to query them, set
the following property when creating the catalog:
+
+```sql
+'trino.kafka.hide-internal-columns' = 'false'
+```
+
+Query example:
+
+```sql
+SELECT
+ _partition_id,
+ _partition_offset,
+ _message_timestamp,
+ *
+FROM kafka.schema.topic_name
+LIMIT 10;
+```
+
+## Limitations
+
+1. **Read-only Access**: Kafka Catalog only supports reading data; write
operations (INSERT, UPDATE, DELETE) are not supported.
+
+2. **Table Names Configuration**: When not using Schema Registry, you need to
explicitly specify the list of Topics to access through the
`trino.kafka.table-names` parameter.
+
+3. **Schema Definition**:
+ - When using Schema Registry, schema information is automatically retrieved
from Schema Registry.
+ - When not using Schema Registry, you need to manually create table
definitions or use Trino's Topic description files.
+
+4. **Data Format**: Supported data formats depend on the serialization method
used by the Topic (JSON, Avro, Protobuf, etc.). For details, please refer to
[Trino Official Documentation](https://trino.io/docs/435/connector/kafka.html).
+
+5. **Performance Considerations**:
+ - Kafka Catalog reads Kafka data in real-time; querying large amounts of
data may affect performance.
+   - It is recommended to use the `LIMIT` clause or time-based filter
conditions to limit the amount of data scanned (see the sketch after this
list).
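+
+The following is a hedged sketch of such a bounded query. It assumes
`trino.kafka.hide-internal-columns` is set to `false` so that
`_message_timestamp` is visible, and the schema and topic names are
placeholders; whether the time filter is pushed down to Kafka depends on the
connector:
+
+```sql
+-- Bound the query with a time filter on an internal column plus a LIMIT
+SELECT _partition_id, _partition_offset, _message
+FROM kafka.kafka_schema.topic_name
+WHERE _message_timestamp >= DATE_SUB(NOW(), INTERVAL 1 HOUR)
+LIMIT 100;
+```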
+
+## Feature Debugging
+
+You can refer to [this demo
environment](https://github.com/morningman/demo-env/tree/main/kafka) to quickly
set up a Kafka environment for feature verification.
+
+## References
+
+- [Trino Kafka Connector Official
Documentation](https://trino.io/docs/435/connector/kafka.html)
+- [Trino Connector Development
Guide](https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide/)
+- [Confluent Schema Registry
Documentation](https://docs.confluent.io/platform/current/schema-registry/index.html)
\ No newline at end of file
diff --git a/versioned_docs/version-3.x/lakehouse/catalogs/kudu-catalog.md
b/versioned_docs/version-3.x/lakehouse/catalogs/kudu-catalog.md
index 57dc1b19e90..de31ed9c638 100644
--- a/versioned_docs/version-3.x/lakehouse/catalogs/kudu-catalog.md
+++ b/versioned_docs/version-3.x/lakehouse/catalogs/kudu-catalog.md
@@ -12,9 +12,6 @@ Kudu Catalog accesses Kudu tables through the [Trino
Connector](https://doris.ap
:::note
- This is an experimental feature, supported since version 3.0.1.
-:::
-
-:::note
- This feature does not depend on a Trino cluster environment and only uses
the Trino compatibility plugin.
:::
diff --git a/versioned_docs/version-4.x/lakehouse/catalogs/bigquery-catalog.md
b/versioned_docs/version-4.x/lakehouse/catalogs/bigquery-catalog.md
index 9422f89fe3d..e96453a3a4e 100644
--- a/versioned_docs/version-4.x/lakehouse/catalogs/bigquery-catalog.md
+++ b/versioned_docs/version-4.x/lakehouse/catalogs/bigquery-catalog.md
@@ -12,9 +12,6 @@ BigQuery Catalog uses the Trino BigQuery Connector to access
BigQuery tables thr
:::note
- This feature is experimental and supported since version 3.0.1.
-:::
-
-:::note
- This feature does not depend on a Trino cluster environment and only uses
the Trino compatibility plugin.
:::
diff --git
a/versioned_docs/version-4.x/lakehouse/catalogs/delta-lake-catalog.md
b/versioned_docs/version-4.x/lakehouse/catalogs/delta-lake-catalog.md
index 9133807f9da..569505d2229 100644
--- a/versioned_docs/version-4.x/lakehouse/catalogs/delta-lake-catalog.md
+++ b/versioned_docs/version-4.x/lakehouse/catalogs/delta-lake-catalog.md
@@ -2,139 +2,167 @@
{
"title": "Delta Lake Catalog",
"language": "en",
- "description": "Delta Lake Catalog uses the Trino Connector compatibility
framework to access Delta Lake tables through the Delta Lake Connector."
+ "description": "Apache Doris Delta Lake Catalog User Guide: Connect to
Delta Lake data lake through Trino Connector framework to query and integrate
Delta Lake table data. Supports Hive Metastore, multiple data type mappings,
and quick integration between Delta Lake and Doris."
}
---
-Delta Lake Catalog uses the [Trino
Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide/)
compatibility framework to access Delta Lake tables through the Delta Lake
Connector.
+## Overview
+
+Delta Lake Catalog uses the [Trino
Connector](https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide/)
compatibility framework with Trino Delta Lake Connector to access Delta Lake
tables.
:::note
-This feature is experimental and has been supported since version 3.0.1.
+- This is an experimental feature, supported since version 3.0.1.
+- This feature does not depend on a Trino cluster environment, it only uses
the Trino compatibility plugin.
:::
-## Application Scenarios
-
-| Scenario | Description |
-| -------------- | ------------------------------------ |
-| Data Integration | Read Delta Lake data and write it into Doris internal
tables. |
-| Data Writeback | Not supported. |
+### Use Cases
-## Environment Preparation
+| Scenario | Support Status |
+| -------- | -------------- |
+| Data Integration | Read Delta Lake data and write to Doris internal tables |
+| Data Write-back | Not supported |
-### Compile the Delta Lake Connector Plugin
+### Version Compatibility
-> JDK 17 is required.
+- **Doris Version**: 3.0.1 and above
+- **Trino Connector Version**: 435
+- **Delta Lake Version**: For supported versions, please refer to [Trino
Documentation](https://trino.io/docs/435/connector/delta-lake.html)
-```shell
-$ git clone https://github.com/apache/doris-thirdparty.git
-$ cd doris-thirdparty
-$ git checkout trino-435
-$ cd plugin/trino-delta-lake
-$ mvn clean install -DskipTest
-$ cd ../../lib/trino-hdfs
-$ mvn clean install -DskipTest
-```
+## Quick Start
-After compiling, you will find the `trino-delta-lake-435` directory under
`trino/plugin/trino-delta-lake/target/` and the `hdfs` directory under
`trino/lib/trino-hdfs/target/`.
+### Step 1: Prepare Connector Plugin
-You can also directly download the precompiled
[trino-delta-lake-435-20240724.tar.gz](https://github.com/apache/Doris-thirdparty/releases/download/trino-435-20240724/trino-delta-lake-435-20240724.tar.gz)
and
[hdfs.tar.gz](https://github.com/apache/doris-thirdparty/releases/download/trino-435-20240724/trino-hdfs-435-20240724.tar.gz),
then extract them.
+You can obtain the Delta Lake Connector plugin using one of the following
methods:
-### Deploy the Delta Lake Connector
-
-Place the `trino-delta-lake-435/` directory in the `connectors/` directory of
all FE and BE deployment paths(If it does not exist, you can create it
manually) and extract `hdfs.tar.gz` into the `trino-delta-lake-435/` directory.
-
-```text
-├── bin
-├── conf
-├── connectors
-│ ├── trino-delta-lake-435
-│ │ ├── hdfs
-...
-```
+**Method 1: Use Pre-compiled Package (Recommended)**
-After deployment, it is recommended to restart the FE and BE nodes to ensure
the Connector is loaded correctly.
+Download the pre-compiled
[trino-delta-lake-435-20240724.tar.gz](https://github.com/apache/Doris-thirdparty/releases/download/trino-435-20240724/trino-delta-lake-435-20240724.tar.gz)
and
[hdfs.tar.gz](https://github.com/apache/doris-thirdparty/releases/download/trino-435-20240724/trino-hdfs-435-20240724.tar.gz)
and extract them.
-## Configuring Catalog
+**Method 2: Manual Compilation**
-### Syntax
+If you need custom compilation, follow these steps (requires JDK 17):
-```sql
-CREATE CATALOG [IF NOT EXISTS] catalog_name
-PROPERTIES (
- 'type' = 'trino-connector', -- required
- 'trino.connector.name' = 'delta_lake', -- required
- {TrinoProperties},
- {CommonProperties}
-);
+```shell
+git clone https://github.com/apache/doris-thirdparty.git
+cd doris-thirdparty
+git checkout trino-435
+cd plugin/trino-delta-lake
+mvn clean install -DskipTests
+cd ../../lib/trino-hdfs
+mvn clean install -DskipTests
```
-* `{TrinoProperties}`
-
- The TrinoProperties section is used to specify properties that will be
passed to the Trino Connector. These properties use the `trino.` prefix. In
theory, all properties supported by Trino are also supported here. For more
information about Delta Lake, refer to the [Trino
documentation](https://trino.io/docs/435/connector/delta-lake.html).
-
-* `[CommonProperties]`
-
- The CommonProperties section is used to specify general properties. Please
refer to the [Catalog Overview](../catalog-overview.md) under the "Common
Properties" section.
-
-### Supported Delta Lake Versions
+After compilation, you'll get the `trino-delta-lake-435` directory under
`trino/plugin/trino-delta-lake/target/`, and the `hdfs` directory under
`trino/lib/trino-hdfs/target/`.
-For more information about Delta Lake, refer to the [Trino
documentation](https://trino.io/docs/435/connector/delta-lake.html).
+### Step 2: Deploy Plugin
-### Supported Metadata Services
+1. Place the `trino-delta-lake-435/` directory under the `connectors/`
directory of all FE and BE deployment paths (create the directory manually if
it doesn't exist):
-For more information about Delta Lake, refer to the [Trino
documentation](https://trino.io/docs/435/connector/delta-lake.html).
+ ```text
+ ├── bin
+ ├── conf
+ ├── plugins
+ │ ├── connectors
+ │ ├── trino-delta-lake-435
+ │ ├── hdfs
+ ...
+ ```
-### Supported Storage Systems
+ > You can also customize the plugin path by modifying the
`trino_connector_plugin_dir` configuration in `fe.conf`. For example:
`trino_connector_plugin_dir=/path/to/connectors/`
-For more information about Delta Lake, refer to the [Trino
documentation](https://trino.io/docs/435/connector/delta-lake.html).
+2. Restart all FE and BE nodes to ensure the connector is loaded correctly.
-## Column Type Mapping
+### Step 3: Create Catalog
-| Delta Lake Type | Trino Type | Doris Type | Comment |
-| --------------- | --------------------------- | ------------- | ------- |
-| boolean | boolean | boolean | |
-| int | int | int | |
-| byte | tinyint | tinyint | |
-| short | smallint | smallint | |
-| long | bigint | bigint | |
-| float | real | float | |
-| double | double | double | |
-| decimal(P, S) | decimal(P, S) | decimal(P, S) | |
-| string | varchar | string | |
-| bianry | varbinary | string | |
-| date | date | date | |
-| timestamp\_ntz | timestamp(N) | datetime(N) | |
-| timestamp | timestamp with time zone(N) | datetime(N) | |
-| array | array | array | |
-| map | map | map | |
-| struct | row | struct | |
-
-## Examples
+**Basic Configuration**
```sql
-CREATE CATALOG delta_lake_hms properties (
- 'type' = 'trino-connector',
+CREATE CATALOG delta_lake_catalog PROPERTIES (
+ 'type' = 'trino-connector',
'trino.connector.name' = 'delta_lake',
'trino.hive.metastore' = 'thrift',
- 'trino.hive.metastore.uri'= 'thrift://ip:port',
-
'trino.hive.config.resources'='/path/to/core-site.xml,/path/to/hdfs-site.xml'
+ 'trino.hive.metastore.uri' = 'thrift://ip:port',
+ 'trino.hive.config.resources' =
'/path/to/core-site.xml,/path/to/hdfs-site.xml'
);
```
-## Query Operations
+**Configuration Description**
+
+- `trino.hive.metastore`: Metadata service type, supports `thrift` (Hive
Metastore), etc.
+- `trino.hive.metastore.uri`: Hive Metastore service address
+- `trino.hive.config.resources`: Hadoop configuration file path, multiple
files separated by commas
+
+For more configuration options, please refer to the "Configuration
Description" section below or [Trino Official
Documentation](https://trino.io/docs/435/connector/delta-lake.html).
-After configuring the Catalog, you can query the table data in the Catalog
using the following methods:
+### Step 4: Query Data
+
+After creating the Catalog, you can query Delta Lake table data using one of
the following three methods:
```sql
--- 1. Switch to the catalog, use the database, and query
-SWITCH delta_lake_ctl;
+-- Method 1: Switch to Catalog then query
+SWITCH delta_lake_catalog;
USE delta_lake_db;
SELECT * FROM delta_lake_tbl LIMIT 10;
--- 2. Use the Delta Lake database directly
-USE delta_lake_ctl.delta_lake_db;
+-- Method 2: Use two-level path
+USE delta_lake_catalog.delta_lake_db;
SELECT * FROM delta_lake_tbl LIMIT 10;
--- 3. Use the fully qualified name to query
-SELECT * FROM delta_lake_ctl.delta_lake_db.delta_lake_tbl LIMIT 10;
+-- Method 3: Use fully qualified name
+SELECT * FROM delta_lake_catalog.delta_lake_db.delta_lake_tbl LIMIT 10;
+```
+
+## Configuration Description
+
+### Catalog Configuration Parameters
+
+The basic syntax for creating a Delta Lake Catalog is as follows:
+
+```sql
+CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
+ 'type' = 'trino-connector', -- Required, fixed value
+ 'trino.connector.name' = 'delta_lake', -- Required, fixed value
+ {TrinoProperties}, -- Trino Connector related
properties
+ {CommonProperties} -- Common properties
+);
```
+
+#### TrinoProperties Parameters
+
+TrinoProperties are used to configure Trino Delta Lake Connector-specific
properties, which are prefixed with `trino.`. Common parameters include:
+
+| Parameter Name | Required | Default Value | Description |
+| -------------- | -------- | ------------- | ----------- |
+| `trino.hive.metastore` | Yes | - | Metadata service type, such as `thrift` |
+| `trino.hive.metastore.uri` | Yes | - | Hive Metastore service address |
+| `trino.hive.config.resources` | No | - | Hadoop configuration file path,
multiple files separated by commas |
+| `trino.delta.hide-non-delta-tables` | No | false | Whether to hide non-Delta
Lake tables |
+
+For more Delta Lake Connector configuration parameters, please refer to [Trino
Official Documentation](https://trino.io/docs/435/connector/delta-lake.html).
+
+#### CommonProperties Parameters
+
+CommonProperties are used to configure common Catalog properties, such as
metadata refresh policies and permission control. For detailed information,
please refer to the "Common Properties" section in [Catalog
Overview](../catalog-overview.md).
+
+## Data Type Mapping
+
+When using Delta Lake Catalog, data types are mapped according to the
following rules:
+
+| Delta Lake Type | Trino Type | Doris Type | Notes |
+| --------------- | ---------- | ---------- | ----- |
+| boolean | boolean | boolean | |
+| int | int | int | |
+| byte | tinyint | tinyint | |
+| short | smallint | smallint | |
+| long | bigint | bigint | |
+| float | real | float | |
+| double | double | double | |
+| decimal(P, S) | decimal(P, S) | decimal(P, S) | |
+| string | varchar | string | |
+| binary | varbinary | string | |
+| date | date | date | |
+| timestamp_ntz | timestamp(N) | datetime(N) | |
+| timestamp | timestamp with time zone(N) | datetime(N) | |
+| array | array | array | |
+| map | map | map | |
+| struct | row | struct | |
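+
+As a hedged illustration of the mapping above, a Delta Lake `timestamp_ntz`
column surfaces in Doris as `datetime(N)` and a `decimal(P, S)` column keeps
its precision, so both can be filtered directly; the catalog, database, table,
and column names below are placeholders:
+
+```sql
+-- Placeholder names; adjust to your own catalog, schema, table, and columns.
+-- event_time: timestamp_ntz -> datetime, amount: decimal(P, S) -> decimal(P, S)
+SELECT event_time, amount
+FROM delta_lake_catalog.delta_lake_db.delta_lake_tbl
+WHERE event_time >= '2024-01-01 00:00:00'
+LIMIT 10;
+```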
\ No newline at end of file
diff --git a/versioned_docs/version-4.x/lakehouse/catalogs/kafka-catalog.md
b/versioned_docs/version-4.x/lakehouse/catalogs/kafka-catalog.md
index e69de29bb2d..0c71183434e 100644
--- a/versioned_docs/version-4.x/lakehouse/catalogs/kafka-catalog.md
+++ b/versioned_docs/version-4.x/lakehouse/catalogs/kafka-catalog.md
@@ -0,0 +1,358 @@
+---
+{
+ "title": "Kafka Catalog",
+ "language": "en",
+ "description": "Apache Doris Kafka Catalog guide: Connect to Kafka data
streams through Trino Connector framework to query and integrate Kafka Topic
data. Supports Schema Registry, multiple data formats for quick Kafka and Doris
data integration."
+}
+---
+
+## Overview
+
+Kafka Catalog uses the Trino Kafka Connector through the [Trino
Connector](https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide/)
compatibility framework to access Kafka Topic data.
+
+:::note
+- This is an experimental feature, supported since version 3.0.1.
+- This feature does not depend on a Trino cluster environment; it only uses
Trino-compatible plugins.
+:::
+
+### Use Cases
+
+| Scenario | Support Status |
+| -------- | -------------- |
+| Data Integration | Read Kafka Topic data and write to Doris internal tables |
+| Data Write-back | Not supported |
+
+### Version Compatibility
+
+- **Doris Version**: 3.0.1 and above
+- **Trino Connector Version**: 435
+- **Kafka Version**: For supported versions, please refer to [Trino
Documentation](https://trino.io/docs/435/connector/kafka.html)
+
+## Quick Start
+
+### Step 1: Prepare Connector Plugin
+
+You can obtain the Kafka Connector plugin using one of the following methods:
+
+**Method 1: Use Pre-compiled Package (Recommended)**
+
+Download and extract the pre-compiled plugin package from
[here](https://github.com/apache/doris-thirdparty/releases/tag/trino-435-20240724).
+
+**Method 2: Manual Compilation**
+
+If you need custom compilation, follow these steps (requires JDK 17):
+
+```shell
+git clone https://github.com/apache/doris-thirdparty.git
+cd doris-thirdparty
+git checkout trino-435
+cd plugin/trino-kafka
+mvn clean package -Dmaven.test.skip=true
+```
+
+After compilation, you will get the `trino-kafka-435/` directory under
`trino/plugin/trino-kafka/target/`.
+
+### Step 2: Deploy Plugin
+
+1. Place the `trino-kafka-435/` directory in the `connectors/` directory of
all FE and BE deployment paths (create the directory manually if it doesn't
exist):
+
+ ```text
+ ├── bin
+ ├── conf
+ ├── plugins
+ │ ├── connectors
+ │ ├── trino-kafka-435
+ ...
+ ```
+
+ > You can also customize the plugin path by modifying the
`trino_connector_plugin_dir` configuration in `fe.conf`. For example:
`trino_connector_plugin_dir=/path/to/connectors/`
+
+2. Restart all FE and BE nodes to ensure the connector is properly loaded.
+
+### Step 3: Create Catalog
+
+**Basic Configuration**
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>,<broker2>:<port2>',
+ 'trino.kafka.table-names' = 'test_db.topic_name',
+ 'trino.kafka.hide-internal-columns' = 'false'
+);
+```
+
+**Using Configuration File**
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>,<broker2>:<port2>',
+ 'trino.kafka.config.resources' = '/path/to/kafka-client.properties',
+ 'trino.kafka.hide-internal-columns' = 'false'
+);
+```
+
+**Configure Default Schema**
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>,<broker2>:<port2>',
+ 'trino.kafka.default-schema' = 'default_db',
+ 'trino.kafka.hide-internal-columns' = 'false'
+);
+```
+
+### Step 4: Query Data
+
+After creating the catalog, you can query Kafka Topic data using one of three
methods:
+
+```sql
+-- Method 1: Switch to catalog and query
+SWITCH kafka;
+USE kafka_schema;
+SELECT * FROM topic_name LIMIT 10;
+
+-- Method 2: Use two-level path
+USE kafka.kafka_schema;
+SELECT * FROM topic_name LIMIT 10;
+
+-- Method 3: Use fully qualified name
+SELECT * FROM kafka.kafka_schema.topic_name LIMIT 10;
+```
+
+## Schema Registry Integration
+
+Kafka Catalog supports automatic schema retrieval through Confluent Schema
Registry, eliminating the need to manually define table structures.
+
+### Configure Schema Registry
+
+**Basic Authentication**
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>',
+ 'trino.kafka.table-description-supplier' = 'CONFLUENT',
+ 'trino.kafka.confluent-schema-registry-url' =
'http://<schema-registry-host>:<schema-registry-port>',
+ 'trino.kafka.confluent-schema-registry-auth-type' = 'BASIC_AUTH',
+ 'trino.kafka.confluent-schema-registry.basic-auth.username' = 'admin',
+ 'trino.kafka.confluent-schema-registry.basic-auth.password' = 'admin123',
+ 'trino.kafka.hide-internal-columns' = 'false'
+);
+```
+
+**Complete Configuration Example**
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>',
+ 'trino.kafka.default-schema' = 'nrdp',
+ 'trino.kafka.table-description-supplier' = 'CONFLUENT',
+ 'trino.kafka.confluent-schema-registry-url' =
'http://<schema-registry-host>:<schema-registry-port>',
+ 'trino.kafka.confluent-schema-registry-auth-type' = 'BASIC_AUTH',
+ 'trino.kafka.confluent-schema-registry.basic-auth.username' = 'admin',
+ 'trino.kafka.confluent-schema-registry.basic-auth.password' = 'admin123',
+ 'trino.kafka.config.resources' = '/path/to/kafka-client.properties',
+ 'trino.kafka.confluent-schema-registry-subject-mapping' =
'nrdp.topic1:NRDP.topic1',
+ 'trino.kafka.hide-internal-columns' = 'false'
+);
+```
+
+### Schema Registry Parameters
+
+| Parameter Name | Required | Default | Description |
+| -------------- | -------- | ------- | ----------- |
+| `trino.kafka.table-description-supplier` | No | - | Set to `CONFLUENT` to
enable Schema Registry support |
+| `trino.kafka.confluent-schema-registry-url` | Yes (when
`table-description-supplier` is `CONFLUENT`) | - | Schema Registry service
address |
+| `trino.kafka.confluent-schema-registry-auth-type` | No | NONE |
Authentication type: NONE, BASIC_AUTH, BEARER |
+| `trino.kafka.confluent-schema-registry.basic-auth.username` | No | - | Basic
Auth username |
+| `trino.kafka.confluent-schema-registry.basic-auth.password` | No | - | Basic
Auth password |
+| `trino.kafka.confluent-schema-registry-subject-mapping` | No | - | Subject
name mapping, format: `<db1>.<tbl1>:<topic_name1>,<db2>.<tbl2>:<topic_name2>` |
+
+:::tip
+When using Schema Registry, Doris will automatically retrieve Topic schema
information from Schema Registry, eliminating the need to manually create table
structures.
+:::
+
+### Subject Mapping
+
+In some cases, the Subject name registered in Schema Registry may not match
the Topic name in Kafka, preventing data queries. In such cases, you need to
manually specify the mapping relationship through
`confluent-schema-registry-subject-mapping`.
+
+```sql
+-- Map schema.topic to SCHEMA.topic Subject in Schema Registry
+'trino.kafka.confluent-schema-registry-subject-mapping' =
'<db1>.<tbl1>:<topic_name1>'
+```
+
+Where `db1` and `tbl1` are the actual Database and Table names seen in Doris,
and `topic_name1` is the actual Topic name in Kafka (case-sensitive).
+
+Multiple mappings can be separated by commas:
+
+```sql
+'trino.kafka.confluent-schema-registry-subject-mapping' =
'<db1>.<tbl1>:<topic_name1>,<db2>.<tbl2>:<topic_name2>'
+```
+
+## Configuration
+
+### Catalog Configuration Parameters
+
+The basic syntax for creating a Kafka Catalog is as follows:
+
+```sql
+CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
+ 'type' = 'trino-connector', -- Required, fixed value
+ 'trino.connector.name' = 'kafka', -- Required, fixed value
+ {TrinoProperties}, -- Trino Connector related properties
+ {CommonProperties} -- Common properties
+);
+```
+
+#### TrinoProperties Parameters
+
+TrinoProperties are used to configure Trino Kafka Connector-specific
properties, which are prefixed with `trino.`. Common parameters include:
+
+| Parameter Name | Required | Default | Description |
+| -------------- | -------- | ------- | ----------- |
+| `trino.kafka.nodes` | Yes | - | Kafka Broker node address list, format:
`host1:port1,host2:port2` |
+| `trino.kafka.table-names` | No | - | List of Topics to map, format:
`schema.topic1,schema.topic2` |
+| `trino.kafka.default-schema` | No | default | Default schema name |
+| `trino.kafka.hide-internal-columns` | No | true | Whether to hide Kafka
internal columns (such as `_partition_id`, `_partition_offset`, etc.) |
+| `trino.kafka.config.resources` | No | - | Kafka client configuration file
path |
+| `trino.kafka.table-description-supplier` | No | - | Table structure
provider, set to `CONFLUENT` to use Schema Registry |
+| `trino.kafka.confluent-schema-registry-url` | No | - | Schema Registry
service address |
+
+For more Kafka Connector configuration parameters, please refer to [Trino
Official Documentation](https://trino.io/docs/435/connector/kafka.html).
+
+#### CommonProperties Parameters
+
+CommonProperties are used to configure general catalog properties, such as
metadata refresh policies and permission control. For detailed information,
please refer to the "Common Properties" section in [Catalog
Overview](../catalog-overview.md).
+
+### Kafka Client Configuration
+
+When you need to configure advanced Kafka client parameters (such as security
authentication, SSL, etc.), you can specify them through a configuration file.
Create a configuration file (e.g., `kafka-client.properties`):
+
+```properties
+# ============================================
+# Kerberos/SASL Authentication Configuration
+# ============================================
+sasl.mechanism=GSSAPI
+sasl.kerberos.service.name=kafka
+
+# JAAS Configuration - Using keytab method
+sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required \
+ useKeyTab=true \
+ storeKey=true \
+ useTicketCache=false \
+ serviceName="kafka" \
+ keyTab="/opt/trino/security/keytabs/kafka.keytab" \
+ principal="[email protected]";
+
+# ============================================
+# Avro Deserializer Configuration
+# ============================================
+key.deserializer=io.confluent.kafka.serializers.KafkaAvroDeserializer
+value.deserializer=io.confluent.kafka.serializers.KafkaAvroDeserializer
+```
+
+Then specify the configuration file when creating the catalog:
+
+```sql
+CREATE CATALOG kafka PROPERTIES (
+ 'type' = 'trino-connector',
+ 'trino.connector.name' = 'kafka',
+ 'trino.kafka.nodes' = '<broker1>:<port1>',
+ 'trino.kafka.config.resources' = '/path/to/kafka-client.properties'
+);
+```
+
+## Data Type Mapping
+
+When using Kafka Catalog, data types are mapped according to the following
rules:
+
+| Kafka/Avro Type | Trino Type | Doris Type | Notes |
+| --------------- | ---------- | ---------- | ----- |
+| boolean | boolean | boolean | |
+| int | integer | int | |
+| long | bigint | bigint | |
+| float | real | float | |
+| double | double | double | |
+| bytes | varbinary | string | Use `HEX(col)` function to query |
+| string | varchar | string | |
+| array | array | array | |
+| map | map | map | |
+| record | row | struct | Complex nested structure |
+| enum | varchar | string | |
+| fixed | varbinary | string | |
+| null | - | - | |
+
+:::tip
+- For the `bytes` type, use the `HEX()` function to display values in
hexadecimal format (see the example after this tip).
+- The data types supported by Kafka Catalog depend on the serialization format
used (JSON, Avro, Protobuf, etc.) and Schema Registry configuration.
+:::
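+
+For instance, a minimal sketch of reading a `bytes` field through `HEX()`;
`payload` is a hypothetical column name, and the schema and topic names are
placeholders:
+
+```sql
+-- payload is a hypothetical bytes/varbinary column; HEX() renders it as hex text
+SELECT HEX(payload) AS payload_hex
+FROM kafka.kafka_schema.topic_name
+LIMIT 10;
+```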
+
+## Kafka Internal Columns
+
+Kafka Connector provides some internal columns to access metadata information
of Kafka messages:
+
+| Column Name | Type | Description |
+| ----------- | ---- | ----------- |
+| `_partition_id` | bigint | Partition ID where the message is located |
+| `_partition_offset` | bigint | Message offset within the partition |
+| `_message_timestamp` | timestamp | Message timestamp |
+| `_key` | varchar | Message key |
+| `_key_corrupt` | boolean | Whether the key is corrupted |
+| `_key_length` | bigint | Key byte length |
+| `_message` | varchar | Raw message content |
+| `_message_corrupt` | boolean | Whether the message is corrupted |
+| `_message_length` | bigint | Message byte length |
+| `_headers` | map | Message header information |
+
+By default, these internal columns are hidden. If you need to query them, set
the following property when creating the catalog:
+
+```sql
+'trino.kafka.hide-internal-columns' = 'false'
+```
+
+Query example:
+
+```sql
+SELECT
+ _partition_id,
+ _partition_offset,
+ _message_timestamp,
+ *
+FROM kafka.schema.topic_name
+LIMIT 10;
+```
+
+## Limitations
+
+1. **Read-only Access**: Kafka Catalog only supports reading data; write
operations (INSERT, UPDATE, DELETE) are not supported.
+
+2. **Table Names Configuration**: When not using Schema Registry, you need to
explicitly specify the list of Topics to access through the
`trino.kafka.table-names` parameter.
+
+3. **Schema Definition**:
+ - When using Schema Registry, schema information is automatically retrieved
from Schema Registry.
+ - When not using Schema Registry, you need to manually create table
definitions or use Trino's Topic description files.
+
+4. **Data Format**: Supported data formats depend on the serialization method
used by the Topic (JSON, Avro, Protobuf, etc.). For details, please refer to
[Trino Official Documentation](https://trino.io/docs/435/connector/kafka.html).
+
+5. **Performance Considerations**:
+ - Kafka Catalog reads Kafka data in real-time; querying large amounts of
data may affect performance.
+   - It is recommended to use the `LIMIT` clause or time-based filter
conditions to limit the amount of data scanned (see the sketch after this
list).
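+
+The following is a hedged sketch of such a bounded query. It assumes
`trino.kafka.hide-internal-columns` is set to `false` so that
`_message_timestamp` is visible, and the schema and topic names are
placeholders; whether the time filter is pushed down to Kafka depends on the
connector:
+
+```sql
+-- Bound the query with a time filter on an internal column plus a LIMIT
+SELECT _partition_id, _partition_offset, _message
+FROM kafka.kafka_schema.topic_name
+WHERE _message_timestamp >= DATE_SUB(NOW(), INTERVAL 1 HOUR)
+LIMIT 100;
+```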
+
+## Feature Debugging
+
+You can refer to [this demo
environment](https://github.com/morningman/demo-env/tree/main/kafka) to quickly
set up a Kafka environment for feature verification.
+
+## References
+
+- [Trino Kafka Connector Official
Documentation](https://trino.io/docs/435/connector/kafka.html)
+- [Trino Connector Development
Guide](https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide/)
+- [Confluent Schema Registry
Documentation](https://docs.confluent.io/platform/current/schema-registry/index.html)
\ No newline at end of file
diff --git a/versioned_docs/version-4.x/lakehouse/catalogs/kudu-catalog.md
b/versioned_docs/version-4.x/lakehouse/catalogs/kudu-catalog.md
index 57dc1b19e90..de31ed9c638 100644
--- a/versioned_docs/version-4.x/lakehouse/catalogs/kudu-catalog.md
+++ b/versioned_docs/version-4.x/lakehouse/catalogs/kudu-catalog.md
@@ -12,9 +12,6 @@ Kudu Catalog accesses Kudu tables through the [Trino
Connector](https://doris.ap
:::note
- This is an experimental feature, supported since version 3.0.1.
-:::
-
-:::note
- This feature does not depend on a Trino cluster environment and only uses
the Trino compatibility plugin.
:::
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]