This is an automated email from the ASF dual-hosted git repository.
morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
     new 5ca3fc01b43 [opt] add iceberg dlf, obs pfs and iceberg alter complex column doc (#3505)
5ca3fc01b43 is described below
commit 5ca3fc01b43dc5214ec3c10fbb85352137c9eb2c
Author: Mingyu Chen (Rayner) <[email protected]>
AuthorDate: Mon Mar 30 12:40:23 2026 -0700
[opt] add iceberg dlf, obs pfs and iceberg alter complex column doc (#3505)
## Versions
- [x] dev
- [x] 4.x
- [x] 3.x
- [ ] 2.1
## Languages
- [x] Chinese
- [x] English
## Docs Checklist
- [ ] Checked by AI
- [ ] Test Cases Built
---
docs/lakehouse/best-practices/doris-dlf-iceberg.md | 120 +++++++++++++++++++++
docs/lakehouse/catalogs/iceberg-catalog.mdx | 55 ++++++++++
docs/lakehouse/storages/huawei-obs.md | 5 +-
.../lakehouse/best-practices/doris-dlf-iceberg.md | 120 +++++++++++++++++++++
.../current/lakehouse/catalogs/iceberg-catalog.mdx | 56 ++++++++++
.../current/lakehouse/storages/huawei-obs.md | 5 +-
.../lakehouse/best-practices/doris-dlf-iceberg.md | 120 +++++++++++++++++++++
.../lakehouse/catalogs/iceberg-catalog.mdx | 56 ++++++++++
.../version-3.x/lakehouse/storages/huawei-obs.md | 5 +-
.../lakehouse/best-practices/doris-dlf-iceberg.md | 120 +++++++++++++++++++++
.../lakehouse/catalogs/iceberg-catalog.mdx | 56 ++++++++++
.../version-4.x/lakehouse/storages/huawei-obs.md | 5 +-
sidebars.ts | 3 +-
.../lakehouse/best-practices/doris-dlf-iceberg.md | 120 +++++++++++++++++++++
.../lakehouse/catalogs/iceberg-catalog.mdx | 55 ++++++++++
.../version-3.x/lakehouse/storages/huawei-obs.md | 5 +-
.../lakehouse/best-practices/doris-dlf-iceberg.md | 120 +++++++++++++++++++++
.../lakehouse/catalogs/iceberg-catalog.mdx | 55 ++++++++++
.../version-4.x/lakehouse/storages/huawei-obs.md | 5 +-
versioned_sidebars/version-3.x-sidebars.json | 3 +-
versioned_sidebars/version-4.x-sidebars.json | 3 +-
21 files changed, 1083 insertions(+), 9 deletions(-)
diff --git a/docs/lakehouse/best-practices/doris-dlf-iceberg.md b/docs/lakehouse/best-practices/doris-dlf-iceberg.md
new file mode 100644
index 00000000000..cf8dca7ca59
--- /dev/null
+++ b/docs/lakehouse/best-practices/doris-dlf-iceberg.md
@@ -0,0 +1,120 @@
+---
+{
+ "title": "Integrating Alibaba Cloud DLF Rest Catalog",
+ "language": "en",
+  "description": "This article explains how to integrate Apache Doris with Alibaba Cloud DLF (Data Lake Formation) Rest Catalog for seamless access and analysis of Iceberg table data, including guides on creating a Catalog and querying data."
+}
+---
+
+Alibaba Cloud [Data Lake Formation (DLF)](https://cn.aliyun.com/product/bigdata/dlf), as a core component of the cloud-native data lake architecture, helps users quickly build cloud-native data lake solutions. DLF provides unified metadata management on the data lake, enterprise-level permission control, and seamless integration with multiple compute engines, breaking down data silos and enabling business insights.
+
+- Unified Metadata and Storage
+
+    Big data compute engines share a single set of lake metadata and storage, with data flowing seamlessly between lake products.
+
+- Unified Permission Management
+
+    Big data compute engines share a single set of lake table permission configurations, enabling one-time setup with universal effect.
+
+- Storage Optimization
+
+    Provides optimization strategies including small file compaction, expired snapshot cleanup, partition reorganization, and obsolete file cleanup to improve storage efficiency.
+
+- Comprehensive Cloud Ecosystem Support
+
+    Deep integration with Alibaba Cloud products, including streaming and batch compute engines, delivering out-of-the-box functionality and an enhanced user experience.
+
+Doris supports integration with the DLF Iceberg Rest Catalog starting from version 4.1.0, enabling seamless connection to DLF for accessing and analyzing Iceberg table data. This article demonstrates how to connect Apache Doris with DLF and access Iceberg table data.
+
+:::tip
+This feature is supported starting from Doris version 4.1.0.
+:::
+
+## Usage Guide
+
+### 01 Enable DLF Service
+
+Please refer to the official DLF documentation to enable the DLF service and create the corresponding Catalog, Database, and Table.
+
+### 02 Access DLF Using EMR Spark SQL
+
+- Connect
+
+    ```shell
+    spark-sql --master yarn \
+    --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
+    --conf spark.sql.catalog.iceberg=org.apache.iceberg.spark.SparkCatalog \
+    --conf spark.sql.catalog.iceberg.catalog-impl=org.apache.iceberg.rest.RESTCatalog \
+    --conf spark.sql.catalog.iceberg.uri=http://<region>-vpc.dlf.aliyuncs.com/iceberg \
+    --conf spark.sql.catalog.iceberg.warehouse=<your-catalog-name> \
+    --conf spark.sql.catalog.iceberg.credential=<ak>:<sk>
+    ```
+
+ > Replace the corresponding `<region>`, `warehouse`, `<ak>`, and `<sk>`.
+
+- Write Data
+
+ ```sql
+ USE iceberg.<your-catalog-name>;
+
+ CREATE TABLE users_samples
+ (
+ user_id INT,
+ age_level STRING,
+ final_gender_code STRING,
+ clk BOOLEAN
+ ) USING iceberg;
+
+ INSERT INTO users_samples VALUES
+ (1, '25-34', 'M', true),
+ (2, '18-24', 'F', false);
+
+ INSERT INTO users_samples VALUES
+ (3, '25-34', 'M', true),
+ (4, '18-24', 'F', false);
+
+ INSERT INTO users_samples VALUES
+ (5, '25-34', 'M', true),
+ (6, '18-24', 'F', false);
+ ```
+
+### 03 Connect to DLF Using Doris
+
+- Create Iceberg Catalog
+
+ ```sql
+ CREATE CATALOG ice PROPERTIES (
+ 'type' = 'iceberg',
+ 'iceberg.catalog.type' = 'rest',
+ 'iceberg.rest.uri' = 'http://<region>-vpc.dlf.aliyuncs.com/iceberg',
+ 'warehouse' = '<your-catalog-name>',
+ 'iceberg.rest.sigv4-enabled' = 'true',
+ 'iceberg.rest.signing-name' = 'DlfNext',
+ 'iceberg.rest.access-key-id' = '<ak>',
+ 'iceberg.rest.secret-access-key' = '<sk>',
+ 'iceberg.rest.signing-region' = '<region>',
+ 'iceberg.rest.vended-credentials-enabled' = 'true',
+ 'io-impl' = 'org.apache.iceberg.rest.DlfFileIO',
+ 'fs.oss.support' = 'true'
+ );
+ ```
+
+    - Doris uses the temporary credentials returned by DLF to access OSS object storage, so no additional OSS credentials are required.
+    - DLF can only be accessed from within the same VPC. Make sure you provide the correct URI.
+    - The DLF Iceberg REST Catalog requires SigV4 signing to be enabled, with the DLF-specific signing name `DlfNext`.
+
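The property set above has a fixed shape. As a quick sanity check before running `CREATE CATALOG`, a hypothetical helper (illustrative only, not part of Doris) can verify that a property map contains the keys the DLF REST Catalog statement uses:

```python
# Hypothetical validation helper, not part of Doris: checks a property map
# for the keys used in the CREATE CATALOG statement above.
REQUIRED_DLF_REST_KEYS = {
    "type", "iceberg.catalog.type", "iceberg.rest.uri", "warehouse",
    "iceberg.rest.sigv4-enabled", "iceberg.rest.signing-name",
    "iceberg.rest.access-key-id", "iceberg.rest.secret-access-key",
    "iceberg.rest.signing-region",
}

def missing_dlf_properties(props: dict) -> set:
    """Return the required DLF REST catalog keys absent from `props`."""
    return REQUIRED_DLF_REST_KEYS - props.keys()

props = {"type": "iceberg", "iceberg.catalog.type": "rest"}
print(sorted(missing_dlf_properties(props)))
```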
+- Query Data
+
+ ```sql
+ SELECT * FROM users_samples ORDER BY user_id;
+ +---------+-----------+-------------------+------+
+ | user_id | age_level | final_gender_code | clk |
+ +---------+-----------+-------------------+------+
+ | 1 | 25-34 | M | 1 |
+ | 2 | 18-24 | F | 0 |
+ | 3 | 25-34 | M | 1 |
+ | 4 | 18-24 | F | 0 |
+ | 5 | 25-34 | M | 1 |
+ | 6 | 18-24 | F | 0 |
+ +---------+-----------+-------------------+------+
+ ```
diff --git a/docs/lakehouse/catalogs/iceberg-catalog.mdx b/docs/lakehouse/catalogs/iceberg-catalog.mdx
index 069cae29745..99aaccccd9e 100644
--- a/docs/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/docs/lakehouse/catalogs/iceberg-catalog.mdx
@@ -621,6 +621,29 @@ Support for Nested Namespace needs to be explicitly enabled. For details, please
</details>
### Aliyun DLF
+<details>
+ <summary>4.1+ Version</summary>
+ <Tabs>
+ <TabItem value='DLF 2.5+' label='DLF 2.5+' default>
+ ```sql
+ CREATE CATALOG iceberg_dlf2_catalog PROPERTIES (
+ 'type' = 'iceberg',
+ 'iceberg.catalog.type' = 'rest',
+                'iceberg.rest.uri' = 'http://<region>-vpc.dlf.aliyuncs.com/iceberg',
+ 'warehouse' = '<your-catalog-name>',
+ 'iceberg.rest.sigv4-enabled' = 'true',
+ 'iceberg.rest.signing-name' = 'DlfNext',
+ 'iceberg.rest.access-key-id' = '<ak>',
+ 'iceberg.rest.secret-access-key' = '<sk>',
+ 'iceberg.rest.signing-region' = '<region>',
+ 'iceberg.rest.vended-credentials-enabled' = 'true',
+ 'io-impl' = 'org.apache.iceberg.rest.DlfFileIO',
+ 'fs.oss.support' = 'true'
+ );
+ ```
+ </TabItem>
+ </Tabs>
+</details>
<details>
<summary>3.1+ Version</summary>
<Tabs>
@@ -2157,6 +2180,21 @@ Supported schema change operations include:
Use the `MODIFY COLUMN` statement to modify column attributes, including type, nullable, default value, comment, and column position.
+  Since version 4.0.4, Doris supports modifying complex types (STRUCT, ARRAY, MAP), including safe type promotions and appending struct fields.
+
+ Safe type promotions supported in nested types:
+ - INT -> BIGINT, LARGEINT
+ - TINYINT -> SMALLINT, INT, BIGINT, LARGEINT
+ - SMALLINT -> INT, BIGINT, LARGEINT
+ - BIGINT -> LARGEINT
+ - FLOAT -> DOUBLE
+ - VARCHAR(n) -> VARCHAR(m) where m > n
+
+ Constraints for complex types:
+ - All new nested fields must be nullable.
+ - Cannot change optional to required.
+ - Default values for complex types only support NULL.
+
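The promotion rules above amount to a small widening-only compatibility check. A minimal sketch in Python (illustrative only, not Doris's implementation; the rule table is transcribed from the list above):

```python
# Illustrative sketch of the safe type promotions listed above.
# Not Doris source code; just the widening rules as a lookup table.
SAFE_PROMOTIONS = {
    "TINYINT": {"SMALLINT", "INT", "BIGINT", "LARGEINT"},
    "SMALLINT": {"INT", "BIGINT", "LARGEINT"},
    "INT": {"BIGINT", "LARGEINT"},
    "BIGINT": {"LARGEINT"},
    "FLOAT": {"DOUBLE"},
}

def is_safe_promotion(old: str, new: str) -> bool:
    """Return True if changing a nested field from `old` to `new` is a safe widening."""
    if old == new:
        return True
    # VARCHAR(n) -> VARCHAR(m) is safe only when m > n
    if old.startswith("VARCHAR(") and new.startswith("VARCHAR("):
        n = int(old[8:-1])
        m = int(new[8:-1])
        return m > n
    return new in SAFE_PROMOTIONS.get(old, set())

print(is_safe_promotion("INT", "BIGINT"))          # widening, allowed
print(is_safe_promotion("BIGINT", "INT"))          # narrowing, rejected
print(is_safe_promotion("VARCHAR(10)", "VARCHAR(20)"))
```

The second `ALTER TABLE` example further below (promoting `ARRAY<INT>` to `ARRAY<BIGINT>`) is exactly the `INT -> BIGINT` case applied to a nested element type.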
Note: When modifying column attributes, all attributes that are not being modified should also be explicitly specified with their original values.
```sql
@@ -2174,6 +2212,23 @@ Supported schema change operations include:
ALTER TABLE iceberg_table MODIFY COLUMN id BIGINT NOT NULL DEFAULT 0 COMMENT 'This is a modified id column';
```
+ Example of modifying complex types:
+
+ ```sql
+ -- Create Iceberg table with complex types
+ CREATE TABLE iceberg_tbl (
+ id BIGINT,
+ user_info STRUCT<name:STRING, scores:ARRAY<INT>, age:INT>,
+ dt STRING
+ );
+
+ -- Append a new field (email) to the STRUCT column
+  ALTER TABLE iceberg_tbl MODIFY COLUMN user_info STRUCT<name:STRING, scores:ARRAY<INT>, age:INT, email:STRING>;
+
+ -- Promote the nested ARRAY element type from INT to BIGINT
+  ALTER TABLE iceberg_tbl MODIFY COLUMN user_info STRUCT<name:STRING, scores:ARRAY<BIGINT>, age:INT, email:STRING>;
+ ```
+
* **Reorder Columns**
Use `ORDER BY` to reorder columns by specifying the new column order.
diff --git a/docs/lakehouse/storages/huawei-obs.md b/docs/lakehouse/storages/huawei-obs.md
index 4574ffee15d..e9d2842422c 100644
--- a/docs/lakehouse/storages/huawei-obs.md
+++ b/docs/lakehouse/storages/huawei-obs.md
@@ -14,8 +14,11 @@ This document describes the parameters required to access Huawei Cloud OBS, which
- Export properties
- Outfile properties
-**Doris uses S3 Client to access Huawei Cloud OBS through S3-compatible protocol.**
+**Doris supports accessing Huawei Cloud OBS through the S3-compatible protocol (using the S3 Client) or the native OBS protocol (using the native SDK).**
+:::info
+Starting from versions 3.0.5 and 4.1.0, Doris natively integrates the Huawei Cloud OBS SDK. Users can access OBS data (such as a Paimon Catalog) directly with the `obs://` prefix, which also provides better support for Huawei Cloud's Parallel File System (PFS).
+:::
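The choice between the two access paths comes down to the location's URI scheme. A small sketch (illustrative only, not Doris internals) of that dispatch:

```python
# Illustrative sketch (not Doris internals): pick an OBS access path from the
# location's URI scheme, mirroring the two options described above.
def obs_access_mode(location: str) -> str:
    if location.startswith("obs://"):
        # Native OBS SDK path (3.0.5 / 4.1.0+); better suited to PFS buckets.
        return "native-sdk"
    if location.startswith("s3://"):
        # S3-compatible protocol via the S3 client.
        return "s3-client"
    raise ValueError(f"unsupported scheme in {location!r}")

print(obs_access_mode("obs://my-bucket/warehouse/"))
```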
## Parameter Overview
| Property Name | Former Name | Description | Default Value | Required |
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/best-practices/doris-dlf-iceberg.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/best-practices/doris-dlf-iceberg.md
new file mode 100644
index 00000000000..68583e33f09
--- /dev/null
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/best-practices/doris-dlf-iceberg.md
@@ -0,0 +1,120 @@
+---
+{
+ "title": "集成阿里云 DLF Rest Catalog",
+ "language": "zh-CN",
+  "description": "本文介绍如何使用 Apache Doris 集成阿里云 DLF(Data Lake Formation)Rest Catalog,实现 Iceberg 表数据的无缝访问与分析,包括创建 Catalog、查询数据等操作指南。"
+}
+---
+
+阿里云数据湖构建 [Data Lake Formation,DLF](https://cn.aliyun.com/product/bigdata/dlf) 作为云原生数据湖架构核心组成部分,帮助用户快速地构建云原生数据湖架构。数据湖构建提供湖上元数据统一管理、企业级权限控制,并无缝对接多种计算引擎,打破数据孤岛,洞察业务价值。
+
+- 统一元数据与存储
+
+ 大数据计算引擎共享一套湖上元数据和存储,且数据可在环湖产品间流动。
+
+- 统一权限管理
+
+ 大数据计算引擎共享一套湖表权限配置,实现一次配置,多处生效。
+
+- 存储优化
+
+ 提供小文件合并、过期快照清理、分区整理及废弃文件清理等优化策略,提升存储效率。
+
+- 完善的云生态支持体系
+
+ 深度整合阿里云产品,包括流批计算引擎,实现开箱即用,提升用户体验与操作便捷性。
+
+Doris 自 3.0.5/4.1.0 版本开始,支持集成 DLF Iceberg Rest Catalog,可以无缝对接 DLF,访问并分析 Iceberg 表数据。本文将演示如何使用 Apache Doris 对接 DLF 并进行 Iceberg 表数据访问。
+
+:::tip
+该功能从 Doris 4.1.0 版本开始支持。
+:::
+
+## 使用指南
+
+### 01 开通 DLF 服务
+
+请参考 DLF 官方文档开通 DLF 服务,并创建相应的 Catalog、Database 和 Table。
+
+### 02 使用 EMR Spark SQL 访问 DLF
+
+- 连接
+
+    ```shell
+    spark-sql --master yarn \
+    --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
+    --conf spark.sql.catalog.iceberg=org.apache.iceberg.spark.SparkCatalog \
+    --conf spark.sql.catalog.iceberg.catalog-impl=org.apache.iceberg.rest.RESTCatalog \
+    --conf spark.sql.catalog.iceberg.uri=http://<region>-vpc.dlf.aliyuncs.com/iceberg \
+    --conf spark.sql.catalog.iceberg.warehouse=<your-catalog-name> \
+    --conf spark.sql.catalog.iceberg.credential=<ak>:<sk>
+    ```
+
+ > 替换对应的 `<region>`, `warehouse`, `<ak>`, 和 `<sk>`。
+
+- 写入数据
+
+ ```sql
+ USE iceberg.<your-catalog-name>;
+
+ CREATE TABLE users_samples
+ (
+ user_id INT,
+ age_level STRING,
+ final_gender_code STRING,
+ clk BOOLEAN
+ ) USING iceberg;
+
+ INSERT INTO users_samples VALUES
+ (1, '25-34', 'M', true),
+ (2, '18-24', 'F', false);
+
+ INSERT INTO users_samples VALUES
+ (3, '25-34', 'M', true),
+ (4, '18-24', 'F', false);
+
+ INSERT INTO users_samples VALUES
+ (5, '25-34', 'M', true),
+ (6, '18-24', 'F', false);
+ ```
+
+### 03 使用 Doris 连接 DLF
+
+- 创建 Iceberg Catalog
+
+ ```sql
+ CREATE CATALOG ice PROPERTIES (
+ 'type' = 'iceberg',
+ 'iceberg.catalog.type' = 'rest',
+ 'iceberg.rest.uri' = 'http://<region>-vpc.dlf.aliyuncs.com/iceberg',
+ 'warehouse' = '<your-catalog-name>',
+ 'iceberg.rest.sigv4-enabled' = 'true',
+ 'iceberg.rest.signing-name' = 'DlfNext',
+ 'iceberg.rest.access-key-id' = '<ak>',
+ 'iceberg.rest.secret-access-key' = '<sk>',
+ 'iceberg.rest.signing-region' = '<region>',
+ 'iceberg.rest.vended-credentials-enabled' = 'true',
+ 'io-impl' = 'org.apache.iceberg.rest.DlfFileIO',
+ 'fs.oss.support' = 'true'
+ );
+ ```
+
+ - Doris 会使用 DLF 返回的临时凭证访问 OSS 对象存储,不需要额外提供 OSS 的凭证信息。
+ - 仅支持在同 VPC 内访问 DLF,注意提供正确的 uri 地址。
+    - 访问 DLF Iceberg REST Catalog 需要启用 SigV4 签名机制,并填写专用的 API 签名名称 `DlfNext` 以及正确的 Region。
+
+- 查询数据
+
+ ```sql
+ SELECT * FROM users_samples ORDER BY user_id;
+ +---------+-----------+-------------------+------+
+ | user_id | age_level | final_gender_code | clk |
+ +---------+-----------+-------------------+------+
+ | 1 | 25-34 | M | 1 |
+ | 2 | 18-24 | F | 0 |
+ | 3 | 25-34 | M | 1 |
+ | 4 | 18-24 | F | 0 |
+ | 5 | 25-34 | M | 1 |
+ | 6 | 18-24 | F | 0 |
+ +---------+-----------+-------------------+------+
+ ```
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.mdx b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.mdx
index e2e40c95b31..48d5dc912a9 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.mdx
@@ -610,6 +610,30 @@ Iceberg 的元数层级关系是 Catalog -> Namespace -> Table。其中 Namespace
</details>
### Aliyun DLF
+<details>
+ <summary>4.1+ 版本</summary>
+ <Tabs>
+ <TabItem value='DLF 2.5+' label='DLF 2.5+' default>
+ ```sql
+ CREATE CATALOG iceberg_dlf2_catalog PROPERTIES (
+ 'type' = 'iceberg',
+ 'iceberg.catalog.type' = 'rest',
+            'iceberg.rest.uri' = 'http://<region>-vpc.dlf.aliyuncs.com/iceberg',
+ 'warehouse' = '<your-catalog-name>',
+ 'iceberg.rest.sigv4-enabled' = 'true',
+ 'iceberg.rest.signing-name' = 'DlfNext',
+ 'iceberg.rest.access-key-id' = '<ak>',
+ 'iceberg.rest.secret-access-key' = '<sk>',
+ 'iceberg.rest.signing-region' = '<region>',
+ 'iceberg.rest.vended-credentials-enabled' = 'true',
+ 'io-impl' = 'org.apache.iceberg.rest.DlfFileIO',
+ 'fs.oss.support' = 'true'
+ );
+ ```
+ </TabItem>
+ </Tabs>
+</details>
+
<details>
<summary>3.1+ 版本</summary>
<Tabs>
@@ -2169,6 +2193,21 @@ DROP DATABASE [IF EXISTS] iceberg.iceberg_db;
通过 `MODIFY COLUMN` 语句修改列的属性,包括类型、nullable、默认值、注释和列位置。
+ 自 4.0.4 版本起,Doris 支持修改复杂类型(STRUCT、ARRAY、MAP),包括安全的类型推导和追加 Struct 字段。
+
+ 嵌套类型中支持的安全类型推导:
+ - INT -> BIGINT, LARGEINT
+ - TINYINT -> SMALLINT, INT, BIGINT, LARGEINT
+ - SMALLINT -> INT, BIGINT, LARGEINT
+ - BIGINT -> LARGEINT
+ - FLOAT -> DOUBLE
+ - VARCHAR(n) -> VARCHAR(m) 其中 m > n
+
+ 修改复杂类型的限制:
+ - 所有新的嵌套字段必须为 nullable。
+ - 不能将可选(optional)改为必填(required)。
+ - 复杂类型的默认值仅支持 NULL。
+
注意:修改列的属性时,所有没有被修改的属性也应该显式地指定为原来的值。
```sql
@@ -2186,6 +2225,23 @@ DROP DATABASE [IF EXISTS] iceberg.iceberg_db;
ALTER TABLE iceberg_table MODIFY COLUMN id BIGINT NOT NULL DEFAULT 0 COMMENT 'This is a modified id column' FIRST;
```
+ 修改复杂类型的示例:
+
+ ```sql
+ -- 创建包含复杂类型的 Iceberg 表
+ CREATE TABLE iceberg_tbl (
+ id BIGINT,
+ user_info STRUCT<name:STRING, scores:ARRAY<INT>, age:INT>,
+ dt STRING
+ );
+
+ -- 为 STRUCT 类型的列追加新字段 (email)
+  ALTER TABLE iceberg_tbl MODIFY COLUMN user_info STRUCT<name:STRING, scores:ARRAY<INT>, age:INT, email:STRING>;
+
+ -- 将嵌套 ARRAY 元素类型从 INT 提升为 BIGINT
+  ALTER TABLE iceberg_tbl MODIFY COLUMN user_info STRUCT<name:STRING, scores:ARRAY<BIGINT>, age:INT, email:STRING>;
+ ```
+
* **重新排序**
通过 `ORDER BY` 重新排序列,指定新的列顺序。
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/storages/huawei-obs.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/storages/huawei-obs.md
index bce53ff6f39..ce82d9dc97f 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/storages/huawei-obs.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/storages/huawei-obs.md
@@ -14,8 +14,11 @@
- Export 属性
- Outfile 属性
-**Doris 使用 S3 Client,通过 S3 兼容协议访问华为云 OBS。**
+**Doris 支持通过 S3 兼容协议(使用 S3 Client)或 OBS 原生协议(基于原生 SDK)访问华为云 OBS。**
+:::info
+自 3.0.5 和 4.1.0 版本开始,Doris 默认内置了华为云 OBS 的原生 SDK。用户可以直接通过 `obs://` 前缀访问 OBS 数据(如 Paimon Catalog),并且能够更好地支持华为云的并行文件系统(PFS)。
+:::
## 参数总览
| 属性名称 | 曾用名 | 描述 | 默认值 | 是否必须 |
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/best-practices/doris-dlf-iceberg.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/best-practices/doris-dlf-iceberg.md
new file mode 100644
index 00000000000..68583e33f09
--- /dev/null
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/best-practices/doris-dlf-iceberg.md
@@ -0,0 +1,120 @@
+---
+{
+ "title": "集成阿里云 DLF Rest Catalog",
+ "language": "zh-CN",
+  "description": "本文介绍如何使用 Apache Doris 集成阿里云 DLF(Data Lake Formation)Rest Catalog,实现 Iceberg 表数据的无缝访问与分析,包括创建 Catalog、查询数据等操作指南。"
+}
+---
+
+阿里云数据湖构建 [Data Lake Formation,DLF](https://cn.aliyun.com/product/bigdata/dlf) 作为云原生数据湖架构核心组成部分,帮助用户快速地构建云原生数据湖架构。数据湖构建提供湖上元数据统一管理、企业级权限控制,并无缝对接多种计算引擎,打破数据孤岛,洞察业务价值。
+
+- 统一元数据与存储
+
+ 大数据计算引擎共享一套湖上元数据和存储,且数据可在环湖产品间流动。
+
+- 统一权限管理
+
+ 大数据计算引擎共享一套湖表权限配置,实现一次配置,多处生效。
+
+- 存储优化
+
+ 提供小文件合并、过期快照清理、分区整理及废弃文件清理等优化策略,提升存储效率。
+
+- 完善的云生态支持体系
+
+ 深度整合阿里云产品,包括流批计算引擎,实现开箱即用,提升用户体验与操作便捷性。
+
+Doris 自 3.0.5/4.1.0 版本开始,支持集成 DLF Iceberg Rest Catalog,可以无缝对接 DLF,访问并分析 Iceberg 表数据。本文将演示如何使用 Apache Doris 对接 DLF 并进行 Iceberg 表数据访问。
+
+:::tip
+该功能从 Doris 4.1.0 版本开始支持。
+:::
+
+## 使用指南
+
+### 01 开通 DLF 服务
+
+请参考 DLF 官方文档开通 DLF 服务,并创建相应的 Catalog、Database 和 Table。
+
+### 02 使用 EMR Spark SQL 访问 DLF
+
+- 连接
+
+    ```shell
+    spark-sql --master yarn \
+    --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
+    --conf spark.sql.catalog.iceberg=org.apache.iceberg.spark.SparkCatalog \
+    --conf spark.sql.catalog.iceberg.catalog-impl=org.apache.iceberg.rest.RESTCatalog \
+    --conf spark.sql.catalog.iceberg.uri=http://<region>-vpc.dlf.aliyuncs.com/iceberg \
+    --conf spark.sql.catalog.iceberg.warehouse=<your-catalog-name> \
+    --conf spark.sql.catalog.iceberg.credential=<ak>:<sk>
+    ```
+
+ > 替换对应的 `<region>`, `warehouse`, `<ak>`, 和 `<sk>`。
+
+- 写入数据
+
+ ```sql
+ USE iceberg.<your-catalog-name>;
+
+ CREATE TABLE users_samples
+ (
+ user_id INT,
+ age_level STRING,
+ final_gender_code STRING,
+ clk BOOLEAN
+ ) USING iceberg;
+
+ INSERT INTO users_samples VALUES
+ (1, '25-34', 'M', true),
+ (2, '18-24', 'F', false);
+
+ INSERT INTO users_samples VALUES
+ (3, '25-34', 'M', true),
+ (4, '18-24', 'F', false);
+
+ INSERT INTO users_samples VALUES
+ (5, '25-34', 'M', true),
+ (6, '18-24', 'F', false);
+ ```
+
+### 03 使用 Doris 连接 DLF
+
+- 创建 Iceberg Catalog
+
+ ```sql
+ CREATE CATALOG ice PROPERTIES (
+ 'type' = 'iceberg',
+ 'iceberg.catalog.type' = 'rest',
+ 'iceberg.rest.uri' = 'http://<region>-vpc.dlf.aliyuncs.com/iceberg',
+ 'warehouse' = '<your-catalog-name>',
+ 'iceberg.rest.sigv4-enabled' = 'true',
+ 'iceberg.rest.signing-name' = 'DlfNext',
+ 'iceberg.rest.access-key-id' = '<ak>',
+ 'iceberg.rest.secret-access-key' = '<sk>',
+ 'iceberg.rest.signing-region' = '<region>',
+ 'iceberg.rest.vended-credentials-enabled' = 'true',
+ 'io-impl' = 'org.apache.iceberg.rest.DlfFileIO',
+ 'fs.oss.support' = 'true'
+ );
+ ```
+
+ - Doris 会使用 DLF 返回的临时凭证访问 OSS 对象存储,不需要额外提供 OSS 的凭证信息。
+ - 仅支持在同 VPC 内访问 DLF,注意提供正确的 uri 地址。
+    - 访问 DLF Iceberg REST Catalog 需要启用 SigV4 签名机制,并填写专用的 API 签名名称 `DlfNext` 以及正确的 Region。
+
+- 查询数据
+
+ ```sql
+ SELECT * FROM users_samples ORDER BY user_id;
+ +---------+-----------+-------------------+------+
+ | user_id | age_level | final_gender_code | clk |
+ +---------+-----------+-------------------+------+
+ | 1 | 25-34 | M | 1 |
+ | 2 | 18-24 | F | 0 |
+ | 3 | 25-34 | M | 1 |
+ | 4 | 18-24 | F | 0 |
+ | 5 | 25-34 | M | 1 |
+ | 6 | 18-24 | F | 0 |
+ +---------+-----------+-------------------+------+
+ ```
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
index a30bc74ea0e..9badeb0d509 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
@@ -608,6 +608,30 @@ Iceberg 的元数层级关系是 Catalog -> Namespace -> Table。其中 Namespace
</details>
### Aliyun DLF
+<details>
+ <summary>4.1+ 版本</summary>
+ <Tabs>
+ <TabItem value='DLF 2.5+' label='DLF 2.5+' default>
+ ```sql
+ CREATE CATALOG iceberg_dlf2_catalog PROPERTIES (
+ 'type' = 'iceberg',
+ 'iceberg.catalog.type' = 'rest',
+            'iceberg.rest.uri' = 'http://<region>-vpc.dlf.aliyuncs.com/iceberg',
+ 'warehouse' = '<your-catalog-name>',
+ 'iceberg.rest.sigv4-enabled' = 'true',
+ 'iceberg.rest.signing-name' = 'DlfNext',
+ 'iceberg.rest.access-key-id' = '<ak>',
+ 'iceberg.rest.secret-access-key' = '<sk>',
+ 'iceberg.rest.signing-region' = '<region>',
+ 'iceberg.rest.vended-credentials-enabled' = 'true',
+ 'io-impl' = 'org.apache.iceberg.rest.DlfFileIO',
+ 'fs.oss.support' = 'true'
+ );
+ ```
+ </TabItem>
+ </Tabs>
+</details>
+
<details>
<summary>3.1+ 版本</summary>
<Tabs>
@@ -2151,6 +2175,21 @@ DROP DATABASE [IF EXISTS] iceberg.iceberg_db;
通过 `MODIFY COLUMN` 语句修改列的属性,包括类型、nullable、默认值、注释和列位置。
+ 自 4.0.4 版本起,Doris 支持修改复杂类型(STRUCT、ARRAY、MAP),包括安全的类型推导和追加 Struct 字段。
+
+ 嵌套类型中支持的安全类型推导:
+ - INT -> BIGINT, LARGEINT
+ - TINYINT -> SMALLINT, INT, BIGINT, LARGEINT
+ - SMALLINT -> INT, BIGINT, LARGEINT
+ - BIGINT -> LARGEINT
+ - FLOAT -> DOUBLE
+ - VARCHAR(n) -> VARCHAR(m) 其中 m > n
+
+ 修改复杂类型的限制:
+ - 所有新的嵌套字段必须为 nullable。
+ - 不能将可选(optional)改为必填(required)。
+ - 复杂类型的默认值仅支持 NULL。
+
注意:修改列的属性时,所有没有被修改的属性也应该显式地指定为原来的值。
```sql
@@ -2168,6 +2207,23 @@ DROP DATABASE [IF EXISTS] iceberg.iceberg_db;
ALTER TABLE iceberg_table MODIFY COLUMN id BIGINT NOT NULL DEFAULT 0 COMMENT 'This is a modified id column' FIRST;
```
+ 修改复杂类型的示例:
+
+ ```sql
+ -- 创建包含复杂类型的 Iceberg 表
+ CREATE TABLE iceberg_tbl (
+ id BIGINT,
+ user_info STRUCT<name:STRING, scores:ARRAY<INT>, age:INT>,
+ dt STRING
+ );
+
+ -- 为 STRUCT 类型的列追加新字段 (email)
+  ALTER TABLE iceberg_tbl MODIFY COLUMN user_info STRUCT<name:STRING, scores:ARRAY<INT>, age:INT, email:STRING>;
+
+ -- 将嵌套 ARRAY 元素类型从 INT 提升为 BIGINT
+  ALTER TABLE iceberg_tbl MODIFY COLUMN user_info STRUCT<name:STRING, scores:ARRAY<BIGINT>, age:INT, email:STRING>;
+ ```
+
* **重新排序**
通过 `ORDER BY` 重新排序列,指定新的列顺序。
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/storages/huawei-obs.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/storages/huawei-obs.md
index bce53ff6f39..ce82d9dc97f 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/storages/huawei-obs.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/storages/huawei-obs.md
@@ -14,8 +14,11 @@
- Export 属性
- Outfile 属性
-**Doris 使用 S3 Client,通过 S3 兼容协议访问华为云 OBS。**
+**Doris 支持通过 S3 兼容协议(使用 S3 Client)或 OBS 原生协议(基于原生 SDK)访问华为云 OBS。**
+:::info
+自 3.0.5 和 4.1.0 版本开始,Doris 默认内置了华为云 OBS 的原生 SDK。用户可以直接通过 `obs://` 前缀访问 OBS 数据(如 Paimon Catalog),并且能够更好地支持华为云的并行文件系统(PFS)。
+:::
## 参数总览
| 属性名称 | 曾用名 | 描述 | 默认值 | 是否必须 |
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/best-practices/doris-dlf-iceberg.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/best-practices/doris-dlf-iceberg.md
new file mode 100644
index 00000000000..68583e33f09
--- /dev/null
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/best-practices/doris-dlf-iceberg.md
@@ -0,0 +1,120 @@
+---
+{
+ "title": "集成阿里云 DLF Rest Catalog",
+ "language": "zh-CN",
+  "description": "本文介绍如何使用 Apache Doris 集成阿里云 DLF(Data Lake Formation)Rest Catalog,实现 Iceberg 表数据的无缝访问与分析,包括创建 Catalog、查询数据等操作指南。"
+}
+---
+
+阿里云数据湖构建 [Data Lake Formation,DLF](https://cn.aliyun.com/product/bigdata/dlf) 作为云原生数据湖架构核心组成部分,帮助用户快速地构建云原生数据湖架构。数据湖构建提供湖上元数据统一管理、企业级权限控制,并无缝对接多种计算引擎,打破数据孤岛,洞察业务价值。
+
+- 统一元数据与存储
+
+ 大数据计算引擎共享一套湖上元数据和存储,且数据可在环湖产品间流动。
+
+- 统一权限管理
+
+ 大数据计算引擎共享一套湖表权限配置,实现一次配置,多处生效。
+
+- 存储优化
+
+ 提供小文件合并、过期快照清理、分区整理及废弃文件清理等优化策略,提升存储效率。
+
+- 完善的云生态支持体系
+
+ 深度整合阿里云产品,包括流批计算引擎,实现开箱即用,提升用户体验与操作便捷性。
+
+Doris 自 3.0.5/4.1.0 版本开始,支持集成 DLF Iceberg Rest Catalog,可以无缝对接 DLF,访问并分析 Iceberg 表数据。本文将演示如何使用 Apache Doris 对接 DLF 并进行 Iceberg 表数据访问。
+
+:::tip
+该功能从 Doris 4.1.0 版本开始支持。
+:::
+
+## 使用指南
+
+### 01 开通 DLF 服务
+
+请参考 DLF 官方文档开通 DLF 服务,并创建相应的 Catalog、Database 和 Table。
+
+### 02 使用 EMR Spark SQL 访问 DLF
+
+- 连接
+
+    ```shell
+    spark-sql --master yarn \
+    --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
+    --conf spark.sql.catalog.iceberg=org.apache.iceberg.spark.SparkCatalog \
+    --conf spark.sql.catalog.iceberg.catalog-impl=org.apache.iceberg.rest.RESTCatalog \
+    --conf spark.sql.catalog.iceberg.uri=http://<region>-vpc.dlf.aliyuncs.com/iceberg \
+    --conf spark.sql.catalog.iceberg.warehouse=<your-catalog-name> \
+    --conf spark.sql.catalog.iceberg.credential=<ak>:<sk>
+    ```
+
+ > 替换对应的 `<region>`, `warehouse`, `<ak>`, 和 `<sk>`。
+
+- 写入数据
+
+ ```sql
+ USE iceberg.<your-catalog-name>;
+
+ CREATE TABLE users_samples
+ (
+ user_id INT,
+ age_level STRING,
+ final_gender_code STRING,
+ clk BOOLEAN
+ ) USING iceberg;
+
+ INSERT INTO users_samples VALUES
+ (1, '25-34', 'M', true),
+ (2, '18-24', 'F', false);
+
+ INSERT INTO users_samples VALUES
+ (3, '25-34', 'M', true),
+ (4, '18-24', 'F', false);
+
+ INSERT INTO users_samples VALUES
+ (5, '25-34', 'M', true),
+ (6, '18-24', 'F', false);
+ ```
+
+### 03 使用 Doris 连接 DLF
+
+- 创建 Iceberg Catalog
+
+ ```sql
+ CREATE CATALOG ice PROPERTIES (
+ 'type' = 'iceberg',
+ 'iceberg.catalog.type' = 'rest',
+ 'iceberg.rest.uri' = 'http://<region>-vpc.dlf.aliyuncs.com/iceberg',
+ 'warehouse' = '<your-catalog-name>',
+ 'iceberg.rest.sigv4-enabled' = 'true',
+ 'iceberg.rest.signing-name' = 'DlfNext',
+ 'iceberg.rest.access-key-id' = '<ak>',
+ 'iceberg.rest.secret-access-key' = '<sk>',
+ 'iceberg.rest.signing-region' = '<region>',
+ 'iceberg.rest.vended-credentials-enabled' = 'true',
+ 'io-impl' = 'org.apache.iceberg.rest.DlfFileIO',
+ 'fs.oss.support' = 'true'
+ );
+ ```
+
+ - Doris 会使用 DLF 返回的临时凭证访问 OSS 对象存储,不需要额外提供 OSS 的凭证信息。
+ - 仅支持在同 VPC 内访问 DLF,注意提供正确的 uri 地址。
+    - 访问 DLF Iceberg REST Catalog 需要启用 SigV4 签名机制,并填写专用的 API 签名名称 `DlfNext` 以及正确的 Region。
+
+- 查询数据
+
+ ```sql
+ SELECT * FROM users_samples ORDER BY user_id;
+ +---------+-----------+-------------------+------+
+ | user_id | age_level | final_gender_code | clk |
+ +---------+-----------+-------------------+------+
+ | 1 | 25-34 | M | 1 |
+ | 2 | 18-24 | F | 0 |
+ | 3 | 25-34 | M | 1 |
+ | 4 | 18-24 | F | 0 |
+ | 5 | 25-34 | M | 1 |
+ | 6 | 18-24 | F | 0 |
+ +---------+-----------+-------------------+------+
+ ```
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
index a30bc74ea0e..9badeb0d509 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
@@ -608,6 +608,30 @@ Iceberg 的元数层级关系是 Catalog -> Namespace -> Table。其中 Namespace
</details>
### Aliyun DLF
+<details>
+ <summary>4.1+ 版本</summary>
+ <Tabs>
+ <TabItem value='DLF 2.5+' label='DLF 2.5+' default>
+ ```sql
+ CREATE CATALOG iceberg_dlf2_catalog PROPERTIES (
+ 'type' = 'iceberg',
+ 'iceberg.catalog.type' = 'rest',
+            'iceberg.rest.uri' = 'http://<region>-vpc.dlf.aliyuncs.com/iceberg',
+ 'warehouse' = '<your-catalog-name>',
+ 'iceberg.rest.sigv4-enabled' = 'true',
+ 'iceberg.rest.signing-name' = 'DlfNext',
+ 'iceberg.rest.access-key-id' = '<ak>',
+ 'iceberg.rest.secret-access-key' = '<sk>',
+ 'iceberg.rest.signing-region' = '<region>',
+ 'iceberg.rest.vended-credentials-enabled' = 'true',
+ 'io-impl' = 'org.apache.iceberg.rest.DlfFileIO',
+ 'fs.oss.support' = 'true'
+ );
+ ```
+ </TabItem>
+ </Tabs>
+</details>
+
<details>
<summary>3.1+ 版本</summary>
<Tabs>
@@ -2151,6 +2175,21 @@ DROP DATABASE [IF EXISTS] iceberg.iceberg_db;
通过 `MODIFY COLUMN` 语句修改列的属性,包括类型、nullable、默认值、注释和列位置。
+ 自 4.0.4 版本起,Doris 支持修改复杂类型(STRUCT、ARRAY、MAP),包括安全的类型推导和追加 Struct 字段。
+
+ 嵌套类型中支持的安全类型推导:
+ - INT -> BIGINT, LARGEINT
+ - TINYINT -> SMALLINT, INT, BIGINT, LARGEINT
+ - SMALLINT -> INT, BIGINT, LARGEINT
+ - BIGINT -> LARGEINT
+ - FLOAT -> DOUBLE
+ - VARCHAR(n) -> VARCHAR(m) 其中 m > n
+
+ 修改复杂类型的限制:
+ - 所有新的嵌套字段必须为 nullable。
+ - 不能将可选(optional)改为必填(required)。
+ - 复杂类型的默认值仅支持 NULL。
+
注意:修改列的属性时,所有没有被修改的属性也应该显式地指定为原来的值。
```sql
@@ -2168,6 +2207,23 @@ DROP DATABASE [IF EXISTS] iceberg.iceberg_db;
ALTER TABLE iceberg_table MODIFY COLUMN id BIGINT NOT NULL DEFAULT 0 COMMENT 'This is a modified id column' FIRST;
```
+ 修改复杂类型的示例:
+
+ ```sql
+ -- 创建包含复杂类型的 Iceberg 表
+ CREATE TABLE iceberg_tbl (
+ id BIGINT,
+ user_info STRUCT<name:STRING, scores:ARRAY<INT>, age:INT>,
+ dt STRING
+ );
+
+ -- 为 STRUCT 类型的列追加新字段 (email)
+  ALTER TABLE iceberg_tbl MODIFY COLUMN user_info STRUCT<name:STRING, scores:ARRAY<INT>, age:INT, email:STRING>;
+
+ -- 将嵌套 ARRAY 元素类型从 INT 提升为 BIGINT
+  ALTER TABLE iceberg_tbl MODIFY COLUMN user_info STRUCT<name:STRING, scores:ARRAY<BIGINT>, age:INT, email:STRING>;
+ ```
+
* **重新排序**
通过 `ORDER BY` 重新排序列,指定新的列顺序。
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/storages/huawei-obs.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/storages/huawei-obs.md
index bce53ff6f39..ce82d9dc97f 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/storages/huawei-obs.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/storages/huawei-obs.md
@@ -14,8 +14,11 @@
- Export 属性
- Outfile 属性
-**Doris 使用 S3 Client,通过 S3 兼容协议访问华为云 OBS。**
+**Doris 支持通过 S3 兼容协议(使用 S3 Client)或 OBS 原生协议(基于原生 SDK)访问华为云 OBS。**
+:::info
+自 3.0.5 和 4.1.0 版本开始,Doris 默认内置了华为云 OBS 的原生 SDK。用户可以直接通过 `obs://` 前缀访问 OBS 数据(如 Paimon Catalog),并且能够更好地支持华为云的并行文件系统(PFS)。
+:::
## 参数总览
| 属性名称 | 曾用名 | 描述 | 默认值 | 是否必须 |
diff --git a/sidebars.ts b/sidebars.ts
index e7c028123b3..0398a2810e3 100644
--- a/sidebars.ts
+++ b/sidebars.ts
@@ -365,7 +365,8 @@ const sidebars: SidebarsConfig = {
'lakehouse/best-practices/doris-onelake',
'lakehouse/best-practices/doris-unity-catalog',
'lakehouse/best-practices/doris-lakekeeper',
-                'lakehouse/best-practices/doris-nessie'
+                'lakehouse/best-practices/doris-nessie',
+                'lakehouse/best-practices/doris-dlf-iceberg'
],
},
{
diff --git a/versioned_docs/version-3.x/lakehouse/best-practices/doris-dlf-iceberg.md b/versioned_docs/version-3.x/lakehouse/best-practices/doris-dlf-iceberg.md
new file mode 100644
index 00000000000..cf8dca7ca59
--- /dev/null
+++ b/versioned_docs/version-3.x/lakehouse/best-practices/doris-dlf-iceberg.md
@@ -0,0 +1,120 @@
+---
+{
+ "title": "Integrating Alibaba Cloud DLF Rest Catalog",
+ "language": "en",
+ "description": "This article explains how to integrate Apache Doris with
Alibaba Cloud DLF (Data Lake Formation) Rest Catalog for seamless access and
analysis of Iceberg table data, including guides on creating Catalog, querying
data, and incremental reading."
+}
+---
+
+Alibaba Cloud [Data Lake Formation
(DLF)](https://cn.aliyun.com/product/bigdata/dlf), as a core component of the
cloud-native data lake architecture, helps users quickly build cloud-native
data lake solutions. DLF provides unified metadata management on the data lake,
enterprise-level permission control, and seamless integration with multiple
compute engines, breaking down data silos and enabling business insights.
+
+- Unified Metadata and Storage
+
+ Big data compute engines share a single set of lake metadata and storage,
with data flowing seamlessly between lake products.
+
+- Unified Permission Management
+
+ Big data compute engines share a single set of lake table permission
configurations, enabling one-time setup with universal effect.
+
+- Storage Optimization
+
+ Provides optimization strategies including small file compaction, expired
snapshot cleanup, partition reorganization, and obsolete file cleanup to
improve storage efficiency.
+
+- Comprehensive Cloud Ecosystem Support
+
+ Deep integration with Alibaba Cloud products, including streaming and
batch compute engines, delivering out-of-the-box functionality and enhanced
user experience.
+
+Doris supports integration with DLF Iceberg Rest Catalog starting from version
4.1.0, enabling seamless connection to DLF for accessing and analyzing Iceberg
table data. This article demonstrates how to connect Apache Doris with DLF and
access Iceberg table data.
+
+:::tip
+This feature is supported starting from Doris version 4.1.0.
+:::
+
+## Usage Guide
+
+### 01 Enable DLF Service
+
+Please refer to the DLF official documentation to enable the DLF service and
create the corresponding Catalog, Database, and Table.
+
+### 02 Access DLF Using EMR Spark SQL
+
+- Connect
+
+ ```shell
+ spark-sql --master yarn \
+ --conf
spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
\
+ --conf spark.sql.catalog.iceberg=org.apache.iceberg.spark.SparkCatalog
\
+ --conf
spark.sql.catalog.iceberg.catalog-impl=org.apache.iceberg.rest.RESTCatalog \
+ --conf
spark.sql.catalog.iceberg.uri=http://<region>-vpc.dlf.aliyuncs.com/iceberg \
+ --conf spark.sql.catalog.iceberg.warehouse=<your-catalog-name> \
+ --conf spark.sql.catalog.iceberg.credential=<ak>:<sk>
+ ```
+
+ > Replace the corresponding `<region>`, `warehouse`, `<ak>`, and `<sk>`.
+
+- Write Data
+
+ ```sql
+ USE iceberg.<your-catalog-name>;
+
+ CREATE TABLE users_samples
+ (
+ user_id INT,
+ age_level STRING,
+ final_gender_code STRING,
+ clk BOOLEAN
+ ) USING iceberg;
+
+ INSERT INTO users_samples VALUES
+ (1, '25-34', 'M', true),
+ (2, '18-24', 'F', false);
+
+ INSERT INTO users_samples VALUES
+ (3, '25-34', 'M', true),
+ (4, '18-24', 'F', false);
+
+ INSERT INTO users_samples VALUES
+ (5, '25-34', 'M', true),
+ (6, '18-24', 'F', false);
+ ```
+
+### 03 Connect to DLF Using Doris
+
+- Create Iceberg Catalog
+
+ ```sql
+ CREATE CATALOG ice PROPERTIES (
+ 'type' = 'iceberg',
+ 'iceberg.catalog.type' = 'rest',
+ 'iceberg.rest.uri' = 'http://<region>-vpc.dlf.aliyuncs.com/iceberg',
+ 'warehouse' = '<your-catalog-name>',
+ 'iceberg.rest.sigv4-enabled' = 'true',
+ 'iceberg.rest.signing-name' = 'DlfNext',
+ 'iceberg.rest.access-key-id' = '<ak>',
+ 'iceberg.rest.secret-access-key' = '<sk>',
+ 'iceberg.rest.signing-region' = '<region>',
+ 'iceberg.rest.vended-credentials-enabled' = 'true',
+ 'io-impl' = 'org.apache.iceberg.rest.DlfFileIO',
+ 'fs.oss.support' = 'true'
+ );
+ ```
+
+ - Doris uses the temporary credentials returned by DLF to access OSS
object storage, so no additional OSS credentials are required.
+ - DLF can only be accessed within the same VPC. Ensure you provide the
correct URI address.
+ - The DLF Iceberg REST catalog requires SigV4 signing to be enabled, with the DLF-specific signing name `DlfNext`.
+
+- Query Data
+
+ ```sql
+ SELECT * FROM users_samples ORDER BY user_id;
+ +---------+-----------+-------------------+------+
+ | user_id | age_level | final_gender_code | clk |
+ +---------+-----------+-------------------+------+
+ | 1 | 25-34 | M | 1 |
+ | 2 | 18-24 | F | 0 |
+ | 3 | 25-34 | M | 1 |
+ | 4 | 18-24 | F | 0 |
+ | 5 | 25-34 | M | 1 |
+ | 6 | 18-24 | F | 0 |
+ +---------+-----------+-------------------+------+
+ ```
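The three separate `INSERT` statements above each commit a new Iceberg snapshot, so the table's history can also be inspected and time-traveled from Doris. A hedged sketch (the `<db>` placeholder and the snapshot id are hypothetical; list the real snapshot ids first):

```sql
-- List the table's snapshots via the iceberg_meta table function
SELECT snapshot_id, committed_at
FROM iceberg_meta(
    "table" = "ice.<db>.users_samples",
    "query_type" = "snapshots"
);

-- Read the table as of one of the listed snapshots (the id below is made up)
SELECT * FROM users_samples FOR VERSION AS OF 1234567890123456789;
```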
diff --git a/versioned_docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
b/versioned_docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
index c21e0355b12..6c1f2d11acc 100644
--- a/versioned_docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/versioned_docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
@@ -604,6 +604,29 @@ Support for Nested Namespace needs to be explicitly
enabled. For details, please
</details>
### Aliyun DLF
+<details>
+ <summary>4.1+ Version</summary>
+ <Tabs>
+ <TabItem value='DLF 2.5+' label='DLF 2.5+' default>
+ ```sql
+ CREATE CATALOG iceberg_dlf2_catalog PROPERTIES (
+ 'type' = 'iceberg',
+ 'iceberg.catalog.type' = 'rest',
+ 'iceberg.rest.uri' =
'http://<region>-vpc.dlf.aliyuncs.com/iceberg',
+ 'warehouse' = '<your-catalog-name>',
+ 'iceberg.rest.sigv4-enabled' = 'true',
+ 'iceberg.rest.signing-name' = 'DlfNext',
+ 'iceberg.rest.access-key-id' = '<ak>',
+ 'iceberg.rest.secret-access-key' = '<sk>',
+ 'iceberg.rest.signing-region' = '<region>',
+ 'iceberg.rest.vended-credentials-enabled' = 'true',
+ 'io-impl' = 'org.apache.iceberg.rest.DlfFileIO',
+ 'fs.oss.support' = 'true'
+ );
+ ```
+ </TabItem>
+ </Tabs>
+</details>
<details>
<summary>3.1+ Version</summary>
<Tabs>
@@ -2140,6 +2163,21 @@ Supported schema change operations include:
Use the `MODIFY COLUMN` statement to modify column attributes, including
type, nullable, default value, comment, and column position.
+ Since version 4.0.4, Doris supports modifying complex types (STRUCT,
ARRAY, MAP), including safe type promotions and appending struct fields.
+
+ Safe type promotions supported in nested types:
+ - INT -> BIGINT, LARGEINT
+ - TINYINT -> SMALLINT, INT, BIGINT, LARGEINT
+ - SMALLINT -> INT, BIGINT, LARGEINT
+ - BIGINT -> LARGEINT
+ - FLOAT -> DOUBLE
+ - VARCHAR(n) -> VARCHAR(m) where m > n
+
+ Constraints for complex types:
+ - All new nested fields must be nullable.
+ - Cannot change optional to required.
+ - Default values for complex types only support NULL.
+
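As an illustration of these constraints, consider a hypothetical nested field of type BIGINT (the table and column names here are made up, not taken from the examples below): widening is accepted, while narrowing is rejected.

```sql
-- Given a column: info STRUCT<id:BIGINT>
-- Widening BIGINT -> LARGEINT is a safe promotion and is accepted:
ALTER TABLE demo_tbl MODIFY COLUMN info STRUCT<id:LARGEINT>;

-- Narrowing (e.g. BIGINT -> INT) is not a safe promotion and is rejected:
-- ALTER TABLE demo_tbl MODIFY COLUMN info STRUCT<id:INT>;
```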
Note: When modifying column attributes, all attributes that are not being
modified should also be explicitly specified with their original values.
```sql
@@ -2157,6 +2195,23 @@ Supported schema change operations include:
ALTER TABLE iceberg_table MODIFY COLUMN id BIGINT NOT NULL DEFAULT 0 COMMENT
'This is a modified id column';
```
+ Example of modifying complex types:
+
+ ```sql
+ -- Create Iceberg table with complex types
+ CREATE TABLE iceberg_tbl (
+ id BIGINT,
+ user_info STRUCT<name:STRING, scores:ARRAY<INT>, age:INT>,
+ dt STRING
+ );
+
+ -- Append a new field (email) to the STRUCT column
+ ALTER TABLE iceberg_tbl MODIFY COLUMN user_info STRUCT<name:STRING, scores:ARRAY<INT>, age:INT, email:STRING>;
+
+ -- Promote the nested ARRAY element type from INT to BIGINT
+ ALTER TABLE iceberg_tbl MODIFY COLUMN user_info STRUCT<name:STRING, scores:ARRAY<BIGINT>, age:INT, email:STRING>;
+ ```
+
* **Reorder Columns**
Use `ORDER BY` to reorder columns by specifying the new column order.
diff --git a/versioned_docs/version-3.x/lakehouse/storages/huawei-obs.md
b/versioned_docs/version-3.x/lakehouse/storages/huawei-obs.md
index 4574ffee15d..e9d2842422c 100644
--- a/versioned_docs/version-3.x/lakehouse/storages/huawei-obs.md
+++ b/versioned_docs/version-3.x/lakehouse/storages/huawei-obs.md
@@ -14,8 +14,11 @@ This document describes the parameters required to access
Huawei Cloud OBS, whic
- Export properties
- Outfile properties
-**Doris uses S3 Client to access Huawei Cloud OBS through S3-compatible
protocol.**
+**Doris supports accessing Huawei Cloud OBS through S3-compatible protocol
(using S3 Client) or OBS native protocol (using native SDK).**
+:::info
+Starting from versions 3.0.5 and 4.1.0, Doris natively integrates the Huawei
Cloud OBS SDK. Users can access OBS data (such as Paimon Catalog) directly with
the `obs://` prefix, enabling better support for Huawei Cloud's Parallel File
System (PFS).
+:::
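A minimal sketch of the native-protocol path (property names follow the OBS parameter set described below; the bucket, endpoint, and credentials are placeholders):

```sql
CREATE CATALOG paimon_obs PROPERTIES (
    'type' = 'paimon',
    'warehouse' = 'obs://<bucket>/paimon-warehouse',
    'obs.endpoint' = 'obs.<region>.myhuaweicloud.com',
    'obs.access_key' = '<ak>',
    'obs.secret_key' = '<sk>'
);
```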
## Parameter Overview
| Property Name | Former Name | Description
| Default Value | Required |
diff --git
a/versioned_docs/version-4.x/lakehouse/best-practices/doris-dlf-iceberg.md
b/versioned_docs/version-4.x/lakehouse/best-practices/doris-dlf-iceberg.md
new file mode 100644
index 00000000000..cf8dca7ca59
--- /dev/null
+++ b/versioned_docs/version-4.x/lakehouse/best-practices/doris-dlf-iceberg.md
@@ -0,0 +1,120 @@
+---
+{
+ "title": "Integrating Alibaba Cloud DLF Rest Catalog",
+ "language": "en",
+ "description": "This article explains how to integrate Apache Doris with
Alibaba Cloud DLF (Data Lake Formation) Rest Catalog for seamless access and
analysis of Iceberg table data, including guides on creating Catalog, querying
data, and incremental reading."
+}
+---
+
+Alibaba Cloud [Data Lake Formation
(DLF)](https://cn.aliyun.com/product/bigdata/dlf), as a core component of the
cloud-native data lake architecture, helps users quickly build cloud-native
data lake solutions. DLF provides unified metadata management on the data lake,
enterprise-level permission control, and seamless integration with multiple
compute engines, breaking down data silos and enabling business insights.
+
+- Unified Metadata and Storage
+
+ Big data compute engines share a single set of lake metadata and storage,
with data flowing seamlessly between lake products.
+
+- Unified Permission Management
+
+ Big data compute engines share a single set of lake table permission
configurations, enabling one-time setup with universal effect.
+
+- Storage Optimization
+
+ Provides optimization strategies including small file compaction, expired
snapshot cleanup, partition reorganization, and obsolete file cleanup to
improve storage efficiency.
+
+- Comprehensive Cloud Ecosystem Support
+
+ Deep integration with Alibaba Cloud products, including streaming and
batch compute engines, delivering out-of-the-box functionality and enhanced
user experience.
+
+Doris supports integration with DLF Iceberg Rest Catalog starting from version
4.1.0, enabling seamless connection to DLF for accessing and analyzing Iceberg
table data. This article demonstrates how to connect Apache Doris with DLF and
access Iceberg table data.
+
+:::tip
+This feature is supported starting from Doris version 4.1.0.
+:::
+
+## Usage Guide
+
+### 01 Enable DLF Service
+
+Please refer to the DLF official documentation to enable the DLF service and
create the corresponding Catalog, Database, and Table.
+
+### 02 Access DLF Using EMR Spark SQL
+
+- Connect
+
+ ```shell
+ spark-sql --master yarn \
+ --conf
spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
\
+ --conf spark.sql.catalog.iceberg=org.apache.iceberg.spark.SparkCatalog
\
+ --conf
spark.sql.catalog.iceberg.catalog-impl=org.apache.iceberg.rest.RESTCatalog \
+ --conf
spark.sql.catalog.iceberg.uri=http://<region>-vpc.dlf.aliyuncs.com/iceberg \
+ --conf spark.sql.catalog.iceberg.warehouse=<your-catalog-name> \
+ --conf spark.sql.catalog.iceberg.credential=<ak>:<sk>
+ ```
+
+ > Replace the corresponding `<region>`, `warehouse`, `<ak>`, and `<sk>`.
+
+- Write Data
+
+ ```sql
+ USE iceberg.<your-catalog-name>;
+
+ CREATE TABLE users_samples
+ (
+ user_id INT,
+ age_level STRING,
+ final_gender_code STRING,
+ clk BOOLEAN
+ ) USING iceberg;
+
+ INSERT INTO users_samples VALUES
+ (1, '25-34', 'M', true),
+ (2, '18-24', 'F', false);
+
+ INSERT INTO users_samples VALUES
+ (3, '25-34', 'M', true),
+ (4, '18-24', 'F', false);
+
+ INSERT INTO users_samples VALUES
+ (5, '25-34', 'M', true),
+ (6, '18-24', 'F', false);
+ ```
+
+### 03 Connect to DLF Using Doris
+
+- Create Iceberg Catalog
+
+ ```sql
+ CREATE CATALOG ice PROPERTIES (
+ 'type' = 'iceberg',
+ 'iceberg.catalog.type' = 'rest',
+ 'iceberg.rest.uri' = 'http://<region>-vpc.dlf.aliyuncs.com/iceberg',
+ 'warehouse' = '<your-catalog-name>',
+ 'iceberg.rest.sigv4-enabled' = 'true',
+ 'iceberg.rest.signing-name' = 'DlfNext',
+ 'iceberg.rest.access-key-id' = '<ak>',
+ 'iceberg.rest.secret-access-key' = '<sk>',
+ 'iceberg.rest.signing-region' = '<region>',
+ 'iceberg.rest.vended-credentials-enabled' = 'true',
+ 'io-impl' = 'org.apache.iceberg.rest.DlfFileIO',
+ 'fs.oss.support' = 'true'
+ );
+ ```
+
+ - Doris uses the temporary credentials returned by DLF to access OSS
object storage, so no additional OSS credentials are required.
+ - DLF can only be accessed within the same VPC. Ensure you provide the
correct URI address.
+ - The DLF Iceberg REST catalog requires SigV4 signing to be enabled, with the DLF-specific signing name `DlfNext`.
+
+- Query Data
+
+ ```sql
+ SELECT * FROM users_samples ORDER BY user_id;
+ +---------+-----------+-------------------+------+
+ | user_id | age_level | final_gender_code | clk |
+ +---------+-----------+-------------------+------+
+ | 1 | 25-34 | M | 1 |
+ | 2 | 18-24 | F | 0 |
+ | 3 | 25-34 | M | 1 |
+ | 4 | 18-24 | F | 0 |
+ | 5 | 25-34 | M | 1 |
+ | 6 | 18-24 | F | 0 |
+ +---------+-----------+-------------------+------+
+ ```
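The three separate `INSERT` statements above each commit a new Iceberg snapshot, so the table's history can also be inspected and time-traveled from Doris. A hedged sketch (the `<db>` placeholder and the snapshot id are hypothetical; list the real snapshot ids first):

```sql
-- List the table's snapshots via the iceberg_meta table function
SELECT snapshot_id, committed_at
FROM iceberg_meta(
    "table" = "ice.<db>.users_samples",
    "query_type" = "snapshots"
);

-- Read the table as of one of the listed snapshots (the id below is made up)
SELECT * FROM users_samples FOR VERSION AS OF 1234567890123456789;
```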
diff --git a/versioned_docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
b/versioned_docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
index c21e0355b12..6c1f2d11acc 100644
--- a/versioned_docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/versioned_docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
@@ -604,6 +604,29 @@ Support for Nested Namespace needs to be explicitly
enabled. For details, please
</details>
### Aliyun DLF
+<details>
+ <summary>4.1+ Version</summary>
+ <Tabs>
+ <TabItem value='DLF 2.5+' label='DLF 2.5+' default>
+ ```sql
+ CREATE CATALOG iceberg_dlf2_catalog PROPERTIES (
+ 'type' = 'iceberg',
+ 'iceberg.catalog.type' = 'rest',
+ 'iceberg.rest.uri' =
'http://<region>-vpc.dlf.aliyuncs.com/iceberg',
+ 'warehouse' = '<your-catalog-name>',
+ 'iceberg.rest.sigv4-enabled' = 'true',
+ 'iceberg.rest.signing-name' = 'DlfNext',
+ 'iceberg.rest.access-key-id' = '<ak>',
+ 'iceberg.rest.secret-access-key' = '<sk>',
+ 'iceberg.rest.signing-region' = '<region>',
+ 'iceberg.rest.vended-credentials-enabled' = 'true',
+ 'io-impl' = 'org.apache.iceberg.rest.DlfFileIO',
+ 'fs.oss.support' = 'true'
+ );
+ ```
+ </TabItem>
+ </Tabs>
+</details>
<details>
<summary>3.1+ Version</summary>
<Tabs>
@@ -2140,6 +2163,21 @@ Supported schema change operations include:
Use the `MODIFY COLUMN` statement to modify column attributes, including
type, nullable, default value, comment, and column position.
+ Since version 4.0.4, Doris supports modifying complex types (STRUCT,
ARRAY, MAP), including safe type promotions and appending struct fields.
+
+ Safe type promotions supported in nested types:
+ - INT -> BIGINT, LARGEINT
+ - TINYINT -> SMALLINT, INT, BIGINT, LARGEINT
+ - SMALLINT -> INT, BIGINT, LARGEINT
+ - BIGINT -> LARGEINT
+ - FLOAT -> DOUBLE
+ - VARCHAR(n) -> VARCHAR(m) where m > n
+
+ Constraints for complex types:
+ - All new nested fields must be nullable.
+ - Cannot change optional to required.
+ - Default values for complex types only support NULL.
+
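As an illustration of these constraints, consider a hypothetical nested field of type BIGINT (the table and column names here are made up, not taken from the examples below): widening is accepted, while narrowing is rejected.

```sql
-- Given a column: info STRUCT<id:BIGINT>
-- Widening BIGINT -> LARGEINT is a safe promotion and is accepted:
ALTER TABLE demo_tbl MODIFY COLUMN info STRUCT<id:LARGEINT>;

-- Narrowing (e.g. BIGINT -> INT) is not a safe promotion and is rejected:
-- ALTER TABLE demo_tbl MODIFY COLUMN info STRUCT<id:INT>;
```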
Note: When modifying column attributes, all attributes that are not being
modified should also be explicitly specified with their original values.
```sql
@@ -2157,6 +2195,23 @@ Supported schema change operations include:
ALTER TABLE iceberg_table MODIFY COLUMN id BIGINT NOT NULL DEFAULT 0 COMMENT
'This is a modified id column';
```
+ Example of modifying complex types:
+
+ ```sql
+ -- Create Iceberg table with complex types
+ CREATE TABLE iceberg_tbl (
+ id BIGINT,
+ user_info STRUCT<name:STRING, scores:ARRAY<INT>, age:INT>,
+ dt STRING
+ );
+
+ -- Append a new field (email) to the STRUCT column
+ ALTER TABLE iceberg_tbl MODIFY COLUMN user_info STRUCT<name:STRING, scores:ARRAY<INT>, age:INT, email:STRING>;
+
+ -- Promote the nested ARRAY element type from INT to BIGINT
+ ALTER TABLE iceberg_tbl MODIFY COLUMN user_info STRUCT<name:STRING, scores:ARRAY<BIGINT>, age:INT, email:STRING>;
+ ```
+
* **Reorder Columns**
Use `ORDER BY` to reorder columns by specifying the new column order.
diff --git a/versioned_docs/version-4.x/lakehouse/storages/huawei-obs.md
b/versioned_docs/version-4.x/lakehouse/storages/huawei-obs.md
index 4574ffee15d..e9d2842422c 100644
--- a/versioned_docs/version-4.x/lakehouse/storages/huawei-obs.md
+++ b/versioned_docs/version-4.x/lakehouse/storages/huawei-obs.md
@@ -14,8 +14,11 @@ This document describes the parameters required to access
Huawei Cloud OBS, whic
- Export properties
- Outfile properties
-**Doris uses S3 Client to access Huawei Cloud OBS through S3-compatible
protocol.**
+**Doris supports accessing Huawei Cloud OBS through S3-compatible protocol
(using S3 Client) or OBS native protocol (using native SDK).**
+:::info
+Starting from versions 3.0.5 and 4.1.0, Doris natively integrates the Huawei
Cloud OBS SDK. Users can access OBS data (such as Paimon Catalog) directly with
the `obs://` prefix, enabling better support for Huawei Cloud's Parallel File
System (PFS).
+:::
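A minimal sketch of the native-protocol path (property names follow the OBS parameter set described below; the bucket, endpoint, and credentials are placeholders):

```sql
CREATE CATALOG paimon_obs PROPERTIES (
    'type' = 'paimon',
    'warehouse' = 'obs://<bucket>/paimon-warehouse',
    'obs.endpoint' = 'obs.<region>.myhuaweicloud.com',
    'obs.access_key' = '<ak>',
    'obs.secret_key' = '<sk>'
);
```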
## Parameter Overview
| Property Name | Former Name | Description
| Default Value | Required |
diff --git a/versioned_sidebars/version-3.x-sidebars.json
b/versioned_sidebars/version-3.x-sidebars.json
index 15c58ec24f5..d381677d1bb 100644
--- a/versioned_sidebars/version-3.x-sidebars.json
+++ b/versioned_sidebars/version-3.x-sidebars.json
@@ -414,7 +414,8 @@
"lakehouse/best-practices/doris-onelake",
"lakehouse/best-practices/doris-unity-catalog",
"lakehouse/best-practices/doris-lakekeeper",
- "lakehouse/best-practices/doris-nessie"
+ "lakehouse/best-practices/doris-nessie",
+ "lakehouse/best-practices/doris-dlf-iceberg"
]
},
{
diff --git a/versioned_sidebars/version-4.x-sidebars.json
b/versioned_sidebars/version-4.x-sidebars.json
index 7f0db07a843..73619501c95 100644
--- a/versioned_sidebars/version-4.x-sidebars.json
+++ b/versioned_sidebars/version-4.x-sidebars.json
@@ -366,7 +366,8 @@
"lakehouse/best-practices/doris-onelake",
"lakehouse/best-practices/doris-unity-catalog",
"lakehouse/best-practices/doris-lakekeeper",
- "lakehouse/best-practices/doris-nessie"
+ "lakehouse/best-practices/doris-nessie",
+ "lakehouse/best-practices/doris-dlf-iceberg"
]
},
{