(doris-website) branch master updated: [doc] add SeaweedFS integration doc (#3607)

morningman Thu, 14 May 2026 16:03:29 -0700

This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git



The following commit(s) were added to refs/heads/master by this push:
     new 59102c55ca9 [doc] add SeaweedFS integration doc (#3607)
59102c55ca9 is described below

commit 59102c55ca9c96e2c50cbf96c43e2bb969b3c56c
Author: Chris Lu <[email protected]>
AuthorDate: Thu May 14 16:03:18 2026 -0700

    [doc] add SeaweedFS integration doc (#3607)
    
    ## Versions
    
    - [x] dev
    - [x] 4.x
    - [x] 3.x
    - [ ] 2.1
    
    ## Languages
    
    - [x] Chinese
    - [x] English
    
    ## Summary
    
    Adds an Iceberg lakehouse integration page for
    [SeaweedFS](https://github.com/seaweedfs/seaweedfs), which exposes both
    an S3 object endpoint and an Apache Iceberg REST Catalog from the same
    `weed` process. The doc walks through:
    
    1. Starting `weed mini` with a single IAM config and a pre-created S3
    Tables bucket.
    2. Registering the catalog in Doris with `iceberg.catalog.type =
    "rest"`, where the OAuth2 client credentials and the S3 keys are the
    same access pair.
    3. Reading and writing an Iceberg table.
    
    The same end-to-end path is exercised in CI by the
    `TestDorisIcebergCatalog` integration test in the SeaweedFS repo
    (`test/s3tables/catalog_doris/`), which boots SeaweedFS, registers a
    Doris Iceberg catalog against it, writes rows via PyIceberg, and reads
    them back from `apache/doris:doris-all-in-one-2.1.0`. The Doris catalog
    properties in the doc are the ones the test uses verbatim.
    
    ## Files
    
    - 4 English pages: `docs/`, `docs-next/`, `versioned_docs/version-3.x`,
    `versioned_docs/version-4.x`
    - 4 Chinese pages: corresponding `i18n/zh-CN/...` paths
    - 4 sidebars: `sidebars.ts`, `sidebars-next.ts`,
    `versioned_sidebars/version-{3,4}.x-sidebars.json` — `doris-seaweedfs`
    slotted under the Iceberg Catalog category, after `doris-lakekeeper`.
    
    I'm the maintainer of SeaweedFS, so I can keep this page in sync with
    future changes upstream.
---
 .../lakehouse/best-practices/doris-seaweedfs.md    | 170 ++++++++++++++++++++
 .../lakehouse/best-practices/doris-seaweedfs.md    | 171 +++++++++++++++++++++
 .../lakehouse/best-practices/doris-seaweedfs.md    | 171 +++++++++++++++++++++
 .../lakehouse/best-practices/doris-seaweedfs.md    | 171 +++++++++++++++++++++
 sidebars-next.ts                                   |   1 +
 .../lakehouse/best-practices/doris-seaweedfs.md    | 170 ++++++++++++++++++++
 .../lakehouse/best-practices/doris-seaweedfs.md    | 170 ++++++++++++++++++++
 versioned_sidebars/version-3.x-sidebars.json       |   1 +
 versioned_sidebars/version-4.x-sidebars.json       |   1 +
 9 files changed, 1026 insertions(+)

diff --git a/docs-next/lakehouse/best-practices/doris-seaweedfs.md 
b/docs-next/lakehouse/best-practices/doris-seaweedfs.md
new file mode 100644
index 00000000000..89ca2995e04
--- /dev/null
+++ b/docs-next/lakehouse/best-practices/doris-seaweedfs.md
@@ -0,0 +1,170 @@
+---
+{
+    "title": "Integration with SeaweedFS",
+    "language": "en"
+}
+---
+
+[SeaweedFS](https://seaweedfs.com/) is a distributed storage system that 
exposes both an S3-compatible object API and an Apache Iceberg REST Catalog 
from the same `weed` process. Parquet data and Iceberg metadata are served by 
one executable, authenticated by one S3 credential pair.
+
+This page shows the minimal configuration for using SeaweedFS as the storage and catalog backend of an Iceberg lakehouse queried by Doris. The same end-to-end path is exercised by the [`TestDorisIcebergCatalog`](https://github.com/seaweedfs/seaweedfs/tree/master/test/s3tables/catalog_doris) integration test in the SeaweedFS repository, which boots a SeaweedFS mini cluster, registers a Doris Iceberg catalog against it, writes rows with PyIceberg, and reads them back from `apache/doris:doris-all-in-one-2.1.0`.
+
+## Why SeaweedFS for an Iceberg lakehouse
+
+A typical lakehouse stack today stitches together three layers:
+
+* Object storage (S3 or compatible)
+* A standalone Iceberg catalog (Hive Metastore, Glue, Polaris, Lakekeeper, 
Nessie, ...)
+* A query engine (Doris, Spark, Trino, ...)
+
+SeaweedFS collapses the first two into one process. The same `weed` executable 
is both:
+
+* the S3-compatible object store that holds the parquet files, and
+* the Iceberg REST Catalog that holds the table metadata.
+
+So Doris talks to one system instead of two. The practical implications:
+
+* **Fewer moving parts.** No Hive Metastore, no Glue, no Postgres backing a 
separate catalog, no STS role to provision.
+* **Simpler deployment.** One executable, one IAM config, one S3 credential 
pair shared by Doris's Iceberg REST client and its S3 reader.
+* **Local or on-prem friendly.** Nothing in the path requires a cloud-native 
service. The same setup runs on a laptop, a single VM, or a Kubernetes cluster.
+* **Lower latency on the metadata path.** Catalog state lives in the same 
SeaweedFS filer that serves the data, so namespace and table lookups don't 
cross a separate service boundary.
+* **S3-native on disk.** Tables are stored as standard Iceberg directories in 
S3 buckets. Any S3 client (rclone, `aws s3`, Spark, Trino, Dremio, RisingWave) 
can read or replicate them alongside Doris.
+
+Architecturally:
+
+```text
+Doris
+  |
+  v
+Iceberg tables
+  |
+  v
+SeaweedFS  (S3 storage + REST catalog)
+```
+
+For smaller teams or internal platforms, this is a clean way to build a 
lakehouse without depending on a separate metastore service.
+
+## 1. Start SeaweedFS
+
+Build or install `weed` from 
[github.com/seaweedfs/seaweedfs](https://github.com/seaweedfs/seaweedfs).
+
+Create an IAM config that grants an access key full S3 access. The same key is 
also used as the OAuth2 client for the Iceberg REST endpoint:
+
+```json
+{
+  "identities": [
+    {
+      "name": "doris",
+      "credentials": [
+        {
+          "accessKey": "AKIAIOSFODNN7EXAMPLE",
+          "secretKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
+        }
+      ],
+      "actions": ["Admin"]
+    }
+  ]
+}
+```
+
+Start a single-process cluster with the Iceberg REST endpoint and a 
pre-created table bucket:
+
+```bash
+weed mini \
+  -ip $(hostname -I | awk '{print $1}') \
+  -dir /var/lib/seaweedfs \
+  -s3.config /etc/seaweedfs/iam_config.json \
+  -tableBucket iceberg-tables
+```
+
+`weed mini` runs master, volume, filer, S3, and the Iceberg REST catalog in 
one process. Default ports:
+
+| Component | Port | Override flag |
+| --------- | ---- | ------------- |
+| Master HTTP | 9333 | `-master.port` |
+| Filer HTTP | 8888 | `-filer.port` |
+| S3 | 8333 | `-s3.port` |
+| Iceberg REST | 8181 | `-s3.port.iceberg` |
+
+`-tableBucket iceberg-tables` creates the S3 Tables bucket on startup, which 
is the Iceberg-aware bucket type Doris will write into.
+
+To verify the catalog is reachable:
+
+```bash
+curl -s http://SEAWEED_HOST:8181/v1/config | jq .
+```
+
+## 2. Register the Iceberg catalog in Doris
+
+```sql
+CREATE CATALOG seaweedfs PROPERTIES (
+    "type" = "iceberg",
+    "iceberg.catalog.type" = "rest",
+    "uri" = "http://SEAWEED_HOST:8181",
+    "warehouse" = "s3://iceberg-tables",
+    "credential" = "AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
+    "s3.endpoint" = "http://SEAWEED_HOST:8333",
+    "s3.access_key" = "AKIAIOSFODNN7EXAMPLE",
+    "s3.secret_key" = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
+    "s3.region" = "us-west-2",
+    "use_path_style" = "true"
+);
+```
+
+Notes:
+
+* `credential = "<access_key>:<secret_key>"` is forwarded by Doris's Iceberg 
REST client as OAuth2 client credentials. SeaweedFS validates them against the 
same IAM config that secures the S3 endpoint.
+* The `s3.*` properties are used by Doris's own parquet reader and writer. 
They point at the same `weed` process — same host, same key pair.
+* `use_path_style = "true"` is required because SeaweedFS serves S3 in 
path-style by default.
+* The integration test uses these exact properties; see 
[`createDorisIcebergCatalog`](https://github.com/seaweedfs/seaweedfs/blob/master/test/s3tables/catalog_doris/doris_catalog_test.go)
 for the canonical form.
+
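+To confirm the registration from the Doris side, a few read-only statements are usually enough. This is a minimal sketch that assumes the catalog name `seaweedfs` from the statement above; the output depends on your Doris version and is not shown:
+
+```sql
+-- The new catalog should be listed next to the built-in `internal` catalog.
+SHOW CATALOGS;
+
+-- Print the stored definition to double-check the endpoint properties.
+SHOW CREATE CATALOG seaweedfs;
+
+-- Switch into the catalog and list the Iceberg namespaces it exposes.
+SWITCH seaweedfs;
+SHOW DATABASES;
+```
+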
+If you create namespaces or tables outside Doris (for example with PyIceberg) 
before the catalog is registered, refresh the metadata cache:
+
+```sql
+REFRESH CATALOG seaweedfs;
+```
+
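+When only one namespace or table changed outside Doris, a narrower refresh may be enough. A sketch using Doris's `REFRESH DATABASE` and `REFRESH TABLE` statements; the `demo` database and `iceberg_smoke` table are the names used in the next section and are assumed to exist already:
+
+```sql
+-- Refresh the cached metadata of a single namespace.
+REFRESH DATABASE seaweedfs.demo;
+
+-- Refresh the cached metadata of a single table.
+REFRESH TABLE seaweedfs.demo.iceberg_smoke;
+```
+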
+## 3. Use the catalog
+
+```sql
+USE seaweedfs;
+
+CREATE DATABASE IF NOT EXISTS demo;
+
+USE seaweedfs.demo;
+
+CREATE TABLE iceberg_smoke (
+  id BIGINT,
+  label STRING
+);
+
+INSERT INTO iceberg_smoke VALUES (1, 'one'), (2, 'two'), (3, 'three');
+
+SELECT id, label FROM iceberg_smoke ORDER BY id;
+```
+
+Expected output:
+
+```text
++----+-------+
+| id | label |
++----+-------+
+|  1 | one   |
+|  2 | two   |
+|  3 | three |
++----+-------+
+```
+
+This is the same path the SeaweedFS integration test exercises: namespace and 
table created through the Iceberg REST catalog, rows appended via PyIceberg, 
and reads served by Doris through the standard S3 plus Iceberg metadata flow.
+
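+The table can also be queried without switching catalogs by using a fully qualified `catalog.database.table` name. A minimal sketch, assuming the objects created above:
+
+```sql
+-- Return to Doris's built-in catalog.
+SWITCH internal;
+
+-- A three-part name reaches the Iceberg table while another catalog is current.
+SELECT COUNT(*) AS row_count FROM seaweedfs.demo.iceberg_smoke;
+```
+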
+## Production notes
+
+* For a production cluster, replace `weed mini` with `weed master`, `weed 
volume`, `weed filer`, and `weed s3 -iceberg.port=8181` (or use the SeaweedFS 
Helm chart). The Doris-side configuration is identical — only the host and 
ports change.
+* The OAuth2 credential is the S3 access key. To rotate Doris's catalog 
access, rotate the IAM identity that holds it, the same way you rotate any S3 
user.
+* Iceberg table maintenance (compaction, snapshot expiration, orphan removal, 
manifest rewriting) is built into SeaweedFS and runs against the same bucket. 
See the [SeaweedFS Iceberg Catalog 
wiki](https://github.com/seaweedfs/seaweedfs/wiki/SeaweedFS-Iceberg-Catalog) 
for details.
+
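+After rotating the key pair on the SeaweedFS side, the copies stored in the Doris catalog definition need the same update. One possible approach is Doris's `ALTER CATALOG ... SET PROPERTIES` statement; `NEW_ACCESS_KEY` and `NEW_SECRET_KEY` are placeholders, and whether the follow-up `REFRESH CATALOG` is needed is an assumption to verify on your Doris version:
+
+```sql
+-- Update both the REST credential and the S3 key pair in one statement.
+ALTER CATALOG seaweedfs SET PROPERTIES (
+    "credential" = "NEW_ACCESS_KEY:NEW_SECRET_KEY",
+    "s3.access_key" = "NEW_ACCESS_KEY",
+    "s3.secret_key" = "NEW_SECRET_KEY"
+);
+
+-- Assumption: refreshing afterwards discards metadata cached under the old keys.
+REFRESH CATALOG seaweedfs;
+```
+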
+## References
+
+* [SeaweedFS](https://github.com/seaweedfs/seaweedfs)
+* [Doris Iceberg integration test in 
SeaweedFS](https://github.com/seaweedfs/seaweedfs/tree/master/test/s3tables/catalog_doris)
+* [Doris Iceberg Catalog 
reference](https://doris.apache.org/docs/lakehouse/catalogs/iceberg-catalog)
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs-next/current/lakehouse/best-practices/doris-seaweedfs.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs-next/current/lakehouse/best-practices/doris-seaweedfs.md
new file mode 100644
index 00000000000..cfa87e50c79
--- /dev/null
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs-next/current/lakehouse/best-practices/doris-seaweedfs.md
@@ -0,0 +1,171 @@
+---
+{
+    "title": "集成 SeaweedFS",
+    "language": "zh-CN",
+    "description": "使用 SeaweedFS 同时承载 Iceberg 表的对象存储和 REST 
Catalog，凭证、部署、运维三位一体。"
+}
+---
+
+[SeaweedFS](https://seaweedfs.com/) 是一个分布式存储系统，单个 `weed` 进程即可同时提供 S3 
兼容的对象存储接口和 Apache Iceberg REST Catalog。Parquet 数据和 Iceberg 
元数据由同一个执行文件对外服务，并使用同一对 S3 凭证完成鉴权。
+
+本文介绍将 SeaweedFS 作为 Doris 的 Iceberg Lakehouse 后端的最小配置。完整的端到端路径已经在 SeaweedFS 仓库的 
[`TestDorisIcebergCatalog`](https://github.com/seaweedfs/seaweedfs/tree/master/test/s3tables/catalog_doris)
 集成测试中验证：测试会启动 SeaweedFS mini 集群，在 Doris 中注册 Iceberg Catalog，通过 PyIceberg 
写入数据，再由 `apache/doris:doris-all-in-one-2.1.0` 容器读回。
+
+## 为什么用 SeaweedFS 搭 Iceberg Lakehouse
+
+当下的 Lakehouse 架构通常需要把三层系统拼起来：
+
+* 对象存储（S3 或兼容实现）
+* 独立的 Iceberg Catalog（Hive Metastore、Glue、Polaris、Lakekeeper、Nessie 等）
+* 查询引擎（Doris、Spark、Trino 等）
+
+SeaweedFS 把前两层合并到了同一个进程里。同一个 `weed` 执行文件既是：
+
+* 存放 parquet 文件的 S3 兼容对象存储，
+* 也是存放表元数据的 Iceberg REST Catalog。
+
+也就是说，Doris 只需要对接一个系统，而不是两个。具体好处：
+
+* **更少的组件。** 不再需要 Hive Metastore、Glue，不需要为 Catalog 单独部署 Postgres，也不需要单独维护 STS 
角色。
+* **更简单的部署。** 一个执行文件、一份 IAM 配置；Doris 的 Iceberg REST 客户端和 S3 读写器共用同一对 S3 凭证。
+* **适合本地与私有化场景。** 整个链路不依赖任何云服务，从笔记本、单台 VM 到 Kubernetes 集群，部署方式一致。
+* **元数据路径更低延时。** Catalog 状态保存在同一个 SeaweedFS filer 中，与数据为邻；命名空间和表元数据查询不再跨独立服务。
+* **磁盘上是标准 S3。** 表以标准 Iceberg 目录结构存放在 S3 桶中，任何 S3 客户端（rclone、`aws 
s3`、Spark、Trino、Dremio、RisingWave）都可以与 Doris 一同读取或复制。
+
+架构上：
+
+```text
+Doris
+  |
+  v
+Iceberg 表
+  |
+  v
+SeaweedFS  (S3 存储 + REST Catalog)
+```
+
+对于小团队和内部数据平台来说，这是一种不依赖独立 Catalog 服务、就能搭起 Lakehouse 的干净方式。
+
+## 1. 启动 SeaweedFS
+
+在 [github.com/seaweedfs/seaweedfs](https://github.com/seaweedfs/seaweedfs) 
编译或安装 `weed`。
+
+准备一份 IAM 配置，给一个访问密钥授予 S3 权限。同一个密钥也作为 Iceberg REST 端点的 OAuth2 客户端：
+
+```json
+{
+  "identities": [
+    {
+      "name": "doris",
+      "credentials": [
+        {
+          "accessKey": "AKIAIOSFODNN7EXAMPLE",
+          "secretKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
+        }
+      ],
+      "actions": ["Admin"]
+    }
+  ]
+}
+```
+
+启动单进程集群，并在启动时创建用于 Iceberg 的 Table Bucket：
+
+```bash
+weed mini \
+  -ip $(hostname -I | awk '{print $1}') \
+  -dir /var/lib/seaweedfs \
+  -s3.config /etc/seaweedfs/iam_config.json \
+  -tableBucket iceberg-tables
+```
+
+`weed mini` 会在一个进程内同时启动 master、volume、filer、S3 和 Iceberg REST Catalog。默认端口：
+
+| 组件 | 端口 | 修改参数 |
+| ---- | ---- | -------- |
+| Master HTTP | 9333 | `-master.port` |
+| Filer HTTP | 8888 | `-filer.port` |
+| S3 | 8333 | `-s3.port` |
+| Iceberg REST | 8181 | `-s3.port.iceberg` |
+
+`-tableBucket iceberg-tables` 会在启动时创建一个 S3 Tables 类型的 Bucket，也就是 Doris 后续写入 
Iceberg 表所用的 Bucket。
+
+验证 Catalog 端点可用：
+
+```bash
+curl -s http://SEAWEED_HOST:8181/v1/config | jq .
+```
+
+## 2. 在 Doris 中注册 Iceberg Catalog
+
+```sql
+CREATE CATALOG seaweedfs PROPERTIES (
+    "type" = "iceberg",
+    "iceberg.catalog.type" = "rest",
+    "uri" = "http://SEAWEED_HOST:8181",
+    "warehouse" = "s3://iceberg-tables",
+    "credential" = "AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
+    "s3.endpoint" = "http://SEAWEED_HOST:8333",
+    "s3.access_key" = "AKIAIOSFODNN7EXAMPLE",
+    "s3.secret_key" = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
+    "s3.region" = "us-west-2",
+    "use_path_style" = "true"
+);
+```
+
+说明：
+
+* `credential = "<access_key>:<secret_key>"` 会被 Doris 的 Iceberg REST 客户端作为 
OAuth2 client credentials 发起鉴权。SeaweedFS 用同一份 IAM 配置校验。
+* `s3.*` 系列属性给 Doris 本地的 parquet 读写器使用，指向同一个 `weed` 进程，主机和密钥都和上面一致。
+* `use_path_style = "true"` 是必需的，SeaweedFS 默认采用 path-style 的 S3 协议。
+* 集成测试使用的就是上述属性，可参考 
[`createDorisIcebergCatalog`](https://github.com/seaweedfs/seaweedfs/blob/master/test/s3tables/catalog_doris/doris_catalog_test.go)。
+
+如果在注册 Catalog 前已经通过其他客户端（例如 PyIceberg）创建了 Namespace 或表，需要刷新元数据缓存：
+
+```sql
+REFRESH CATALOG seaweedfs;
+```
+
+## 3. 使用 Catalog
+
+```sql
+USE seaweedfs;
+
+CREATE DATABASE IF NOT EXISTS demo;
+
+USE seaweedfs.demo;
+
+CREATE TABLE iceberg_smoke (
+  id BIGINT,
+  label STRING
+);
+
+INSERT INTO iceberg_smoke VALUES (1, 'one'), (2, 'two'), (3, 'three');
+
+SELECT id, label FROM iceberg_smoke ORDER BY id;
+```
+
+预期结果：
+
+```text
++----+-------+
+| id | label |
++----+-------+
+|  1 | one   |
+|  2 | two   |
+|  3 | three |
++----+-------+
+```
+
+这正是 SeaweedFS 集成测试覆盖的路径：通过 Iceberg REST Catalog 创建 Namespace 和表，由 PyIceberg 
追加数据，再由 Doris 通过 S3 加 Iceberg 元数据走标准链路读回。
+
+## 生产部署建议
+
+* 生产环境可以把 `weed mini` 拆成 `weed master`、`weed volume`、`weed filer`，再加 `weed s3 
-iceberg.port=8181`，也可以使用 SeaweedFS Helm Chart。Doris 这边的配置完全不用改，只需替换主机和端口。
+* OAuth2 credential 就是 S3 访问密钥，需要轮换 Doris 的 Catalog 凭证时，按普通 S3 用户的方式轮换 IAM 
身份即可。
+* Iceberg 表的运维任务（Compaction、Snapshot Expiration、Orphan Removal、Manifest 
Rewriting）由 SeaweedFS 内置实现，针对同一个 Bucket 运行，详见 [SeaweedFS Iceberg Catalog 
Wiki](https://github.com/seaweedfs/seaweedfs/wiki/SeaweedFS-Iceberg-Catalog)。
+
+## 相关链接
+
+* [SeaweedFS](https://github.com/seaweedfs/seaweedfs)
+* [SeaweedFS 中的 Doris Iceberg 
集成测试](https://github.com/seaweedfs/seaweedfs/tree/master/test/s3tables/catalog_doris)
+* [Doris Iceberg Catalog 
文档](https://doris.apache.org/zh-CN/docs/lakehouse/catalogs/iceberg-catalog)
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/best-practices/doris-seaweedfs.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/best-practices/doris-seaweedfs.md
new file mode 100644
index 00000000000..cfa87e50c79
--- /dev/null
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/best-practices/doris-seaweedfs.md
@@ -0,0 +1,171 @@
+---
+{
+    "title": "集成 SeaweedFS",
+    "language": "zh-CN",
+    "description": "使用 SeaweedFS 同时承载 Iceberg 表的对象存储和 REST 
Catalog，凭证、部署、运维三位一体。"
+}
+---
+
+[SeaweedFS](https://seaweedfs.com/) 是一个分布式存储系统，单个 `weed` 进程即可同时提供 S3 
兼容的对象存储接口和 Apache Iceberg REST Catalog。Parquet 数据和 Iceberg 
元数据由同一个执行文件对外服务，并使用同一对 S3 凭证完成鉴权。
+
+本文介绍将 SeaweedFS 作为 Doris 的 Iceberg Lakehouse 后端的最小配置。完整的端到端路径已经在 SeaweedFS 仓库的 
[`TestDorisIcebergCatalog`](https://github.com/seaweedfs/seaweedfs/tree/master/test/s3tables/catalog_doris)
 集成测试中验证：测试会启动 SeaweedFS mini 集群，在 Doris 中注册 Iceberg Catalog，通过 PyIceberg 
写入数据，再由 `apache/doris:doris-all-in-one-2.1.0` 容器读回。
+
+## 为什么用 SeaweedFS 搭 Iceberg Lakehouse
+
+当下的 Lakehouse 架构通常需要把三层系统拼起来：
+
+* 对象存储（S3 或兼容实现）
+* 独立的 Iceberg Catalog（Hive Metastore、Glue、Polaris、Lakekeeper、Nessie 等）
+* 查询引擎（Doris、Spark、Trino 等）
+
+SeaweedFS 把前两层合并到了同一个进程里。同一个 `weed` 执行文件既是：
+
+* 存放 parquet 文件的 S3 兼容对象存储，
+* 也是存放表元数据的 Iceberg REST Catalog。
+
+也就是说，Doris 只需要对接一个系统，而不是两个。具体好处：
+
+* **更少的组件。** 不再需要 Hive Metastore、Glue，不需要为 Catalog 单独部署 Postgres，也不需要单独维护 STS 
角色。
+* **更简单的部署。** 一个执行文件、一份 IAM 配置；Doris 的 Iceberg REST 客户端和 S3 读写器共用同一对 S3 凭证。
+* **适合本地与私有化场景。** 整个链路不依赖任何云服务，从笔记本、单台 VM 到 Kubernetes 集群，部署方式一致。
+* **元数据路径更低延时。** Catalog 状态保存在同一个 SeaweedFS filer 中，与数据为邻；命名空间和表元数据查询不再跨独立服务。
+* **磁盘上是标准 S3。** 表以标准 Iceberg 目录结构存放在 S3 桶中，任何 S3 客户端（rclone、`aws 
s3`、Spark、Trino、Dremio、RisingWave）都可以与 Doris 一同读取或复制。
+
+架构上：
+
+```text
+Doris
+  |
+  v
+Iceberg 表
+  |
+  v
+SeaweedFS  (S3 存储 + REST Catalog)
+```
+
+对于小团队和内部数据平台来说，这是一种不依赖独立 Catalog 服务、就能搭起 Lakehouse 的干净方式。
+
+## 1. 启动 SeaweedFS
+
+在 [github.com/seaweedfs/seaweedfs](https://github.com/seaweedfs/seaweedfs) 
编译或安装 `weed`。
+
+准备一份 IAM 配置，给一个访问密钥授予 S3 权限。同一个密钥也作为 Iceberg REST 端点的 OAuth2 客户端：
+
+```json
+{
+  "identities": [
+    {
+      "name": "doris",
+      "credentials": [
+        {
+          "accessKey": "AKIAIOSFODNN7EXAMPLE",
+          "secretKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
+        }
+      ],
+      "actions": ["Admin"]
+    }
+  ]
+}
+```
+
+启动单进程集群，并在启动时创建用于 Iceberg 的 Table Bucket：
+
+```bash
+weed mini \
+  -ip $(hostname -I | awk '{print $1}') \
+  -dir /var/lib/seaweedfs \
+  -s3.config /etc/seaweedfs/iam_config.json \
+  -tableBucket iceberg-tables
+```
+
+`weed mini` 会在一个进程内同时启动 master、volume、filer、S3 和 Iceberg REST Catalog。默认端口：
+
+| 组件 | 端口 | 修改参数 |
+| ---- | ---- | -------- |
+| Master HTTP | 9333 | `-master.port` |
+| Filer HTTP | 8888 | `-filer.port` |
+| S3 | 8333 | `-s3.port` |
+| Iceberg REST | 8181 | `-s3.port.iceberg` |
+
+`-tableBucket iceberg-tables` 会在启动时创建一个 S3 Tables 类型的 Bucket，也就是 Doris 后续写入 
Iceberg 表所用的 Bucket。
+
+验证 Catalog 端点可用：
+
+```bash
+curl -s http://SEAWEED_HOST:8181/v1/config | jq .
+```
+
+## 2. 在 Doris 中注册 Iceberg Catalog
+
+```sql
+CREATE CATALOG seaweedfs PROPERTIES (
+    "type" = "iceberg",
+    "iceberg.catalog.type" = "rest",
+    "uri" = "http://SEAWEED_HOST:8181",
+    "warehouse" = "s3://iceberg-tables",
+    "credential" = "AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
+    "s3.endpoint" = "http://SEAWEED_HOST:8333",
+    "s3.access_key" = "AKIAIOSFODNN7EXAMPLE",
+    "s3.secret_key" = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
+    "s3.region" = "us-west-2",
+    "use_path_style" = "true"
+);
+```
+
+说明：
+
+* `credential = "<access_key>:<secret_key>"` 会被 Doris 的 Iceberg REST 客户端作为 
OAuth2 client credentials 发起鉴权。SeaweedFS 用同一份 IAM 配置校验。
+* `s3.*` 系列属性给 Doris 本地的 parquet 读写器使用，指向同一个 `weed` 进程，主机和密钥都和上面一致。
+* `use_path_style = "true"` 是必需的，SeaweedFS 默认采用 path-style 的 S3 协议。
+* 集成测试使用的就是上述属性，可参考 
[`createDorisIcebergCatalog`](https://github.com/seaweedfs/seaweedfs/blob/master/test/s3tables/catalog_doris/doris_catalog_test.go)。
+
+如果在注册 Catalog 前已经通过其他客户端（例如 PyIceberg）创建了 Namespace 或表，需要刷新元数据缓存：
+
+```sql
+REFRESH CATALOG seaweedfs;
+```
+
+## 3. 使用 Catalog
+
+```sql
+USE seaweedfs;
+
+CREATE DATABASE IF NOT EXISTS demo;
+
+USE seaweedfs.demo;
+
+CREATE TABLE iceberg_smoke (
+  id BIGINT,
+  label STRING
+);
+
+INSERT INTO iceberg_smoke VALUES (1, 'one'), (2, 'two'), (3, 'three');
+
+SELECT id, label FROM iceberg_smoke ORDER BY id;
+```
+
+预期结果：
+
+```text
++----+-------+
+| id | label |
++----+-------+
+|  1 | one   |
+|  2 | two   |
+|  3 | three |
++----+-------+
+```
+
+这正是 SeaweedFS 集成测试覆盖的路径：通过 Iceberg REST Catalog 创建 Namespace 和表，由 PyIceberg 
追加数据，再由 Doris 通过 S3 加 Iceberg 元数据走标准链路读回。
+
+## 生产部署建议
+
+* 生产环境可以把 `weed mini` 拆成 `weed master`、`weed volume`、`weed filer`，再加 `weed s3 
-iceberg.port=8181`，也可以使用 SeaweedFS Helm Chart。Doris 这边的配置完全不用改，只需替换主机和端口。
+* OAuth2 credential 就是 S3 访问密钥，需要轮换 Doris 的 Catalog 凭证时，按普通 S3 用户的方式轮换 IAM 
身份即可。
+* Iceberg 表的运维任务（Compaction、Snapshot Expiration、Orphan Removal、Manifest 
Rewriting）由 SeaweedFS 内置实现，针对同一个 Bucket 运行，详见 [SeaweedFS Iceberg Catalog 
Wiki](https://github.com/seaweedfs/seaweedfs/wiki/SeaweedFS-Iceberg-Catalog)。
+
+## 相关链接
+
+* [SeaweedFS](https://github.com/seaweedfs/seaweedfs)
+* [SeaweedFS 中的 Doris Iceberg 
集成测试](https://github.com/seaweedfs/seaweedfs/tree/master/test/s3tables/catalog_doris)
+* [Doris Iceberg Catalog 
文档](https://doris.apache.org/zh-CN/docs/lakehouse/catalogs/iceberg-catalog)
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/best-practices/doris-seaweedfs.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/best-practices/doris-seaweedfs.md
new file mode 100644
index 00000000000..cfa87e50c79
--- /dev/null
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/best-practices/doris-seaweedfs.md
@@ -0,0 +1,171 @@
+---
+{
+    "title": "集成 SeaweedFS",
+    "language": "zh-CN",
+    "description": "使用 SeaweedFS 同时承载 Iceberg 表的对象存储和 REST 
Catalog，凭证、部署、运维三位一体。"
+}
+---
+
+[SeaweedFS](https://seaweedfs.com/) 是一个分布式存储系统，单个 `weed` 进程即可同时提供 S3 
兼容的对象存储接口和 Apache Iceberg REST Catalog。Parquet 数据和 Iceberg 
元数据由同一个执行文件对外服务，并使用同一对 S3 凭证完成鉴权。
+
+本文介绍将 SeaweedFS 作为 Doris 的 Iceberg Lakehouse 后端的最小配置。完整的端到端路径已经在 SeaweedFS 仓库的 
[`TestDorisIcebergCatalog`](https://github.com/seaweedfs/seaweedfs/tree/master/test/s3tables/catalog_doris)
 集成测试中验证：测试会启动 SeaweedFS mini 集群，在 Doris 中注册 Iceberg Catalog，通过 PyIceberg 
写入数据，再由 `apache/doris:doris-all-in-one-2.1.0` 容器读回。
+
+## 为什么用 SeaweedFS 搭 Iceberg Lakehouse
+
+当下的 Lakehouse 架构通常需要把三层系统拼起来：
+
+* 对象存储（S3 或兼容实现）
+* 独立的 Iceberg Catalog（Hive Metastore、Glue、Polaris、Lakekeeper、Nessie 等）
+* 查询引擎（Doris、Spark、Trino 等）
+
+SeaweedFS 把前两层合并到了同一个进程里。同一个 `weed` 执行文件既是：
+
+* 存放 parquet 文件的 S3 兼容对象存储，
+* 也是存放表元数据的 Iceberg REST Catalog。
+
+也就是说，Doris 只需要对接一个系统，而不是两个。具体好处：
+
+* **更少的组件。** 不再需要 Hive Metastore、Glue，不需要为 Catalog 单独部署 Postgres，也不需要单独维护 STS 
角色。
+* **更简单的部署。** 一个执行文件、一份 IAM 配置；Doris 的 Iceberg REST 客户端和 S3 读写器共用同一对 S3 凭证。
+* **适合本地与私有化场景。** 整个链路不依赖任何云服务，从笔记本、单台 VM 到 Kubernetes 集群，部署方式一致。
+* **元数据路径更低延时。** Catalog 状态保存在同一个 SeaweedFS filer 中，与数据为邻；命名空间和表元数据查询不再跨独立服务。
+* **磁盘上是标准 S3。** 表以标准 Iceberg 目录结构存放在 S3 桶中，任何 S3 客户端（rclone、`aws 
s3`、Spark、Trino、Dremio、RisingWave）都可以与 Doris 一同读取或复制。
+
+架构上：
+
+```text
+Doris
+  |
+  v
+Iceberg 表
+  |
+  v
+SeaweedFS  (S3 存储 + REST Catalog)
+```
+
+对于小团队和内部数据平台来说，这是一种不依赖独立 Catalog 服务、就能搭起 Lakehouse 的干净方式。
+
+## 1. 启动 SeaweedFS
+
+在 [github.com/seaweedfs/seaweedfs](https://github.com/seaweedfs/seaweedfs) 
编译或安装 `weed`。
+
+准备一份 IAM 配置，给一个访问密钥授予 S3 权限。同一个密钥也作为 Iceberg REST 端点的 OAuth2 客户端：
+
+```json
+{
+  "identities": [
+    {
+      "name": "doris",
+      "credentials": [
+        {
+          "accessKey": "AKIAIOSFODNN7EXAMPLE",
+          "secretKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
+        }
+      ],
+      "actions": ["Admin"]
+    }
+  ]
+}
+```
+
+启动单进程集群，并在启动时创建用于 Iceberg 的 Table Bucket：
+
+```bash
+weed mini \
+  -ip $(hostname -I | awk '{print $1}') \
+  -dir /var/lib/seaweedfs \
+  -s3.config /etc/seaweedfs/iam_config.json \
+  -tableBucket iceberg-tables
+```
+
+`weed mini` 会在一个进程内同时启动 master、volume、filer、S3 和 Iceberg REST Catalog。默认端口：
+
+| 组件 | 端口 | 修改参数 |
+| ---- | ---- | -------- |
+| Master HTTP | 9333 | `-master.port` |
+| Filer HTTP | 8888 | `-filer.port` |
+| S3 | 8333 | `-s3.port` |
+| Iceberg REST | 8181 | `-s3.port.iceberg` |
+
+`-tableBucket iceberg-tables` 会在启动时创建一个 S3 Tables 类型的 Bucket，也就是 Doris 后续写入 
Iceberg 表所用的 Bucket。
+
+验证 Catalog 端点可用：
+
+```bash
+curl -s http://SEAWEED_HOST:8181/v1/config | jq .
+```
+
+## 2. 在 Doris 中注册 Iceberg Catalog
+
+```sql
+CREATE CATALOG seaweedfs PROPERTIES (
+    "type" = "iceberg",
+    "iceberg.catalog.type" = "rest",
+    "uri" = "http://SEAWEED_HOST:8181",
+    "warehouse" = "s3://iceberg-tables",
+    "credential" = "AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
+    "s3.endpoint" = "http://SEAWEED_HOST:8333",
+    "s3.access_key" = "AKIAIOSFODNN7EXAMPLE",
+    "s3.secret_key" = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
+    "s3.region" = "us-west-2",
+    "use_path_style" = "true"
+);
+```
+
+说明：
+
+* `credential = "<access_key>:<secret_key>"` 会被 Doris 的 Iceberg REST 客户端作为 
OAuth2 client credentials 发起鉴权。SeaweedFS 用同一份 IAM 配置校验。
+* `s3.*` 系列属性给 Doris 本地的 parquet 读写器使用，指向同一个 `weed` 进程，主机和密钥都和上面一致。
+* `use_path_style = "true"` 是必需的，SeaweedFS 默认采用 path-style 的 S3 协议。
+* 集成测试使用的就是上述属性，可参考 
[`createDorisIcebergCatalog`](https://github.com/seaweedfs/seaweedfs/blob/master/test/s3tables/catalog_doris/doris_catalog_test.go)。
+
+如果在注册 Catalog 前已经通过其他客户端（例如 PyIceberg）创建了 Namespace 或表，需要刷新元数据缓存：
+
+```sql
+REFRESH CATALOG seaweedfs;
+```
+
+## 3. 使用 Catalog
+
+```sql
+USE seaweedfs;
+
+CREATE DATABASE IF NOT EXISTS demo;
+
+USE seaweedfs.demo;
+
+CREATE TABLE iceberg_smoke (
+  id BIGINT,
+  label STRING
+);
+
+INSERT INTO iceberg_smoke VALUES (1, 'one'), (2, 'two'), (3, 'three');
+
+SELECT id, label FROM iceberg_smoke ORDER BY id;
+```
+
+预期结果：
+
+```text
++----+-------+
+| id | label |
++----+-------+
+|  1 | one   |
+|  2 | two   |
+|  3 | three |
++----+-------+
+```
+
+这正是 SeaweedFS 集成测试覆盖的路径：通过 Iceberg REST Catalog 创建 Namespace 和表，由 PyIceberg 
追加数据，再由 Doris 通过 S3 加 Iceberg 元数据走标准链路读回。
+
+## 生产部署建议
+
+* 生产环境可以把 `weed mini` 拆成 `weed master`、`weed volume`、`weed filer`，再加 `weed s3 
-iceberg.port=8181`，也可以使用 SeaweedFS Helm Chart。Doris 这边的配置完全不用改，只需替换主机和端口。
+* OAuth2 credential 就是 S3 访问密钥，需要轮换 Doris 的 Catalog 凭证时，按普通 S3 用户的方式轮换 IAM 
身份即可。
+* Iceberg 表的运维任务（Compaction、Snapshot Expiration、Orphan Removal、Manifest 
Rewriting）由 SeaweedFS 内置实现，针对同一个 Bucket 运行，详见 [SeaweedFS Iceberg Catalog 
Wiki](https://github.com/seaweedfs/seaweedfs/wiki/SeaweedFS-Iceberg-Catalog)。
+
+## 相关链接
+
+* [SeaweedFS](https://github.com/seaweedfs/seaweedfs)
+* [SeaweedFS 中的 Doris Iceberg 
集成测试](https://github.com/seaweedfs/seaweedfs/tree/master/test/s3tables/catalog_doris)
+* [Doris Iceberg Catalog 
文档](https://doris.apache.org/zh-CN/docs/lakehouse/catalogs/iceberg-catalog)
diff --git a/sidebars-next.ts b/sidebars-next.ts
index e9434173083..43bc453f9e1 100644
--- a/sidebars-next.ts
+++ b/sidebars-next.ts
@@ -610,6 +610,7 @@ const sidebars: SidebarsConfig = {
                                 'lakehouse/best-practices/doris-onelake',
                                 'lakehouse/best-practices/doris-unity-catalog',
                                 'lakehouse/best-practices/doris-lakekeeper',
+                                'lakehouse/best-practices/doris-seaweedfs',
                                 'lakehouse/best-practices/doris-nessie',
                                 'lakehouse/best-practices/doris-dlf-iceberg',
                             ],
diff --git 
a/versioned_docs/version-3.x/lakehouse/best-practices/doris-seaweedfs.md 
b/versioned_docs/version-3.x/lakehouse/best-practices/doris-seaweedfs.md
new file mode 100644
index 00000000000..89ca2995e04
--- /dev/null
+++ b/versioned_docs/version-3.x/lakehouse/best-practices/doris-seaweedfs.md
@@ -0,0 +1,170 @@
+---
+{
+    "title": "Integration with SeaweedFS",
+    "language": "en"
+}
+---
+
+[SeaweedFS](https://seaweedfs.com/) is a distributed storage system that 
exposes both an S3-compatible object API and an Apache Iceberg REST Catalog 
from the same `weed` process. Parquet data and Iceberg metadata are served by 
one executable, authenticated by one S3 credential pair.
+
+This page shows the minimal configuration for using SeaweedFS as the storage and catalog backend of an Iceberg lakehouse queried by Doris. The same end-to-end path is exercised by the [`TestDorisIcebergCatalog`](https://github.com/seaweedfs/seaweedfs/tree/master/test/s3tables/catalog_doris) integration test in the SeaweedFS repository, which boots a SeaweedFS mini cluster, registers a Doris Iceberg catalog against it, writes rows with PyIceberg, and reads them back from `apache/doris:doris-all-in-one-2.1.0`.
+
+## Why SeaweedFS for an Iceberg lakehouse
+
+A typical lakehouse stack today stitches together three layers:
+
+* Object storage (S3 or compatible)
+* A standalone Iceberg catalog (Hive Metastore, Glue, Polaris, Lakekeeper, 
Nessie, ...)
+* A query engine (Doris, Spark, Trino, ...)
+
+SeaweedFS collapses the first two into one process. The same `weed` executable 
is both:
+
+* the S3-compatible object store that holds the parquet files, and
+* the Iceberg REST Catalog that holds the table metadata.
+
+So Doris talks to one system instead of two. The practical implications:
+
+* **Fewer moving parts.** No Hive Metastore, no Glue, no Postgres backing a 
separate catalog, no STS role to provision.
+* **Simpler deployment.** One executable, one IAM config, one S3 credential 
pair shared by Doris's Iceberg REST client and its S3 reader.
+* **Local or on-prem friendly.** Nothing in the path requires a cloud-native 
service. The same setup runs on a laptop, a single VM, or a Kubernetes cluster.
+* **Lower latency on the metadata path.** Catalog state lives in the same 
SeaweedFS filer that serves the data, so namespace and table lookups don't 
cross a separate service boundary.
+* **S3-native on disk.** Tables are stored as standard Iceberg directories in 
S3 buckets. Any S3 client (rclone, `aws s3`, Spark, Trino, Dremio, RisingWave) 
can read or replicate them alongside Doris.
+
+Architecturally:
+
+```text
+Doris
+  |
+  v
+Iceberg tables
+  |
+  v
+SeaweedFS  (S3 storage + REST catalog)
+```
+
+For smaller teams or internal platforms, this is a clean way to build a 
lakehouse without depending on a separate metastore service.
+
+## 1. Start SeaweedFS
+
+Build or install `weed` from 
[github.com/seaweedfs/seaweedfs](https://github.com/seaweedfs/seaweedfs).
+
+Create an IAM config that grants an access key full S3 access. The same key is 
also used as the OAuth2 client for the Iceberg REST endpoint:
+
+```json
+{
+  "identities": [
+    {
+      "name": "doris",
+      "credentials": [
+        {
+          "accessKey": "AKIAIOSFODNN7EXAMPLE",
+          "secretKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
+        }
+      ],
+      "actions": ["Admin"]
+    }
+  ]
+}
+```
+
+Start a single-process cluster with the Iceberg REST endpoint and a 
pre-created table bucket:
+
+```bash
+weed mini \
+  -ip $(hostname -I | awk '{print $1}') \
+  -dir /var/lib/seaweedfs \
+  -s3.config /etc/seaweedfs/iam_config.json \
+  -tableBucket iceberg-tables
+```
+
+`weed mini` runs master, volume, filer, S3, and the Iceberg REST catalog in 
one process. Default ports:
+
+| Component | Port | Override flag |
+| --------- | ---- | ------------- |
+| Master HTTP | 9333 | `-master.port` |
+| Filer HTTP | 8888 | `-filer.port` |
+| S3 | 8333 | `-s3.port` |
+| Iceberg REST | 8181 | `-s3.port.iceberg` |
+
+`-tableBucket iceberg-tables` creates the S3 Tables bucket on startup, which 
is the Iceberg-aware bucket type Doris will write into.
+
+To verify the catalog is reachable:
+
+```bash
+curl -s http://SEAWEED_HOST:8181/v1/config | jq .
+```
+
+## 2. Register the Iceberg catalog in Doris
+
+```sql
+CREATE CATALOG seaweedfs PROPERTIES (
+    "type" = "iceberg",
+    "iceberg.catalog.type" = "rest",
+    "uri" = "http://SEAWEED_HOST:8181",
+    "warehouse" = "s3://iceberg-tables",
+    "credential" = "AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
+    "s3.endpoint" = "http://SEAWEED_HOST:8333",
+    "s3.access_key" = "AKIAIOSFODNN7EXAMPLE",
+    "s3.secret_key" = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
+    "s3.region" = "us-west-2",
+    "use_path_style" = "true"
+);
+```
+
+Notes:
+
+* `credential = "<access_key>:<secret_key>"` is forwarded by Doris's Iceberg 
REST client as OAuth2 client credentials. SeaweedFS validates them against the 
same IAM config that secures the S3 endpoint.
+* The `s3.*` properties are used by Doris's own parquet reader and writer. 
They point at the same `weed` process — same host, same key pair.
+* `use_path_style = "true"` is required because SeaweedFS serves S3 in 
path-style by default.
+* The integration test uses these exact properties; see 
[`createDorisIcebergCatalog`](https://github.com/seaweedfs/seaweedfs/blob/master/test/s3tables/catalog_doris/doris_catalog_test.go)
 for the canonical form.
+
+If you create namespaces or tables outside Doris (for example with PyIceberg) 
before the catalog is registered, refresh the metadata cache:
+
+```sql
+REFRESH CATALOG seaweedfs;
+```
+
+## 3. Use the catalog
+
+```sql
+USE seaweedfs;
+
+CREATE DATABASE IF NOT EXISTS demo;
+
+USE seaweedfs.demo;
+
+CREATE TABLE iceberg_smoke (
+  id BIGINT,
+  label STRING
+);
+
+INSERT INTO iceberg_smoke VALUES (1, 'one'), (2, 'two'), (3, 'three');
+
+SELECT id, label FROM iceberg_smoke ORDER BY id;
+```
+
+Expected output:
+
+```text
++----+-------+
+| id | label |
++----+-------+
+|  1 | one   |
+|  2 | two   |
+|  3 | three |
++----+-------+
+```
+
+The SeaweedFS integration test exercises a closely related path: the namespace and table are created through the Iceberg REST catalog, rows are appended via PyIceberg (rather than a Doris `INSERT`), and reads are served by Doris through the standard S3 plus Iceberg metadata flow.
+
+## Production notes
+
+* For a production cluster, replace `weed mini` with `weed master`, `weed volume`, `weed filer`, and `weed s3 -iceberg.port=8181` (or use the SeaweedFS Helm chart). The Doris-side configuration is identical; only the host and ports change.
+* The OAuth2 credential is the S3 access key. To rotate Doris's catalog 
access, rotate the IAM identity that holds it, the same way you rotate any S3 
user.
+* Iceberg table maintenance (compaction, snapshot expiration, orphan removal, 
manifest rewriting) is built into SeaweedFS and runs against the same bucket. 
See the [SeaweedFS Iceberg Catalog 
wiki](https://github.com/seaweedfs/seaweedfs/wiki/SeaweedFS-Iceberg-Catalog) 
for details.
+
+## References
+
+* [SeaweedFS](https://github.com/seaweedfs/seaweedfs)
+* [Doris Iceberg integration test in 
SeaweedFS](https://github.com/seaweedfs/seaweedfs/tree/master/test/s3tables/catalog_doris)
+* [Doris Iceberg Catalog 
reference](https://doris.apache.org/docs/lakehouse/catalogs/iceberg-catalog)
diff --git a/versioned_docs/version-4.x/lakehouse/best-practices/doris-seaweedfs.md b/versioned_docs/version-4.x/lakehouse/best-practices/doris-seaweedfs.md
new file mode 100644
index 00000000000..89ca2995e04
--- /dev/null
+++ b/versioned_docs/version-4.x/lakehouse/best-practices/doris-seaweedfs.md
@@ -0,0 +1,170 @@
+---
+{
+    "title": "Integration with SeaweedFS",
+    "language": "en"
+}
+---
+
+[SeaweedFS](https://seaweedfs.com/) is a distributed storage system that 
exposes both an S3-compatible object API and an Apache Iceberg REST Catalog 
from the same `weed` process. Parquet data and Iceberg metadata are served by 
one executable, authenticated by one S3 credential pair.
+
+This page shows the minimal configuration that turns SeaweedFS into a 
Doris-backed Iceberg lakehouse. The same end-to-end path is exercised by the 
[`TestDorisIcebergCatalog`](https://github.com/seaweedfs/seaweedfs/tree/master/test/s3tables/catalog_doris)
 integration test in the SeaweedFS repository, which boots a SeaweedFS mini 
cluster, registers a Doris Iceberg catalog against it, writes rows with 
PyIceberg, and reads them back from `apache/doris:doris-all-in-one-2.1.0`.
+
+## Why SeaweedFS for an Iceberg lakehouse
+
+A typical lakehouse stack today stitches together three layers:
+
+* Object storage (S3 or compatible)
+* A standalone Iceberg catalog (Hive Metastore, Glue, Polaris, Lakekeeper, 
Nessie, ...)
+* A query engine (Doris, Spark, Trino, ...)
+
+SeaweedFS collapses the first two into one process. The same `weed` executable 
is both:
+
+* the S3-compatible object store that holds the parquet files, and
+* the Iceberg REST Catalog that holds the table metadata.
+
+So Doris talks to one system instead of two. The practical implications:
+
+* **Fewer moving parts.** No Hive Metastore, no Glue, no Postgres backing a 
separate catalog, no STS role to provision.
+* **Simpler deployment.** One executable, one IAM config, one S3 credential 
pair shared by Doris's Iceberg REST client and its S3 reader.
+* **Local or on-prem friendly.** Nothing in the path requires a cloud-native 
service. The same setup runs on a laptop, a single VM, or a Kubernetes cluster.
+* **Lower latency on the metadata path.** Catalog state lives in the same 
SeaweedFS filer that serves the data, so namespace and table lookups don't 
cross a separate service boundary.
+* **S3-native on disk.** Tables are stored as standard Iceberg directories in 
S3 buckets. Any S3 client (rclone, `aws s3`, Spark, Trino, Dremio, RisingWave) 
can read or replicate them alongside Doris.
+
+Architecturally:
+
+```text
+Doris
+  |
+  v
+Iceberg tables
+  |
+  v
+SeaweedFS  (S3 storage + REST catalog)
+```
+
+For smaller teams or internal platforms, this is a clean way to build a 
lakehouse without depending on a separate metastore service.
+
+## 1. Start SeaweedFS
+
+Build or install `weed` from 
[github.com/seaweedfs/seaweedfs](https://github.com/seaweedfs/seaweedfs).
+
+Create an IAM config that grants an access key full S3 access. The same key is 
also used as the OAuth2 client for the Iceberg REST endpoint:
+
+```json
+{
+  "identities": [
+    {
+      "name": "doris",
+      "credentials": [
+        {
+          "accessKey": "AKIAIOSFODNN7EXAMPLE",
+          "secretKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
+        }
+      ],
+      "actions": ["Admin"]
+    }
+  ]
+}
+```
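As a rough sketch (not part of SeaweedFS or Doris), the IAM file above could be generated with a small Python helper instead of being written by hand. The function name `write_iam_config` and its parameters are hypothetical, and creating the file with mode `0600` is an assumption about how you may want to protect the secret key, not something SeaweedFS requires.

```python
import json
import os


def write_iam_config(path: str, name: str, access_key: str, secret_key: str) -> dict:
    """Write a single-identity IAM config in the layout shown above.

    Hypothetical helper for illustration; returns the dict it wrote.
    """
    config = {
        "identities": [
            {
                "name": name,
                "credentials": [
                    {"accessKey": access_key, "secretKey": secret_key}
                ],
                "actions": ["Admin"],
            }
        ]
    }
    # Assumption: create the file owner-readable only, since it holds a secret.
    # The mode only applies when the file is newly created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)
        fh.write("\n")
    return config
```

The resulting file can be passed to `weed mini` through `-s3.config`, exactly like a hand-written one.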
+
+Start a single-process cluster with the Iceberg REST endpoint and a 
pre-created table bucket:
+
+```bash
+weed mini \
+  -ip $(hostname -I | awk '{print $1}') \
+  -dir /var/lib/seaweedfs \
+  -s3.config /etc/seaweedfs/iam_config.json \
+  -tableBucket iceberg-tables
+```
+
+`weed mini` runs master, volume, filer, S3, and the Iceberg REST catalog in 
one process. Default ports:
+
+| Component | Port | Override flag |
+| --------- | ---- | ------------- |
+| Master HTTP | 9333 | `-master.port` |
+| Filer HTTP | 8888 | `-filer.port` |
+| S3 | 8333 | `-s3.port` |
+| Iceberg REST | 8181 | `-s3.port.iceberg` |
+
+`-tableBucket iceberg-tables` creates the S3 Tables bucket on startup, which 
is the Iceberg-aware bucket type Doris will write into.
+
+To verify the catalog is reachable:
+
+```bash
+curl -s http://SEAWEED_HOST:8181/v1/config | jq .
+```
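In scripts, the same reachability check may need to wait until `weed mini` has finished starting. The following is a minimal sketch using only the Python standard library; `config_url`, `wait_for_catalog`, and the timeout and interval defaults are hypothetical choices made for illustration, and the only endpoint assumed is the `/v1/config` path used by the `curl` command above.

```python
import json
import time
import urllib.request


def config_url(host: str, port: int = 8181) -> str:
    """URL of the Iceberg REST config endpoint used by the curl check."""
    return f"http://{host}:{port}/v1/config"


def _http_get(url: str) -> str:
    # Plain GET; connection and HTTP errors surface as OSError subclasses.
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def wait_for_catalog(host, port=8181, timeout=30.0, interval=1.0, fetch=_http_get):
    """Poll /v1/config until it returns parseable JSON or the timeout elapses.

    `fetch` is injectable so the retry loop can be exercised without a server.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return json.loads(fetch(config_url(host, port)))
        except (OSError, ValueError) as exc:
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"catalog not reachable at {config_url(host, port)}"
                ) from exc
            time.sleep(interval)
```

Calling `wait_for_catalog("SEAWEED_HOST")` before running `CREATE CATALOG` avoids registering the catalog against an endpoint that is not up yet.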
+
+## 2. Register the Iceberg catalog in Doris
+
+```sql
+CREATE CATALOG seaweedfs PROPERTIES (
+    "type" = "iceberg",
+    "iceberg.catalog.type" = "rest",
+    "uri" = "http://SEAWEED_HOST:8181",
+    "warehouse" = "s3://iceberg-tables",
+    "credential" = "AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
+    "s3.endpoint" = "http://SEAWEED_HOST:8333",
+    "s3.access_key" = "AKIAIOSFODNN7EXAMPLE",
+    "s3.secret_key" = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
+    "s3.region" = "us-west-2",
+    "use_path_style" = "true"
+);
+```
+
+Notes:
+
+* `credential = "<access_key>:<secret_key>"` is forwarded by Doris's Iceberg 
REST client as OAuth2 client credentials. SeaweedFS validates them against the 
same IAM config that secures the S3 endpoint.
+* The `s3.*` properties are used by Doris's own parquet reader and writer. 
They point at the same `weed` process — same host, same key pair.
+* `use_path_style = "true"` is required because SeaweedFS serves S3 in 
path-style by default.
+* The integration test uses these exact properties; see 
[`createDorisIcebergCatalog`](https://github.com/seaweedfs/seaweedfs/blob/master/test/s3tables/catalog_doris/doris_catalog_test.go)
 for the canonical form.
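
Because the `credential` value and the `s3.*` keys come from the same pair, the statement can be rendered from a single access key and secret key. The sketch below illustrates that relationship; `build_catalog_sql` is a hypothetical helper, the property names and default ports are copied from the statement above, and values are interpolated without any escaping, so it is only suitable for trusted input.

```python
def build_catalog_sql(host: str, access_key: str, secret_key: str,
                      catalog: str = "seaweedfs",
                      bucket: str = "iceberg-tables") -> str:
    """Render the CREATE CATALOG statement from one key pair.

    Hypothetical helper for illustration; no quoting or escaping is applied.
    """
    props = {
        "type": "iceberg",
        "iceberg.catalog.type": "rest",
        "uri": f"http://{host}:8181",
        "warehouse": f"s3://{bucket}",
        # OAuth2 client credentials: the same pair, joined with a colon.
        "credential": f"{access_key}:{secret_key}",
        "s3.endpoint": f"http://{host}:8333",
        "s3.access_key": access_key,
        "s3.secret_key": secret_key,
        "s3.region": "us-west-2",
        "use_path_style": "true",
    }
    body = ",\n".join(f'    "{key}" = "{value}"' for key, value in props.items())
    return f"CREATE CATALOG {catalog} PROPERTIES (\n{body}\n);"


if __name__ == "__main__":
    print(build_catalog_sql(
        "SEAWEED_HOST",
        "AKIAIOSFODNN7EXAMPLE",
        "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    ))
```

Keeping the key pair in one place like this means a rotated key only has to be updated once before the catalog is recreated.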
+
+If you create namespaces or tables outside Doris (for example with PyIceberg) 
before the catalog is registered, refresh the metadata cache:
+
+```sql
+REFRESH CATALOG seaweedfs;
+```
+
+## 3. Use the catalog
+
+```sql
+USE seaweedfs;
+
+CREATE DATABASE IF NOT EXISTS demo;
+
+USE seaweedfs.demo;
+
+CREATE TABLE iceberg_smoke (
+  id BIGINT,
+  label STRING
+);
+
+INSERT INTO iceberg_smoke VALUES (1, 'one'), (2, 'two'), (3, 'three');
+
+SELECT id, label FROM iceberg_smoke ORDER BY id;
+```
+
+Expected output:
+
+```text
++----+-------+
+| id | label |
++----+-------+
+|  1 | one   |
+|  2 | two   |
+|  3 | three |
++----+-------+
+```
+
+The SeaweedFS integration test exercises a closely related path: the namespace and table are created through the Iceberg REST catalog, rows are appended via PyIceberg (rather than a Doris `INSERT`), and reads are served by Doris through the standard S3 plus Iceberg metadata flow.
+
+## Production notes
+
+* For a production cluster, replace `weed mini` with `weed master`, `weed volume`, `weed filer`, and `weed s3 -iceberg.port=8181` (or use the SeaweedFS Helm chart). The Doris-side configuration is identical; only the host and ports change.
+* The OAuth2 credential is the S3 access key. To rotate Doris's catalog 
access, rotate the IAM identity that holds it, the same way you rotate any S3 
user.
+* Iceberg table maintenance (compaction, snapshot expiration, orphan removal, 
manifest rewriting) is built into SeaweedFS and runs against the same bucket. 
See the [SeaweedFS Iceberg Catalog 
wiki](https://github.com/seaweedfs/seaweedfs/wiki/SeaweedFS-Iceberg-Catalog) 
for details.
+
+## References
+
+* [SeaweedFS](https://github.com/seaweedfs/seaweedfs)
+* [Doris Iceberg integration test in 
SeaweedFS](https://github.com/seaweedfs/seaweedfs/tree/master/test/s3tables/catalog_doris)
+* [Doris Iceberg Catalog 
reference](https://doris.apache.org/docs/lakehouse/catalogs/iceberg-catalog)
diff --git a/versioned_sidebars/version-3.x-sidebars.json b/versioned_sidebars/version-3.x-sidebars.json
index d9fc4046c41..ffb70d35376 100644
--- a/versioned_sidebars/version-3.x-sidebars.json
+++ b/versioned_sidebars/version-3.x-sidebars.json
@@ -417,6 +417,7 @@
                                         "lakehouse/best-practices/doris-onelake",
                                         "lakehouse/best-practices/doris-unity-catalog",
                                         "lakehouse/best-practices/doris-lakekeeper",
+                                        "lakehouse/best-practices/doris-seaweedfs",
                                         "lakehouse/best-practices/doris-nessie",
                                         "lakehouse/best-practices/doris-dlf-iceberg"
                                     ]
diff --git a/versioned_sidebars/version-4.x-sidebars.json b/versioned_sidebars/version-4.x-sidebars.json
index 9060a442fe7..11138299b9f 100644
--- a/versioned_sidebars/version-4.x-sidebars.json
+++ b/versioned_sidebars/version-4.x-sidebars.json
@@ -398,6 +398,7 @@
                                         "lakehouse/best-practices/doris-onelake",
                                         "lakehouse/best-practices/doris-unity-catalog",
                                         "lakehouse/best-practices/doris-lakekeeper",
+                                        "lakehouse/best-practices/doris-seaweedfs",
                                         "lakehouse/best-practices/doris-nessie",
                                         "lakehouse/best-practices/doris-dlf-iceberg"
                                     ]


