This is an automated email from the ASF dual-hosted git repository.

zhangshenghang pushed a commit to branch dev
in repository https://gitbox.apache.org/repos/asf/seatunnel.git


The following commit(s) were added to refs/heads/dev by this push:
     new fd9999cf79 docs: add REST API job lifecycle cookbook (#10983)
fd9999cf79 is described below

commit fd9999cf794df8a82b96bafcf5f14c9930dc8b30
Author: Daniel <[email protected]>
AuthorDate: Sun Jun 7 21:15:26 2026 +0800

    docs: add REST API job lifecycle cookbook (#10983)
    
    Co-authored-by: davidzollo <[email protected]>
---
 docs/en/engines/zeta/rest-api-job-lifecycle.md | 353 +++++++++++++++++++++++++
 docs/sidebars.js                               |   1 +
 docs/zh/engines/zeta/rest-api-job-lifecycle.md | 348 ++++++++++++++++++++++++
 3 files changed, 702 insertions(+)

diff --git a/docs/en/engines/zeta/rest-api-job-lifecycle.md 
b/docs/en/engines/zeta/rest-api-job-lifecycle.md
new file mode 100644
index 0000000000..4d208b78ff
--- /dev/null
+++ b/docs/en/engines/zeta/rest-api-job-lifecycle.md
@@ -0,0 +1,353 @@
+---
+sidebar_position: 5
+---
+
+# REST API Job Lifecycle Cookbook
+
+## Overview
+
+This guide supplements the [REST API v2 reference](rest-api-v2.md) with 
practical recipes for
+managing the complete job lifecycle: submission, monitoring, stopping, 
cancelling, savepoint,
+and recovery. It also covers authentication, performance considerations, and 
common errors.
+
+---
+
+## 1. Prerequisites
+
+Enable the REST API in `seatunnel.yaml`:
+
+```yaml
+seatunnel:
+  engine:
+    http:
+      enable-http: true
+      port: 8080
+      enable-dynamic-port: true
+      port-range: 100
+```
+
+All examples below use `http://<master>:8080`. Replace with your actual master 
host and port.
+
+---
+
+## 2. Job Submission
+
+### 2.1 Submit a job from a config file (JSON body)
+
+```bash
+curl -X POST http://<master>:8080/submit-job \
+  -H "Content-Type: application/json" \
+  -d @job.json
+```
+
+Minimal `job.json` structure:
+
+```json
+{
+  "env": {
+    "job.name": "my-cdc-job",
+    "job.mode": "STREAMING",
+    "checkpoint.interval": 30000
+  },
+  "source": [
+    {
+      "plugin_name": "MySQL-CDC",
+      "plugin_output": "mysql_cdc_result",
+      "base-url": "jdbc:mysql://localhost:3306/mydb",
+      "username": "cdc_user",
+      "password": "password",
+      "database-names": ["mydb"],
+      "table-names": ["mydb.orders"],
+      "startup.mode": "initial",
+      "server-id": "5400-5404"
+    }
+  ],
+  "transform": [],
+  "sink": [
+    {
+      "plugin_name": "Console",
+      "plugin_input": ["mysql_cdc_result"]
+    }
+  ]
+}
+```
+
+### 2.2 Submit a job with multiple transforms (JSON format)
+
+```json
+{
+  "env": {
+    "job.name": "etl-with-transforms",
+    "job.mode": "BATCH"
+  },
+  "source": [
+    {
+      "plugin_name": "FakeSource",
+      "plugin_output": "fake",
+      "row.num": 100,
+      "schema": {
+        "fields": {
+          "id": "int",
+          "name": "string",
+          "amount": "double"
+        }
+      }
+    }
+  ],
+  "transform": [
+    {
+      "plugin_name": "FieldMapper",
+      "plugin_input": ["fake"],
+      "plugin_output": "after_field_map",
+      "field_mapper": {
+        "id": "user_id",
+        "name": "user_name"
+      }
+    },
+    {
+      "plugin_name": "Filter",
+      "plugin_input": ["after_field_map"],
+      "plugin_output": "filtered",
+      "fields": ["user_id", "user_name", "amount"]
+    }
+  ],
+  "sink": [
+    {
+      "plugin_name": "Console",
+      "plugin_input": ["filtered"]
+    }
+  ]
+}
+```
+
+### 2.3 Submission response
+
+A successful submission returns:
+
+```json
+{
+  "jobId": "733584788375093248",
+  "jobName": "my-cdc-job"
+}
+```
+
+Save the `jobId` for all subsequent lifecycle operations.
+
+---
+
+## 3. Job Status Query
+
+### 3.1 Query a single running job
+
+```bash
+curl http://<master>:8080/job-info/<jobId>
+```
+
+Response fields:
+
+| Field | Description |
+|---|---|
+| `jobId` | Unique job identifier |
+| `jobName` | Human-readable job name |
+| `jobStatus` | `RUNNING`, `FINISHED`, `FAILED`, `CANCELLED` |
+| `envOptions` | Env configuration applied |
+| `createTime` | Job creation timestamp |
+| `jobDag` | DAG structure |
+| `metrics` | Source/sink throughput counters |
+
+### 3.2 Query all running jobs
+
+```bash
+curl "http://<master>:8080/running-jobs?page=1&rows=10"
+```
+
+### 3.3 Query finished jobs
+
+```bash
+curl "http://<master>:8080/finished-jobs/FINISHED?page=1&rows=10"
+```
+
+Use the `state` path parameter to filter finished jobs, for example 
`FINISHED`, `FAILED`,
+`CANCELED`, or `SAVEPOINT_DONE`.
+
+### 3.4 Query job metrics only
+
+```bash
+curl http://<master>:8080/job-info/<jobId>
+```
+
+Key metric fields:
+
+| Metric | Meaning |
+|---|---|
+| `SourceReceivedCount` | Total rows read from source |
+| `SinkWriteCount` | Total rows written to sink |
+| `SourceReceivedQPS` | Current read throughput (rows/sec) |
+| `SinkWriteQPS` | Current write throughput (rows/sec) |
+
+---
+
+## 4. Querying Job Logs
+
+```bash
+# Get the last N lines of a running job's log
+curl "http://<master>:8080/logs/<jobId>"
+```
+
+For large deployments where log files are on individual workers, use the 
worker's REST port
+directly, or configure centralized logging (see [Logging](logging.md)).
+
+---
+
+## 5. Stop, Cancel, and Savepoint Semantics
+
+### Semantics comparison
+
+| Operation | What happens | State preserved | Can resume |
+|---|---|---|---|
+| `stop` (graceful) | Waits for in-flight data to flush | Checkpoint at stop 
point | Yes, via `--restore` |
+| `stop-with-savepoint` | Graceful stop + explicit savepoint written | Full 
savepoint | Yes, via `--restore` |
+| `cancel` (force kill) | Immediate termination | No new state written | Only 
from last checkpoint |
+
+### 5.1 Graceful stop (no savepoint)
+
+```bash
+curl -X POST "http://<master>:8080/stop-job" \
+  -H "Content-Type: application/json" \
+  -d '{"jobId": "733584788375093248", "isStopWithSavePoint": false}'
+```
+
+### 5.2 Stop with savepoint
+
+```bash
+curl -X POST "http://<master>:8080/stop-job" \
+  -H "Content-Type: application/json" \
+  -d '{"jobId": "733584788375093248", "isStopWithSavePoint": true}'
+```
+
+The savepoint path is printed in the job log and returned in the job final 
state:
+
+```bash
+curl http://<master>:8080/job-info/733584788375093248 | \
+  python3 -c "import sys,json; d=json.load(sys.stdin); 
print(d.get('savepointPath', 'N/A'))"
+```
+
+### 5.3 Cancel (force)
+
+```bash
+curl -X POST "http://<master>:8080/stop-job" \
+  -H "Content-Type: application/json" \
+  -d '{"jobId": "733584788375093248", "isStopWithSavePoint": false, "force": 
true}'
+```
+
+---
+
+## 6. Job Recovery and Restart
+
+### 6.1 Restart from the latest checkpoint
+
+Submit the job again with the `jobId` parameter to restore from the last 
successful checkpoint:
+
+```bash
+curl -X POST http://<master>:8080/submit-job \
+  -H "Content-Type: application/json" \
+  -d '{
+    "env": {
+      "job.id": "733584788375093248",
+      "job.name": "my-cdc-job",
+      "job.mode": "STREAMING",
+      "checkpoint.interval": 30000
+    },
+    "source": [ ... ],
+    "sink": [ ... ]
+  }'
+```
+
+Providing the same `job.id` instructs the engine to restore state from the 
existing checkpoint
+directory for that job.
+
+### 6.2 Restart from a specific savepoint path
+
+```bash
+curl -X POST http://<master>:8080/submit-job \
+  -H "Content-Type: application/json" \
+  -d '{
+    "env": {
+      "job.name": "my-cdc-job-restored",
+      "job.mode": "STREAMING",
+      "checkpoint.interval": 30000,
+      "restore.mode": "savepoint",
+      "savepoint.path": 
"/seatunnel/checkpoint/savepoint/733584788375093248/1748595600000"
+    },
+    "source": [ ... ],
+    "sink": [ ... ]
+  }'
+```
+
+---
+
+## 7. Authentication and Authorization
+
+When basic authentication is enabled (see [Security](security.md)), all REST 
API calls must
+include the configured username and password.
+
+### Basic auth example
+
+```bash
+curl -u admin:password "http://<master>:8080/running-jobs?page=1&rows=10"
+```
+
+---
+
+## 8. REST API Performance Considerations
+
+### `job-info` slowness with many finished jobs
+
+When `finished-job-state` IMap grows large (thousands of entries), the
+`/running-jobs` and `/finished-jobs/:state` endpoints may become slow because 
they scan all entries.
+
+**Mitigations:**
+1. Reduce `history-job-expire-minutes` to shorten the retention window
+2. Avoid polling finished-jobs endpoints at high frequency; cache the result 
in your monitoring layer
+3. For dashboards, query specific `jobId` directly instead of listing all jobs
+
+### Concurrent submission rate
+
+The REST API processes submissions synchronously in the Hazelcast executor 
pool. For bulk
+submission scenarios (importing hundreds of jobs), throttle submissions to 
10–20 per second
+to avoid overwhelming the master node.
+
+### Dynamic port allocation
+
+If `enable-dynamic-port: true`, different master nodes may use different 
ports. Use the
+REST API overview endpoint to inspect the active cluster state from a 
reachable master:
+
+```bash
+# Inspect the cluster state from a reachable master
+curl http://<master>:8080/overview | \
+  python3 -c "import sys,json; print(json.load(sys.stdin))"
+```
+
+---
+
+## 9. Common Errors and Troubleshooting
+
+| Error | Cause | Fix |
+|---|---|---|
+| `HTTP 404` on any endpoint | REST API not enabled or wrong port | Set 
`enable-http: true` and check port |
+| `Connection refused` | Master not started or firewall blocking port | Verify 
master process is running; check firewall |
+| `jobId not found` in `job-info` | Job has already finished or was never 
started | Query `/finished-jobs/:state` with the expected final state |
+| Submit returns `400 Bad Request` | Malformed JSON or missing required fields 
| Validate JSON; check `plugin_name` spelling |
+| `Job already exists with same job.id` | Submitting duplicate `job.id` 
without stopping first | Cancel or stop the existing job, then resubmit |
+| `Unauthorized 401` | Basic authentication is enabled but no credentials were 
provided | Include `-u user:pass` |
+| `Savepoint path not found` | Savepoint was deleted or path is wrong | Check 
checkpoint storage and provide correct path |
+
+---
+
+## See Also
+
+- [REST API v2 Reference](rest-api-v2.md)
+- [REST API v1 Reference](rest-api-v1.md)
+- [Security Configuration](security.md)
+- [Checkpoint Storage](checkpoint-storage.md)
+- [CDC Pipeline Architecture](../../architecture/cdc-pipeline-architecture.md)
diff --git a/docs/sidebars.js b/docs/sidebars.js
index 1eb75b8319..7da70c1e79 100644
--- a/docs/sidebars.js
+++ b/docs/sidebars.js
@@ -300,6 +300,7 @@ const sidebars = {
                             "items": [
                                 "engines/zeta/rest-api-v1",
                                 "engines/zeta/rest-api-v2",
+                                "engines/zeta/rest-api-job-lifecycle",
                                 "engines/zeta/security",
                                 "engines/zeta/python-sdk"
                             ]
diff --git a/docs/zh/engines/zeta/rest-api-job-lifecycle.md 
b/docs/zh/engines/zeta/rest-api-job-lifecycle.md
new file mode 100644
index 0000000000..2b8fc9b82e
--- /dev/null
+++ b/docs/zh/engines/zeta/rest-api-job-lifecycle.md
@@ -0,0 +1,348 @@
+---
+sidebar_position: 5
+---
+
+# REST API 作业生命周期手册
+
+## 概述
+
+本手册是 [REST API v2 参考文档](rest-api-v2.md) 的实践补充,提供管理完整作业生命周期的操作食谱:
+提交、监控、停止、取消、保存点以及恢复。同时涵盖认证、性能注意事项和常见错误。
+
+---
+
+## 1. 前置条件
+
+在 `seatunnel.yaml` 中启用 REST API:
+
+```yaml
+seatunnel:
+  engine:
+    http:
+      enable-http: true
+      port: 8080
+      enable-dynamic-port: true
+      port-range: 100
+```
+
+以下所有示例均使用 `http://<master>:8080`,请替换为实际的 Master 节点地址和端口。
+
+---
+
+## 2. 作业提交
+
+### 2.1 通过 JSON 提交作业
+
+```bash
+curl -X POST http://<master>:8080/submit-job \
+  -H "Content-Type: application/json" \
+  -d @job.json
+```
+
+最小 `job.json` 结构示例:
+
+```json
+{
+  "env": {
+    "job.name": "my-cdc-job",
+    "job.mode": "STREAMING",
+    "checkpoint.interval": 30000
+  },
+  "source": [
+    {
+      "plugin_name": "MySQL-CDC",
+      "plugin_output": "mysql_cdc_result",
+      "base-url": "jdbc:mysql://localhost:3306/mydb",
+      "username": "cdc_user",
+      "password": "password",
+      "database-names": ["mydb"],
+      "table-names": ["mydb.orders"],
+      "startup.mode": "initial",
+      "server-id": "5400-5404"
+    }
+  ],
+  "transform": [],
+  "sink": [
+    {
+      "plugin_name": "Console",
+      "plugin_input": ["mysql_cdc_result"]
+    }
+  ]
+}
+```
+
+### 2.2 提交含多个 Transform 的作业(JSON 格式)
+
+```json
+{
+  "env": {
+    "job.name": "etl-with-transforms",
+    "job.mode": "BATCH"
+  },
+  "source": [
+    {
+      "plugin_name": "FakeSource",
+      "plugin_output": "fake",
+      "row.num": 100,
+      "schema": {
+        "fields": {
+          "id": "int",
+          "name": "string",
+          "amount": "double"
+        }
+      }
+    }
+  ],
+  "transform": [
+    {
+      "plugin_name": "FieldMapper",
+      "plugin_input": ["fake"],
+      "plugin_output": "after_field_map",
+      "field_mapper": {
+        "id": "user_id",
+        "name": "user_name"
+      }
+    },
+    {
+      "plugin_name": "Filter",
+      "plugin_input": ["after_field_map"],
+      "plugin_output": "filtered",
+      "fields": ["user_id", "user_name", "amount"]
+    }
+  ],
+  "sink": [
+    {
+      "plugin_name": "Console",
+      "plugin_input": ["filtered"]
+    }
+  ]
+}
+```
+
+### 2.3 提交响应
+
+提交成功返回:
+
+```json
+{
+  "jobId": "733584788375093248",
+  "jobName": "my-cdc-job"
+}
+```
+
+请保存 `jobId`,后续所有生命周期操作都需要它。
+
+---
+
+## 3. 作业状态查询
+
+### 3.1 查询单个运行中作业
+
+```bash
+curl http://<master>:8080/job-info/<jobId>
+```
+
+响应字段说明:
+
+| 字段 | 描述 |
+|---|---|
+| `jobId` | 作业唯一标识 |
+| `jobName` | 作业名称 |
+| `jobStatus` | `RUNNING`、`FINISHED`、`FAILED`、`CANCELLED` |
+| `envOptions` | 生效的 env 配置 |
+| `createTime` | 作业创建时间戳 |
+| `jobDag` | DAG 拓扑结构 |
+| `metrics` | Source / Sink 吞吐量计数器 |
+
+### 3.2 查询所有运行中作业
+
+```bash
+curl "http://<master>:8080/running-jobs?page=1&rows=10"
+```
+
+### 3.3 查询已完成的作业列表
+
+```bash
+curl "http://<master>:8080/finished-jobs/FINISHED?page=1&rows=10"
+```
+
+通过 `state` 路径参数筛选已完成作业,例如 `FINISHED`、`FAILED`、`CANCELED`
+或 `SAVEPOINT_DONE`。
+
+### 3.4 仅查询作业指标
+
+```bash
+curl http://<master>:8080/job-info/<jobId>
+```
+
+关键指标字段:
+
+| 指标 | 含义 |
+|---|---|
+| `SourceReceivedCount` | 从 Source 读取的总行数 |
+| `SinkWriteCount` | 写入 Sink 的总行数 |
+| `SourceReceivedQPS` | 当前读取吞吐量(行/秒)|
+| `SinkWriteQPS` | 当前写入吞吐量(行/秒)|
+
+---
+
+## 4. 查询作业日志
+
+```bash
+# 获取运行中作业的最后 N 行日志
+curl "http://<master>:8080/logs/<jobId>"
+```
+
+对于日志文件分散在各 Worker 节点的大规模部署,建议直接访问 Worker 节点的 REST 端口,
+或配置集中式日志系统(参见 [日志配置](logging.md))。
+
+---
+
+## 5. 停止、取消与保存点语义
+
+### 操作语义对比
+
+| 操作 | 行为 | 状态保留 | 是否可恢复 |
+|---|---|---|---|
+| `stop`(优雅停止)| 等待在途数据刷写完毕 | 停止时刻的 Checkpoint | 是,通过 `--restore` |
+| `stop-with-savepoint` | 优雅停止 + 写入显式 Savepoint | 完整 Savepoint | 是,通过 
`--restore` |
+| `cancel`(强制终止)| 立即终止 | 不写入新状态 | 仅从上次 Checkpoint 恢复 |
+
+### 5.1 优雅停止(不创建 Savepoint)
+
+```bash
+curl -X POST "http://<master>:8080/stop-job" \
+  -H "Content-Type: application/json" \
+  -d '{"jobId": "733584788375093248", "isStopWithSavePoint": false}'
+```
+
+### 5.2 停止并创建 Savepoint
+
+```bash
+curl -X POST "http://<master>:8080/stop-job" \
+  -H "Content-Type: application/json" \
+  -d '{"jobId": "733584788375093248", "isStopWithSavePoint": true}'
+```
+
+Savepoint 路径会打印在作业日志中,也可通过查询已完成作业获取:
+
+```bash
+curl http://<master>:8080/job-info/733584788375093248 | \
+  python3 -c "import sys,json; d=json.load(sys.stdin); 
print(d.get('savepointPath', 'N/A'))"
+```
+
+### 5.3 强制取消
+
+```bash
+curl -X POST "http://<master>:8080/stop-job" \
+  -H "Content-Type: application/json" \
+  -d '{"jobId": "733584788375093248", "isStopWithSavePoint": false, "force": 
true}'
+```
+
+---
+
+## 6. 作业恢复与重启
+
+### 6.1 从最新 Checkpoint 恢复
+
+重新提交作业时携带相同的 `job.id`,引擎会自动从该作业的 Checkpoint 目录恢复状态:
+
+```bash
+curl -X POST http://<master>:8080/submit-job \
+  -H "Content-Type: application/json" \
+  -d '{
+    "env": {
+      "job.id": "733584788375093248",
+      "job.name": "my-cdc-job",
+      "job.mode": "STREAMING",
+      "checkpoint.interval": 30000
+    },
+    "source": [ ... ],
+    "sink": [ ... ]
+  }'
+```
+
+### 6.2 从指定 Savepoint 恢复
+
+```bash
+curl -X POST http://<master>:8080/submit-job \
+  -H "Content-Type: application/json" \
+  -d '{
+    "env": {
+      "job.name": "my-cdc-job-restored",
+      "job.mode": "STREAMING",
+      "checkpoint.interval": 30000,
+      "restore.mode": "savepoint",
+      "savepoint.path": 
"/seatunnel/checkpoint/savepoint/733584788375093248/1748595600000"
+    },
+    "source": [ ... ],
+    "sink": [ ... ]
+  }'
+```
+
+---
+
+## 7. 认证与授权
+
+开启 Basic Auth 后(参见 [安全配置](security.md)),所有 REST API 请求必须携带配置的用户名和密码。
+
+### Basic Auth 示例
+
+```bash
+curl -u admin:password "http://<master>:8080/running-jobs?page=1&rows=10"
+```
+
+---
+
+## 8. REST API 性能注意事项
+
+### 已完成作业过多导致 job-info 查询变慢
+
+当 `finished-job-state` IMap 条目增多(数千条)时,`/running-jobs` 和 
`/finished-jobs/:state` 端点
+可能变慢,因为它们需要全量扫描所有条目。
+
+**缓解措施:**
+
+1. 缩短 `history-job-expire-minutes`,减少保留窗口
+2. 避免高频轮询 finished-jobs 端点;在监控层缓存结果
+3. 在看板中直接按 `jobId` 查询,而非列出全部作业
+
+### 并发提交速率
+
+REST API 在 Hazelcast 执行器池中同步处理提交请求。对于批量提交场景(一次导入数百个作业),
+建议将提交频率限制在每秒 10–20 个,避免 Master 节点过载。
+
+### 动态端口分配
+
+若 `enable-dynamic-port: true`,不同 Master 节点可能使用不同端口。可通过 REST API overview
+端点从可访问的 Master 查看当前集群状态:
+
+```bash
+# 从可访问的 Master 查看集群状态
+curl http://<master>:8080/overview | \
+  python3 -c "import sys,json; print(json.load(sys.stdin))"
+```
+
+---
+
+## 9. 常见错误与故障排查
+
+| 错误 | 原因 | 修复方法 |
+|---|---|---|
+| 任意端点返回 `HTTP 404` | REST API 未启用或端口不对 | 设置 `enable-http: true` 并检查端口 |
+| `Connection refused` | Master 未启动或防火墙拦截 | 确认 Master 进程在运行;检查防火墙 |
+| `job-info` 中找不到 `jobId` | 作业已完成或未启动 | 按预期终态改用 `/finished-jobs/:state` 查询 |
+| 提交返回 `400 Bad Request` | JSON 格式错误或缺少必填字段 | 验证 JSON;检查 `plugin_name` 拼写 |
+| `Job already exists with same job.id` | 相同 `job.id` 重复提交但未先停止 | 
先取消或停止现有作业,再重新提交 |
+| `Unauthorized 401` | 已启用 Basic Auth 但未提供凭据 | 添加 `-u user:pass` |
+| `Savepoint path not found` | Savepoint 已删除或路径有误 | 检查 Checkpoint 存储,提供正确路径 |
+
+---
+
+## 参考
+
+- [REST API v2 参考文档](rest-api-v2.md)
+- [REST API v1 参考文档](rest-api-v1.md)
+- [安全配置](security.md)
+- [Checkpoint 存储](checkpoint-storage.md)
+- [CDC 流水线架构](../../architecture/cdc-pipeline-architecture.md)

Reply via email to