This is an automated email from the ASF dual-hosted git repository. corgy pushed a commit to branch dev in repository https://gitbox.apache.org/repos/asf/seatunnel.git
The following commit(s) were added to refs/heads/dev by this push:
new 3c061668ce [Improve][Docs] Clarify snapshot.split.column unique key requirement (#10312)
3c061668ce is described below
commit 3c061668ce9340837a99a1181219846896351da9
Author: icekimchi <[email protected]>
AuthorDate: Wed Mar 25 02:10:34 2026 -0700
[Improve][Docs] Clarify snapshot.split.column unique key requirement (#10312)
---
docs/en/connectors/source/MySQL-CDC.md | 2 +-
docs/en/connectors/source/PostgreSQL-CDC.md | 2 +-
docs/en/connectors/source/SqlServer-CDC.md | 2 +-
docs/zh/connectors/source/MySQL-CDC.md | 2 +-
docs/zh/connectors/source/PostgreSQL-CDC.md | 2 +-
docs/zh/connectors/source/SqlServer-CDC.md | 2 +-
6 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/docs/en/connectors/source/MySQL-CDC.md b/docs/en/connectors/source/MySQL-CDC.md
index 61b8838435..a13772d300 100644
--- a/docs/en/connectors/source/MySQL-CDC.md
+++ b/docs/en/connectors/source/MySQL-CDC.md
@@ -192,7 +192,7 @@ When an initial consistent snapshot is made for large databases, your establishe
| database-pattern | String | No | .* |
The database names RegEx of the database to capture, for example:
`database_prefix.*`.
[...]
| table-names | List | Yes | - |
Table name of the database to monitor. The table name needs to include the
database name, for example: `database_name.table_name`
[...]
| table-pattern | String | Yes | - |
The table names RegEx of the database to capture. The table name needs to
include the database name, for example: `database.*\\.table_.*`
[...]
-| table-names-config | List | No | - |
Table config list. for example: [{"table": "db1.schema1.table1","primaryKeys":
["key1"],"snapshotSplitColumn": "key2"}]
[...]
+| table-names-config | List | No | - |
Table config list. for example: [{"table": "db1.schema1.table1","primaryKeys":
["key1"],"snapshotSplitColumn": "key2"}]. The snapshotSplitColumn option must
be configured with a unique key. If a non-unique column is provided, the
configuration is ignored and SeaTunnel automatically selects an appropriate
split column internally.
[...]
| startup.mode | Enum | No | INITIAL |
Optional startup mode for MySQL CDC consumer, valid enumerations are `initial`,
`earliest`, `latest` , `specific` and `timestamp`. <br/> `initial`: Synchronize
historical data at startup, and then synchronize incremental data.<br/>
`earliest`: Startup from the earliest offset possible.<br/> `latest`: Startup
from the latest offset.<br/> `specific`: Startup from user-supplied specific
offsets.<br/> `timestamp`: [...]
| startup.specific-offset.file | String | No | - |
Start from the specified binlog file name. **Note, This option is required when
the `startup.mode` option used `specific`.**
[...]
| startup.specific-offset.pos | Long | No | - |
Start from the specified binlog file position. **Note, This option is required
when the `startup.mode` option used `specific`.**
[...]
diff --git a/docs/en/connectors/source/PostgreSQL-CDC.md b/docs/en/connectors/source/PostgreSQL-CDC.md
index 940d8d8aa3..edf3af398d 100644
--- a/docs/en/connectors/source/PostgreSQL-CDC.md
+++ b/docs/en/connectors/source/PostgreSQL-CDC.md
@@ -95,7 +95,7 @@ ALTER TABLE your_table_name REPLICA IDENTITY FULL;
| password | String | Yes | - |
Password to use when connecting to the database server.
[...]
| database-names | List | No | - |
Database name of the database to monitor.
[...]
| table-names | List | Yes | - |
Table name of the database to monitor. The table name needs to include the
database name, for example: `database_name.table_name`
[...]
-| table-names-config | List | No | - |
Table config list. for example: [{"table": "db1.schema1.table1","primaryKeys":
["key1"],"snapshotSplitColumn": "key2"}]
[...]
+| table-names-config | List | No | - |
Table config list. for example: [{"table": "db1.schema1.table1","primaryKeys":
["key1"],"snapshotSplitColumn": "key2"}]. The snapshotSplitColumn option must
be configured with a unique key. If a non-unique column is provided, the
configuration is ignored and SeaTunnel automatically selects an appropriate
split column internally.
[...]
| startup.mode | Enum | No | INITIAL |
Optional startup mode for PostgreSQL CDC consumer, valid enumerations are
`initial`, `earliest` and `latest`. <br/> `initial`: Synchronize historical
data at startup, and then synchronize incremental data.<br/> `earliest`:
Startup from the earliest offset possible.<br/> `latest`: Startup from the
latest offset.
[...]
| snapshot.split.size | Integer | No | 8096 |
The split size (number of rows) of table snapshot, captured tables are split
into multiple splits when read the snapshot of table.
[...]
| snapshot.fetch.size | Integer | No | 1024 |
The maximum fetch size for per poll when read table snapshot.
[...]
diff --git a/docs/en/connectors/source/SqlServer-CDC.md b/docs/en/connectors/source/SqlServer-CDC.md
index 90da3bf970..be51e9d626 100644
--- a/docs/en/connectors/source/SqlServer-CDC.md
+++ b/docs/en/connectors/source/SqlServer-CDC.md
@@ -79,7 +79,7 @@ case-sensitive databases, make sure the configured identifier case matches the d
| password | String | Yes | - |
Password to use when connecting to the database server.
[...]
| database-names | List | Yes | - |
Database name of the database to monitor.
[...]
| table-names | List | Yes | - |
Table name is a combination of schema name and table name
(databaseName.schemaName.tableName).
[...]
-| table-names-config | List | No | - |
Table config list. for example: [{"table": "db1.schema1.table1","primaryKeys":
["key1"],"snapshotSplitColumn": "key2"}]
[...]
+| table-names-config | List | No | - |
Table config list. for example: [{"table": "db1.schema1.table1","primaryKeys":
["key1"],"snapshotSplitColumn": "key2"}]. The snapshotSplitColumn option must
be configured with a unique key. If a non-unique column is provided, the
configuration is ignored and SeaTunnel automatically selects an appropriate
split column internally.
[...]
| url | String | Yes | - |
URL has to be with database, like
"jdbc:sqlserver://localhost:1433;databaseName=test".
[...]
| startup.mode | Enum | No | INITIAL |
Optional startup mode for SqlServer CDC consumer, valid enumerations are
"initial", "earliest", "latest", "timestamp" and "specific".
[...]
| startup.timestamp | Long | No | - |
Start from the specified epoch timestamp (in milliseconds). This timestamp is
converted with `server-time-zone` when `startup.mode = timestamp`.<br/> **Note,
This option is required when** the **"startup.mode" option used
`'timestamp'`.**
[...]
diff --git a/docs/zh/connectors/source/MySQL-CDC.md b/docs/zh/connectors/source/MySQL-CDC.md
index e9456f02a9..86afcb9953 100644
--- a/docs/zh/connectors/source/MySQL-CDC.md
+++ b/docs/zh/connectors/source/MySQL-CDC.md
@@ -191,7 +191,7 @@ show variables where variable_name in ('log_bin', 'binlog_format', 'binlog_row_i
| database-pattern | String | 否 | .* |
要捕获的数据库名称的正则表达式, 例如: `database_prefix.*`.
|
| table-names | List | 是 | - |
要监控的表名. 表名需要包括库名, 例如: `database_name.table_name`
|
| table-pattern | String | 是 | - |
要捕获的表名称的正则表达式. 表名需要包括库名, 例如: `database.*\\.table_.*`
|
-| table-names-config | List | 否 | - |
表配置的列表集合. 例如: [{"table": "db1.schema1.table1","primaryKeys":
["key1"],"snapshotSplitColumn": "key2"}]
|
+| table-names-config | List | 否 | - |
表配置的列表集合. 例如: [{"table": "db1.schema1.table1","primaryKeys":
["key1"],"snapshotSplitColumn": "key2"}]. snapshotSplitColumn
选项必须配置为唯一键(主键或唯一索引). 如果指定了非唯一列,该配置将被忽略,SeaTunnel 会在内部自动选择合适的拆分列.
|
| startup.mode | Enum | 否 | INITIAL |
MySQL CDC 消费者的可选启动模式, 有效枚举值为 `initial`, `earliest`, `latest` , `specific` 和
`timestamp`. <br/> `initial`: 启动时同步历史数据, 然后同步增量数据.<br/> `earliest`:
从尽可能最早的偏移量开始启动.<br/> `latest`: 从最近的偏移量启动.<br/> `specific`:
从用户提供的特定偏移量开始启动.<br/> `timestamp`: 从用户提供的特定时间戳开始启动. |
| startup.specific-offset.file | String | 否 | - |
从指定的binlog日志文件名开始. **注意, 当使用 `startup.mode` 选项为 `specific` 时,此选项为必填项.**
|
| startup.specific-offset.pos | Long | 否 | - |
从指定的binlog日志文件位置开始. **注意, 当使用 `startup.mode` 选项为 `specific` 时,此选项为必填项.**
|
diff --git a/docs/zh/connectors/source/PostgreSQL-CDC.md b/docs/zh/connectors/source/PostgreSQL-CDC.md
index 9546dbbda5..7774f80757 100644
--- a/docs/zh/connectors/source/PostgreSQL-CDC.md
+++ b/docs/zh/connectors/source/PostgreSQL-CDC.md
@@ -93,7 +93,7 @@ ALTER TABLE your_table_name REPLICA IDENTITY FULL;
| password | String | 是 | - |
连接到数据库服务器时使用的密码。
[...]
| database-names | List | 否 | - |
需要监控的数据库名称。
[...]
| table-names | List | 是 | - |
需要监控的数据库表名称。表名称需要包含数据库名称,例如:`database_name.table_name`。
[...]
-| table-names-config | List | 否 | - |
表配置列表。例如: [{"table": "db1.schema1.table1","primaryKeys":
["key1"],"snapshotSplitColumn": "key2"}]
[...]
+| table-names-config | List | 否 | - |
表配置列表。例如: [{"table": "db1.schema1.table1","primaryKeys":
["key1"],"snapshotSplitColumn": "key2"}]。 snapshotSplitColumn
选项必须配置为唯一键(主键或唯一索引)。 如果指定了非唯一列,该配置将被忽略,SeaTunnel 会在内部自动选择合适的拆分列。
[...]
| startup.mode | List | 否 | INITIAL |
PostgreSQL CDC 消费者的可选启动模式,有效枚举为 `initial`、`earliest` 和 `latest`。<br/>
`initial`: 启动时同步历史数据,然后同步增量数据。<br/> `earliest`: 从可能的最早偏移量启动。<br/> `latest`:
从最新偏移量启动。
[...]
| snapshot.split.size | Integer | 否 | 8096 |
表快照的拆分大小(行数),捕获的表在读取表快照时被拆分成多个拆分。
[...]
| snapshot.fetch.size | Integer | 否 | 1024 |
读取表快照时每次轮询的最大获取大小。
[...]
diff --git a/docs/zh/connectors/source/SqlServer-CDC.md b/docs/zh/connectors/source/SqlServer-CDC.md
index 5192bfaaab..af02eea08f 100644
--- a/docs/zh/connectors/source/SqlServer-CDC.md
+++ b/docs/zh/connectors/source/SqlServer-CDC.md
@@ -77,7 +77,7 @@ Sql Server CDC 连接器允许从 SqlServer 数据库读取快照数据和增量
| password | String | 是 | -
| 连接数据库服务器时使用的密码。
|
| database-names | List | 是 | -
| 要监控的数据库名称。
|
| table-names | List | 是 | -
| 表名是模式名和表名的组合 (databaseName.schemaName.tableName)。
|
-| table-names-config | List | 否 | -
| 表配置列表。例如:[{"table": "db1.schema1.table1","primaryKeys":
["key1"],"snapshotSplitColumn": "key2"}]
|
+| table-names-config | List | 否 | -
| 表配置列表。例如:[{"table": "db1.schema1.table1","primaryKeys":
["key1"],"snapshotSplitColumn": "key2"}]。 snapshotSplitColumn
选项必须配置为唯一键(主键或唯一索引)。 如果指定了非唯一列,该配置将被忽略,SeaTunnel 会在内部自动选择合适的拆分列。
|
| url | String | 是 | -
| URL 必须包含数据库,如 "jdbc:sqlserver://localhost:1433;databaseName=test"。
|
| startup.mode | Enum | 否 |
INITIAL | SqlServer CDC 消费者的可选启动模式,有效枚举为
"initial"、"earliest"、"latest"、"timestamp" 和 "specific"。
|
| startup.timestamp | Long | 否 | -
| 从指定的纪元时间戳(以毫秒为单位)开始。当 `startup.mode = timestamp` 时,该时间戳会按 `server-time-zone`
转换。<br/> **注意,当 "startup.mode" 选项使用 `'timestamp'` 时,此选项是必需的。**
|
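
The rule this commit documents for `snapshotSplitColumn` can be sketched as a small Python function. This is an illustration only, not SeaTunnel's implementation: the function name `choose_split_column`, its parameters, and the fallback order (first primary-key column, then the first single-column unique index) are assumptions. The docs only state that a non-unique column is ignored and that a split column is then chosen internally.

```python
from typing import Optional, Sequence


def choose_split_column(
    configured: Optional[str],
    primary_keys: Sequence[str],
    unique_keys: Sequence[Sequence[str]],
) -> Optional[str]:
    """Pick the snapshot split column (hypothetical model of the documented rule)."""
    # Columns that are unique on their own: a single-column primary key
    # or any single-column unique index.
    unique_columns = {key[0] for key in unique_keys if len(key) == 1}
    if len(primary_keys) == 1:
        unique_columns.add(primary_keys[0])

    # Honour the configured column only when it is a unique key.
    if configured is not None and configured in unique_columns:
        return configured

    # Otherwise the setting is ignored. The fallback order below is an
    # assumption made for this sketch, not documented behaviour.
    if primary_keys:
        return primary_keys[0]
    for key in unique_keys:
        if len(key) == 1:
            return key[0]
    return None
```

With the table config from the docs (`"primaryKeys": ["key1"], "snapshotSplitColumn": "key2"`), the call `choose_split_column("key2", ["key1"], [["key2"]])` keeps `key2` only because `key2` is backed by a unique index. Passing a non-unique column such as `"name"` falls back to `key1` in this sketch.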
