(seatunnel) branch dev updated: [Improve][Docs] Clarify snapshot.split.column unique key requirement (#10312)

corgy Wed, 25 Mar 2026 02:12:07 -0700

This is an automated email from the ASF dual-hosted git repository.

corgy pushed a commit to branch dev
in repository https://gitbox.apache.org/repos/asf/seatunnel.git



The following commit(s) were added to refs/heads/dev by this push:
     new 3c061668ce [Improve][Docs] Clarify snapshot.split.column unique key requirement (#10312)
3c061668ce is described below

commit 3c061668ce9340837a99a1181219846896351da9
Author: icekimchi <[email protected]>
AuthorDate: Wed Mar 25 02:10:34 2026 -0700

    [Improve][Docs] Clarify snapshot.split.column unique key requirement (#10312)
---
 docs/en/connectors/source/MySQL-CDC.md      | 2 +-
 docs/en/connectors/source/PostgreSQL-CDC.md | 2 +-
 docs/en/connectors/source/SqlServer-CDC.md  | 2 +-
 docs/zh/connectors/source/MySQL-CDC.md      | 2 +-
 docs/zh/connectors/source/PostgreSQL-CDC.md | 2 +-
 docs/zh/connectors/source/SqlServer-CDC.md  | 2 +-
 6 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/en/connectors/source/MySQL-CDC.md b/docs/en/connectors/source/MySQL-CDC.md
index 61b8838435..a13772d300 100644
--- a/docs/en/connectors/source/MySQL-CDC.md
+++ b/docs/en/connectors/source/MySQL-CDC.md
@@ -192,7 +192,7 @@ When an initial consistent snapshot is made for large 
databases, your establishe
 | database-pattern                          | String   | No       | .*      | 
The database names RegEx of the database to capture, for example: 
`database_prefix.*`.                                                            
                                                                                
                                                                                
                                                                                
                             [...]
 | table-names                               | List     | Yes      | -       | 
Table name of the database to monitor. The table name needs to include the 
database name, for example: `database_name.table_name`                          
                                                                                
                                                                                
                                                                                
                    [...]
 | table-pattern                             | String   | Yes      | -       | 
The table names RegEx of the database to capture. The table name needs to 
include the database name, for example: `database.*\\.table_.*`                 
                                                                                
                                                                                
                                                                                
                     [...]
-| table-names-config                        | List     | No       | -       | 
Table config list. for example: [{"table": "db1.schema1.table1","primaryKeys": 
["key1"],"snapshotSplitColumn": "key2"}]                                        
                                                                                
                                                                                
                                                                                
                [...]
+| table-names-config                        | List     | No       | -       | 
Table config list. for example: [{"table": "db1.schema1.table1","primaryKeys": 
["key1"],"snapshotSplitColumn": "key2"}]. The snapshotSplitColumn option must 
be configured with a unique key. If a non-unique column is provided, the 
configuration is ignored and SeaTunnel automatically selects an appropriate 
split column internally.                                                        
                             [...]
 | startup.mode                              | Enum     | No       | INITIAL | 
Optional startup mode for MySQL CDC consumer, valid enumerations are `initial`, 
`earliest`, `latest` , `specific` and `timestamp`. <br/> `initial`: Synchronize 
historical data at startup, and then synchronize incremental data.<br/> 
`earliest`: Startup from the earliest offset possible.<br/> `latest`: Startup 
from the latest offset.<br/> `specific`: Startup from user-supplied specific 
offsets.<br/> `timestamp`:  [...]
 | startup.specific-offset.file              | String   | No       | -       | 
Start from the specified binlog file name. **Note, This option is required when 
the `startup.mode` option used `specific`.**                                    
                                                                                
                                                                                
                                                                                
               [...]
 | startup.specific-offset.pos               | Long     | No       | -       | 
Start from the specified binlog file position. **Note, This option is required 
when the `startup.mode` option used `specific`.**                               
                                                                                
                                                                                
                                                                                
                [...]
diff --git a/docs/en/connectors/source/PostgreSQL-CDC.md b/docs/en/connectors/source/PostgreSQL-CDC.md
index 940d8d8aa3..edf3af398d 100644
--- a/docs/en/connectors/source/PostgreSQL-CDC.md
+++ b/docs/en/connectors/source/PostgreSQL-CDC.md
@@ -95,7 +95,7 @@ ALTER TABLE your_table_name REPLICA IDENTITY FULL;
 | password                                  | String   | Yes      | -        | 
Password to use when connecting to the database server.                         
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
 | database-names                            | List     | No       | -        | 
Database name of the database to monitor.                                       
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
 | table-names                               | List     | Yes      | -        | 
Table name of the database to monitor. The table name needs to include the 
database name, for example: `database_name.table_name`                          
                                                                                
                                                                                
                                                                                
                   [...]
-| table-names-config                        | List     | No       | -        | 
Table config list. for example: [{"table": "db1.schema1.table1","primaryKeys": 
["key1"],"snapshotSplitColumn": "key2"}]                                        
                                                                                
                                                                                
                                                                                
               [...]
+| table-names-config                        | List     | No       | -       | 
Table config list. for example: [{"table": "db1.schema1.table1","primaryKeys": 
["key1"],"snapshotSplitColumn": "key2"}]. The snapshotSplitColumn option must 
be configured with a unique key. If a non-unique column is provided, the 
configuration is ignored and SeaTunnel automatically selects an appropriate 
split column internally.                                                        
                             [...]
 | startup.mode                              | Enum     | No       | INITIAL  | 
Optional startup mode for PostgreSQL CDC consumer, valid enumerations are 
`initial`, `earliest` and `latest`. <br/> `initial`: Synchronize historical 
data at startup, and then synchronize incremental data.<br/> `earliest`: 
Startup from the earliest offset possible.<br/> `latest`: Startup from the 
latest offset.                                                                  
                                    [...]
 | snapshot.split.size                       | Integer  | No       | 8096     | 
The split size (number of rows) of table snapshot, captured tables are split 
into multiple splits when read the snapshot of table.                           
                                                                                
                                                                                
                                                                                
                 [...]
 | snapshot.fetch.size                       | Integer  | No       | 1024     | 
The maximum fetch size for per poll when read table snapshot.                   
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
diff --git a/docs/en/connectors/source/SqlServer-CDC.md b/docs/en/connectors/source/SqlServer-CDC.md
index 90da3bf970..be51e9d626 100644
--- a/docs/en/connectors/source/SqlServer-CDC.md
+++ b/docs/en/connectors/source/SqlServer-CDC.md
@@ -79,7 +79,7 @@ case-sensitive databases, make sure the configured identifier 
case matches the d
 | password                                  | String   | Yes      | -       | 
Password to use when connecting to the database server.                         
                                                                                
                                                                                
                                                                                
                                                                                
               [...]
 | database-names                            | List     | Yes      | -       | 
Database name of the database to monitor.                                       
                                                                                
                                                                                
                                                                                
                                                                                
               [...]
 | table-names                               | List     | Yes      | -       | 
Table name is a combination of schema name and table name 
(databaseName.schemaName.tableName).                                            
                                                                                
                                                                                
                                                                                
                                     [...]
-| table-names-config                        | List     | No       | -       | 
Table config list. for example: [{"table": "db1.schema1.table1","primaryKeys": 
["key1"],"snapshotSplitColumn": "key2"}]                                        
                                                                                
                                                                                
                                                                                
                [...]
+| table-names-config                        | List     | No       | -       | 
Table config list. for example: [{"table": "db1.schema1.table1","primaryKeys": 
["key1"],"snapshotSplitColumn": "key2"}]. The snapshotSplitColumn option must 
be configured with a unique key. If a non-unique column is provided, the 
configuration is ignored and SeaTunnel automatically selects an appropriate 
split column internally.                                                        
                             [...]
 | url                                       | String   | Yes      | -       | 
URL has to be with database, like 
"jdbc:sqlserver://localhost:1433;databaseName=test".                            
                                                                                
                                                                                
                                                                                
                                                             [...]
 | startup.mode                              | Enum     | No       | INITIAL | 
Optional startup mode for SqlServer CDC consumer, valid enumerations are 
"initial", "earliest", "latest", "timestamp" and "specific".                    
                                                                                
                                                                                
                                                                                
                      [...]
 | startup.timestamp                         | Long     | No       | -       | 
Start from the specified epoch timestamp (in milliseconds). This timestamp is 
converted with `server-time-zone` when `startup.mode = timestamp`.<br/> **Note, 
This option is required when** the **"startup.mode" option used 
`'timestamp'`.**                                                                
                                                                                
                                 [...]
diff --git a/docs/zh/connectors/source/MySQL-CDC.md b/docs/zh/connectors/source/MySQL-CDC.md
index e9456f02a9..86afcb9953 100644
--- a/docs/zh/connectors/source/MySQL-CDC.md
+++ b/docs/zh/connectors/source/MySQL-CDC.md
@@ -191,7 +191,7 @@ show variables where variable_name in ('log_bin', 
'binlog_format', 'binlog_row_i
 | database-pattern                          | String   | 否    | .*      | 
要捕获的数据库名称的正则表达式, 例如: `database_prefix.*`.                                       
                                                                                
                                                                             |
 | table-names                               | List     | 是    | -       | 
要监控的表名. 表名需要包括库名, 例如: `database_name.table_name`                                
                                                                                
                                                                             |
 | table-pattern                             | String   | 是    | -       | 
要捕获的表名称的正则表达式. 表名需要包括库名, 例如: `database.*\\.table_.*`                            
                                                                                
                                                                             |
-| table-names-config                        | List     | 否    | -       | 
表配置的列表集合. 例如: [{"table": "db1.schema1.table1","primaryKeys": 
["key1"],"snapshotSplitColumn": "key2"}]                                        
                                                                                
                |
+| table-names-config                        | List     | 否    | -       | 
表配置的列表集合. 例如: [{"table": "db1.schema1.table1","primaryKeys": 
["key1"],"snapshotSplitColumn": "key2"}]. snapshotSplitColumn 
选项必须配置为唯一键(主键或唯一索引). 如果指定了非唯一列，该配置将被忽略，SeaTunnel 会在内部自动选择合适的拆分列.                
                                                                                
                                                                                
         |
 | startup.mode                              | Enum     | 否    | INITIAL | 
MySQL CDC 消费者的可选启动模式, 有效枚举值为 `initial`, `earliest`, `latest` , `specific` 和 
`timestamp`. <br/> `initial`: 启动时同步历史数据, 然后同步增量数据.<br/> `earliest`: 
从尽可能最早的偏移量开始启动.<br/> `latest`: 从最近的偏移量启动.<br/> `specific`: 
从用户提供的特定偏移量开始启动.<br/> `timestamp`: 从用户提供的特定时间戳开始启动.                 |
 | startup.specific-offset.file              | String   | 否    | -       | 
从指定的binlog日志文件名开始. **注意, 当使用 `startup.mode` 选项为 `specific` 时，此选项为必填项.**         
                                                                                
                                                                             |
 | startup.specific-offset.pos               | Long     | 否    | -       | 
从指定的binlog日志文件位置开始. **注意, 当使用 `startup.mode` 选项为 `specific` 时，此选项为必填项.**        
                                                                                
                                                                             |
diff --git a/docs/zh/connectors/source/PostgreSQL-CDC.md b/docs/zh/connectors/source/PostgreSQL-CDC.md
index 9546dbbda5..7774f80757 100644
--- a/docs/zh/connectors/source/PostgreSQL-CDC.md
+++ b/docs/zh/connectors/source/PostgreSQL-CDC.md
@@ -93,7 +93,7 @@ ALTER TABLE your_table_name REPLICA IDENTITY FULL;
 | password                                  | String   | 是   | -        | 
连接到数据库服务器时使用的密码。                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                   [...]
 | database-names                            | List     | 否   | -        | 
需要监控的数据库名称。                                                                     
                                                                                
                                                                                
                                                                                
                                                                                
                   [...]
 | table-names                               | List     | 是   | -        | 
需要监控的数据库表名称。表名称需要包含数据库名称，例如：`database_name.table_name`。                         
                                                                                
                                                                                
                                                                                
                                                                                
                   [...]
-| table-names-config                        | List     | 否   | -        | 
表配置列表。例如： [{"table": "db1.schema1.table1","primaryKeys": 
["key1"],"snapshotSplitColumn": "key2"}]                                        
                                                                                
                                                                                
                                                                                
                                          [...]
+| table-names-config                        | List     | 否   | -        | 
表配置列表。例如： [{"table": "db1.schema1.table1","primaryKeys": 
["key1"],"snapshotSplitColumn": "key2"}]。 snapshotSplitColumn 
选项必须配置为唯一键(主键或唯一索引)。 如果指定了非唯一列，该配置将被忽略，SeaTunnel 会在内部自动选择合适的拆分列。                
                                                                                
                                                                                
                                                            [...]
 | startup.mode                              | List     | 否   | INITIAL  | 
PostgreSQL CDC 消费者的可选启动模式，有效枚举为 `initial`、`earliest` 和 `latest`。<br/> 
`initial`: 启动时同步历史数据，然后同步增量数据。<br/> `earliest`: 从可能的最早偏移量启动。<br/> `latest`: 
从最新偏移量启动。                                                                       
                                                                                
                                                                                
                                 [...]
 | snapshot.split.size                       | Integer  | 否   | 8096     | 
表快照的拆分大小（行数），捕获的表在读取表快照时被拆分成多个拆分。                                               
                                                                                
                                                                                
                                                                                
                                                                                
                   [...]
 | snapshot.fetch.size                       | Integer  | 否   | 1024     | 
读取表快照时每次轮询的最大获取大小。                                                              
                                                                                
                                                                                
                                                                                
                                                                                
                   [...]
diff --git a/docs/zh/connectors/source/SqlServer-CDC.md b/docs/zh/connectors/source/SqlServer-CDC.md
index 5192bfaaab..af02eea08f 100644
--- a/docs/zh/connectors/source/SqlServer-CDC.md
+++ b/docs/zh/connectors/source/SqlServer-CDC.md
@@ -77,7 +77,7 @@ Sql Server CDC 连接器允许从 SqlServer 数据库读取快照数据和增量
 | password                                       | String   | 是       | -      
 | 连接数据库服务器时使用的密码。                                                              
                                                                                
                                                                                
                                                        |
 | database-names                                 | List     | 是       | -      
 | 要监控的数据库名称。                                                                   
                                                                                
                                                                                
                                                             |
 | table-names                                    | List     | 是       | -      
 | 表名是模式名和表名的组合 (databaseName.schemaName.tableName)。                            
                                                                                
                                                                                
                                                          |
-| table-names-config                             | List     | 否       | -      
 | 表配置列表。例如：[{"table": "db1.schema1.table1","primaryKeys": 
["key1"],"snapshotSplitColumn": "key2"}]                                        
                                                                                
                                                                                
   |
+| table-names-config                             | List     | 否       | -      
 | 表配置列表。例如：[{"table": "db1.schema1.table1","primaryKeys": 
["key1"],"snapshotSplitColumn": "key2"}]。 snapshotSplitColumn 
选项必须配置为唯一键(主键或唯一索引)。 如果指定了非唯一列，该配置将被忽略，SeaTunnel 会在内部自动选择合适的拆分列。                
                                                                                
                                                                                
                           |
 | url                                            | String   | 是       | -      
 | URL 必须包含数据库，如 "jdbc:sqlserver://localhost:1433;databaseName=test"。           
                                                                                
                                                                                
                                                             |
 | startup.mode                                   | Enum     | 否       | 
INITIAL | SqlServer CDC 消费者的可选启动模式，有效枚举为 
"initial"、"earliest"、"latest"、"timestamp" 和 "specific"。                         
                                                                                
                                                                    |
 | startup.timestamp                              | Long     | 否       | -      
 | 从指定的纪元时间戳（以毫秒为单位）开始。当 `startup.mode = timestamp` 时，该时间戳会按 `server-time-zone` 
转换。<br/> **注意，当 "startup.mode" 选项使用 `'timestamp'` 时，此选项是必需的。**                  
                                                                                
                                                          |
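
The rule this commit adds to the `table-names-config` rows (a non-unique `snapshotSplitColumn` is ignored) can be illustrated with a minimal Python sketch. It is not SeaTunnel code: `effective_split_column` and the `unique_columns` argument are hypothetical names invented for this illustration, and the sketch does not model how SeaTunnel chooses its own split column, only that the user setting is dropped.

```python
import json

# Example value copied from the table-names-config rows in the diff above.
TABLE_NAMES_CONFIG = json.loads(
    '[{"table": "db1.schema1.table1","primaryKeys": ["key1"],'
    '"snapshotSplitColumn": "key2"}]'
)


def effective_split_column(entry, unique_columns):
    """Return the configured split column if it is a unique key, else None.

    `unique_columns` is a hypothetical, caller-supplied set of column names
    known to be unique in the table (primary key or unique index).
    None stands for "setting ignored; the connector selects a split column
    itself". How that internal selection works is not modelled here.
    """
    column = entry.get("snapshotSplitColumn")
    if column is not None and column in unique_columns:
        return column
    return None


if __name__ == "__main__":
    entry = TABLE_NAMES_CONFIG[0]
    # key2 is unique in this hypothetical table, so the setting is honoured.
    print(effective_split_column(entry, {"key1", "key2"}))
    # key2 is not unique here, so the setting is ignored.
    print(effective_split_column(entry, {"key1"}))
```

A check of this kind, run against the real table metadata before a job is submitted, would show whether the configured column will actually be used for snapshot splitting.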
