This is an automated email from the ASF dual-hosted git repository.

wanghailin pushed a commit to branch dev
in repository https://gitbox.apache.org/repos/asf/seatunnel.git


The following commit(s) were added to refs/heads/dev by this push:
     new 3f7f795717 [Doc][Mysql-cdc]Update doc to support mysql 8.0 (#8579)
3f7f795717 is described below

commit 3f7f79571770cb4a241ece52d46783fa948989e4
Author: litiliu <[email protected]>
AuthorDate: Thu Jan 23 19:52:55 2025 +0800

    [Doc][Mysql-cdc]Update doc to support mysql 8.0 (#8579)
    
    Co-authored-by: litiliu <[email protected]>
---
 docs/en/connector-v2/source/MySQL-CDC.md | 86 ++++++++++++++++++--------------
 1 file changed, 49 insertions(+), 37 deletions(-)

diff --git a/docs/en/connector-v2/source/MySQL-CDC.md b/docs/en/connector-v2/source/MySQL-CDC.md
index 42d3db09c9..9a95b3e566 100644
--- a/docs/en/connector-v2/source/MySQL-CDC.md
+++ b/docs/en/connector-v2/source/MySQL-CDC.md
@@ -78,10 +78,9 @@ mysql> show variables where variable_name in ('log_bin', 'binlog_format', 'binlo
 | gtid_mode                | ON             |
 | log_bin                  | ON             |
 +--------------------------+----------------+
-5 rows in set (0.00 sec)
 ```
 
-2. If inconsistent with the above results, configure your MySQL server configuration file(`$MYSQL_HOME/mysql.cnf`) with the following properties, which are described in the table below:
+2. If the value of `log_bin` is not `on`, configure your MySQL server configuration file(`$MYSQL_HOME/mysql.cnf`) with the following properties, which are described in the table below:
 
 ```
 # Enable binary replication log and set the prefix, expiration, and log format.
@@ -95,8 +94,8 @@ binlog_format     = row
 # mysql 5.6+ requires binlog_row_image to be set to FULL
 binlog_row_image  = FULL
 
-# enable gtid mode
-# mysql 5.6+ requires gtid_mode to be set to ON
+# optional enable gtid mode
+# mysql 5.6+ requires gtid_mode to be set to ON, but not required by mysql 8.0+
 gtid_mode = on
 enforce_gtid_consistency = on
 ```
@@ -119,7 +118,6 @@ mysql> show variables where variable_name in ('log_bin', 'binlog_format', 'binlo
 | binlog_format            | ROW            |
 | log_bin                  | ON             |
 +--------------------------+----------------+
-5 rows in set (0.00 sec)
 ```
 
 MySQL 5.6+:
@@ -135,8 +133,22 @@ mysql> show variables where variable_name in ('log_bin', 'binlog_format', 'binlo
 | gtid_mode                | ON             |
 | log_bin                  | ON             |
 +--------------------------+----------------+
-5 rows in set (0.00 sec)
 ```
+MySQL 8.0+:
+```sql
+show variables where variable_name in ('log_bin', 'binlog_format', 'binlog_row_image', 'gtid_mode', 'enforce_gtid_consistency')
++--------------------------+----------------+
+| Variable_name            | Value          |
++--------------------------+----------------+
+| binlog_format            | ROW            |
+| binlog_row_image         | FULL           |
+| enforce_gtid_consistency | OFF            |
+| gtid_mode                | OFF            |
+| log_bin                  | ON             |
++--------------------------+----------------+  
+     
+```
+
 
 ### Notes
 
@@ -169,38 +181,38 @@ When an initial consistent snapshot is made for large databases, your establishe
 
 ## Source Options
 
-| Name                                           | Type     | Required | 
Default | Description                                                           
                                                                                
                                                                                
                                                                                
                                                                                
                    [...]
-|------------------------------------------------|----------|----------|---------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 [...]
-| base-url                                       | String   | Yes      | -     
  | The URL of the JDBC connection. Refer to a case: 
`jdbc:mysql://localhost:3306:3306/test`.                                        
                                                                                
                                                                                
                                                                                
                                         [...]
-| username                                       | String   | Yes      | -     
  | Name of the database to use when connecting to the database server.         
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
-| password                                       | String   | Yes      | -     
  | Password to use when connecting to the database server.                     
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
-| database-names                                 | List     | No       | -     
  | Database name of the database to monitor.                                   
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
-| database-pattern                               | String   | No       | .*    
  | The database names RegEx of the database to capture, for example: 
`database_prefix.*`.                                                            
                                                                                
                                                                                
                                                                                
                        [...]
-| table-names                                    | List     | Yes      | -     
  | Table name of the database to monitor. The table name needs to include the 
database name, for example: `database_name.table_name`                          
                                                                                
                                                                                
                                                                                
               [...]
-| table-pattern                                  | String   | Yes      | -     
  | The table names RegEx of the database to capture. The table name needs to 
include the database name, for example: `database.*\\.table_.*`                 
                                                                                
                                                                                
                                                                                
                [...]
-| table-names-config                             | List     | No       | -     
  | Table config list. for example: [{"table": 
"db1.schema1.table1","primaryKeys": ["key1"],"snapshotSplitColumn": "key2"}]    
                                                                                
                                                                                
                                                                                
                                               [...]
-| startup.mode                                   | Enum     | No       | 
INITIAL | Optional startup mode for MySQL CDC consumer, valid enumerations are 
`initial`, `earliest`, `latest` and `specific`. <br/> `initial`: Synchronize 
historical data at startup, and then synchronize incremental data.<br/> 
`earliest`: Startup from the earliest offset possible.<br/> `latest`: Startup 
from the latest offset.<br/> `specific`: Startup from user-supplied specific 
offsets.                             [...]
-| startup.specific-offset.file                   | String   | No       | -     
  | Start from the specified binlog file name. **Note, This option is required 
when the `startup.mode` option used `specific`.**                               
                                                                                
                                                                                
                                                                                
               [...]
-| startup.specific-offset.pos                    | Long     | No       | -     
  | Start from the specified binlog file position. **Note, This option is 
required when the `startup.mode` option used `specific`.**                      
                                                                                
                                                                                
                                                                                
                    [...]
-| stop.mode                                      | Enum     | No       | NEVER 
  | Optional stop mode for MySQL CDC consumer, valid enumerations are `never`, 
`latest` or `specific`. <br/> `never`: Real-time job don't stop the 
source.<br/> `latest`: Stop from the latest offset.<br/> `specific`: Stop from 
user-supplied specific offset.                                                  
                                                                                
                            [...]
-| stop.specific-offset.file                      | String   | No       | -     
  | Stop from the specified binlog file name. **Note, This option is required 
when the `stop.mode` option used `specific`.**                                  
                                                                                
                                                                                
                                                                                
                [...]
-| stop.specific-offset.pos                       | Long     | No       | -     
  | Stop from the specified binlog file position. **Note, This option is 
required when the `stop.mode` option used `specific`.**                         
                                                                                
                                                                                
                                                                                
                     [...]
-| snapshot.split.size                            | Integer  | No       | 8096  
  | The split size (number of rows) of table snapshot, captured tables are 
split into multiple splits when read the snapshot of table.                     
                                                                                
                                                                                
                                                                                
                   [...]
-| snapshot.fetch.size                            | Integer  | No       | 1024  
  | The maximum fetch size for per poll when read table snapshot.               
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
-| server-id                                      | String   | No       | -     
  | A numeric ID or a numeric ID range of this database client, The numeric ID 
syntax is like `5400`, the numeric ID range syntax is like '5400-5408'. <br/> 
Every ID must be unique across all currently-running database processes in the 
MySQL cluster. This connector joins the <br/> MySQL cluster as another server 
(with this unique ID) so it can read the binlog. <br/> By default, a random 
number is generated bet [...]
-| server-time-zone                               | String   | No       | UTC   
  | The session time zone in database server. If not set, then 
ZoneId.systemDefault() is used to determine the server time zone.               
                                                                                
                                                                                
                                                                                
                               [...]
-| connect.timeout.ms                             | Duration | No       | 30000 
  | The maximum time that the connector should wait after trying to connect to 
the database server before timing out.                                          
                                                                                
                                                                                
                                                                                
               [...]
-| connect.max-retries                            | Integer  | No       | 3     
  | The max retry times that the connector should retry to build database 
server connection.                                                              
                                                                                
                                                                                
                                                                                
                    [...]
-| connection.pool.size                           | Integer  | No       | 20    
  | The jdbc connection pool size.                                              
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| Name                                           | Type     | Required | 
Default | Description                                                           
                                                                                
                                                                                
                                                                                
                                                                                
                    [...]
+|------------------------------------------------|----------|----------|---------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 [...]
+| base-url                                       | String   | Yes      | -     
  | The URL of the JDBC connection. Refer to a case: 
`jdbc:mysql://localhost:3306/test`.                                             
                                                                                
                                                                                
                                                                                
                                         [...]
+| username                                       | String   | Yes      | -     
  | Name of the database to use when connecting to the database server.         
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| password                                       | String   | Yes      | -     
  | Password to use when connecting to the database server.                     
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| database-names                                 | List     | No       | -     
  | Database name of the database to monitor.                                   
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| database-pattern                               | String   | No       | .*    
  | The database names RegEx of the database to capture, for example: 
`database_prefix.*`.                                                            
                                                                                
                                                                                
                                                                                
                        [...]
+| table-names                                    | List     | Yes      | -     
  | Table name of the database to monitor. The table name needs to include the 
database name, for example: `database_name.table_name`                          
                                                                                
                                                                                
                                                                                
               [...]
+| table-pattern                                  | String   | Yes      | -     
  | The table names RegEx of the database to capture. The table name needs to 
include the database name, for example: `database.*\\.table_.*`                 
                                                                                
                                                                                
                                                                                
                [...]
+| table-names-config                             | List     | No       | -     
  | Table config list. for example: [{"table": 
"db1.schema1.table1","primaryKeys": ["key1"],"snapshotSplitColumn": "key2"}]    
                                                                                
                                                                                
                                                                                
                                               [...]
+| startup.mode                                   | Enum     | No       | 
INITIAL | Optional startup mode for MySQL CDC consumer, valid enumerations are 
`initial`, `earliest`, `latest` and `specific`. <br/> `initial`: Synchronize 
historical data at startup, and then synchronize incremental data.<br/> 
`earliest`: Startup from the earliest offset possible.<br/> `latest`: Startup 
from the latest offset.<br/> `specific`: Startup from user-supplied specific 
offsets.                             [...]
+| startup.specific-offset.file                   | String   | No       | -     
  | Start from the specified binlog file name. **Note, This option is required 
when the `startup.mode` option used `specific`.**                               
                                                                                
                                                                                
                                                                                
               [...]
+| startup.specific-offset.pos                    | Long     | No       | -     
  | Start from the specified binlog file position. **Note, This option is 
required when the `startup.mode` option used `specific`.**                      
                                                                                
                                                                                
                                                                                
                    [...]
+| stop.mode                                      | Enum     | No       | NEVER 
  | Optional stop mode for MySQL CDC consumer, valid enumerations are `never`, 
`latest` or `specific`. <br/> `never`: Real-time job don't stop the 
source.<br/> `latest`: Stop from the latest offset.<br/> `specific`: Stop from 
user-supplied specific offset.                                                  
                                                                                
                            [...]
+| stop.specific-offset.file                      | String   | No       | -     
  | Stop from the specified binlog file name. **Note, This option is required 
when the `stop.mode` option used `specific`.**                                  
                                                                                
                                                                                
                                                                                
                [...]
+| stop.specific-offset.pos                       | Long     | No       | -     
  | Stop from the specified binlog file position. **Note, This option is 
required when the `stop.mode` option used `specific`.**                         
                                                                                
                                                                                
                                                                                
                     [...]
+| snapshot.split.size                            | Integer  | No       | 8096  
  | The split size (number of rows) of table snapshot, captured tables are 
split into multiple splits when read the snapshot of table.                     
                                                                                
                                                                                
                                                                                
                   [...]
+| snapshot.fetch.size                            | Integer  | No       | 1024  
  | The maximum fetch size for per poll when read table snapshot.               
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| server-id                                      | String   | No       | -     
  | A numeric ID or a numeric ID range of this database client, The numeric ID 
syntax is like `5400`, the numeric ID range syntax is like '5400-5408'. <br/> 
Every ID must be unique across all currently-running database processes in the 
MySQL cluster. This connector joins the <br/> MySQL cluster as another server 
(with this unique ID) so it can read the binlog. <br/> By default, a random 
number is generated bet [...]
+| server-time-zone                               | String   | No       | UTC   
  | The session time zone in database server. If not set, then 
ZoneId.systemDefault() is used to determine the server time zone.               
                                                                                
                                                                                
                                                                                
                               [...]
+| connect.timeout.ms                             | Duration | No       | 30000 
  | The maximum time that the connector should wait after trying to connect to 
the database server before timing out.                                          
                                                                                
                                                                                
                                                                                
               [...]
+| connect.max-retries                            | Integer  | No       | 3     
  | The max retry times that the connector should retry to build database 
server connection.                                                              
                                                                                
                                                                                
                                                                                
                    [...]
+| connection.pool.size                           | Integer  | No       | 20    
  | The jdbc connection pool size.                                              
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
 | chunk-key.even-distribution.factor.upper-bound | Double   | No       | 100   
  | The upper bound of the chunk key distribution factor. This factor is used 
to determine whether the table data is evenly distributed. If the distribution 
factor is calculated to be less than or equal to this upper bound (i.e., 
(MAX(id) - MIN(id) + 1) / row count), the table chunks would be optimized for 
even distribution. Otherwise, if the distribution factor is greater, the table 
will be considered as unev [...]
-| chunk-key.even-distribution.factor.lower-bound | Double   | No       | 0.05  
  | The lower bound of the chunk key distribution factor. This factor is used 
to determine whether the table data is evenly distributed. If the distribution 
factor is calculated to be greater than or equal to this lower bound (i.e., 
(MAX(id) - MIN(id) + 1) / row count), the table chunks would be optimized for 
even distribution. Otherwise, if the distribution factor is less, the table 
will be considered as unev [...]
-| sample-sharding.threshold                      | Integer  | No       | 1000  
  | This configuration specifies the threshold of estimated shard count to 
trigger the sample sharding strategy. When the distribution factor is outside 
the bounds specified by `chunk-key.even-distribution.factor.upper-bound` and 
`chunk-key.even-distribution.factor.lower-bound`, and the estimated shard count 
(calculated as approximate row count / chunk size) exceeds this threshold, the 
sample sharding strategy [...]
-| inverse-sampling.rate                          | Integer  | No       | 1000  
  | The inverse of the sampling rate used in the sample sharding strategy. For 
example, if this value is set to 1000, it means a 1/1000 sampling rate is 
applied during the sampling process. This option provides flexibility in 
controlling the granularity of the sampling, thus affecting the final number of 
shards. It's especially useful when dealing with very large datasets where a 
lower sampling rate is preferr [...]
-| exactly_once                                   | Boolean  | No       | false 
  | Enable exactly once semantic.                                               
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
-| format                                         | Enum     | No       | 
DEFAULT | Optional output format for MySQL CDC, valid enumerations are 
`DEFAULT`、`COMPATIBLE_DEBEZIUM_JSON`.                                           
                                                                                
                                                                                
                                                                                
                             [...]
-| schema-changes.enabled                         | Boolean  | No       | false 
  | Schema evolution is disabled by default. Now we only support `add 
column`、`drop column`、`rename column` and `modify column`.                      
                                                                                
                                                                                
                                                                                
                        [...]
-| debezium                                       | Config   | No       | -     
  | Pass-through [Debezium's 
properties](https://github.com/debezium/debezium/blob/v1.9.8.Final/documentation/modules/ROOT/pages/connectors/mysql.adoc#connector-properties)
 to Debezium Embedded Engine which is used to capture data changes from MySQL 
server.                                                                         
                                                                                
    [...]
-| common-options                                 |          | no       | -     
  | Source plugin common parameters, please refer to [Source Common 
Options](../source-common-options.md) for details                               
                                                                                
                                                                                
                                                                                
                          [...]
+| chunk-key.even-distribution.factor.lower-bound | Double   | No       | 0.05  
  | The lower bound of the chunk key distribution factor. This factor is used 
to determine whether the table data is evenly distributed. If the distribution 
factor is calculated to be greater than or equal to this lower bound (i.e., 
(MAX(id) - MIN(id) + 1) / row count), the table chunks would be optimized for 
even distribution. Otherwise, if the distribution factor is less, the table 
will be considered as unev [...]
+| sample-sharding.threshold                      | Integer  | No       | 1000  
  | This configuration specifies the threshold of estimated shard count to 
trigger the sample sharding strategy. When the distribution factor is outside 
the bounds specified by `chunk-key.even-distribution.factor.upper-bound` and 
`chunk-key.even-distribution.factor.lower-bound`, and the estimated shard count 
(calculated as approximate row count / chunk size) exceeds this threshold, the 
sample sharding strategy [...]
+| inverse-sampling.rate                          | Integer  | No       | 1000  
  | The inverse of the sampling rate used in the sample sharding strategy. For 
example, if this value is set to 1000, it means a 1/1000 sampling rate is 
applied during the sampling process. This option provides flexibility in 
controlling the granularity of the sampling, thus affecting the final number of 
shards. It's especially useful when dealing with very large datasets where a 
lower sampling rate is preferr [...]
+| exactly_once                                   | Boolean  | No       | false 
  | Enable exactly once semantic.                                               
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| format                                         | Enum     | No       | 
DEFAULT | Optional output format for MySQL CDC, valid enumerations are 
`DEFAULT`、`COMPATIBLE_DEBEZIUM_JSON`.                                           
                                                                                
                                                                                
                                                                                
                             [...]
+| schema-changes.enabled                         | Boolean  | No       | false 
  | Schema evolution is disabled by default. Now we only support `add 
column`、`drop column`、`rename column` and `modify column`.                      
                                                                                
                                                                                
                                                                                
                        [...]
+| debezium                                       | Config   | No       | -     
  | Pass-through [Debezium's 
properties](https://github.com/debezium/debezium/blob/v1.9.8.Final/documentation/modules/ROOT/pages/connectors/mysql.adoc#connector-properties)
 to Debezium Embedded Engine which is used to capture data changes from MySQL 
server.                                                                         
                                                                                
    [...]
+| common-options                                 |          | no       | -     
  | Source plugin common parameters, please refer to [Source Common 
Options](../source-common-options.md) for details                               
                                                                                
                                                                                
                                                                                
                          [...]
 
 ## Task Example
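
Reader's note (not part of the commit): the binlog prerequisites that this doc change describes can be summarized as a small check. The sketch below is a minimal illustration in Python, assuming the variable values have already been fetched with the `show variables` query shown in the diff. The helper name `check_binlog_settings` and the version tuple argument are hypothetical and do not exist in SeaTunnel. The version rules are read from the comments in the diff: `binlog_row_image` and GTID settings apply from MySQL 5.6, and GTID becomes optional on MySQL 8.0+.

```python
# Hypothetical helper for illustration only; not a SeaTunnel or MySQL API.

# Settings the doc expects on every version shown in the diff.
BASE = {"log_bin": "ON", "binlog_format": "ROW"}
# Per the diff comment: "mysql 5.6+ requires binlog_row_image to be set to FULL".
ROW_IMAGE = {"binlog_row_image": "FULL"}
# Per the diff comment: required on 5.6+, "but not required by mysql 8.0+".
GTID = {"gtid_mode": "ON", "enforce_gtid_consistency": "ON"}


def check_binlog_settings(variables, version):
    """Return a list of (name, expected, actual) tuples for mismatched settings.

    variables: dict of variable name -> value, as returned by `show variables`.
    version:   (major, minor) tuple of the MySQL server, e.g. (8, 0).
    """
    expected = dict(BASE)
    if version >= (5, 6):
        expected.update(ROW_IMAGE)
    if (5, 6) <= version < (8, 0):
        expected.update(GTID)

    problems = []
    for name, want in expected.items():
        # Compare case-insensitively; a missing variable shows up as "".
        got = str(variables.get(name, "")).upper()
        if got != want:
            problems.append((name, want, got))
    return problems


if __name__ == "__main__":
    # Values copied from the "MySQL 8.0+" example output added by this commit.
    mysql8 = {
        "binlog_format": "ROW",
        "binlog_row_image": "FULL",
        "enforce_gtid_consistency": "OFF",
        "gtid_mode": "OFF",
        "log_bin": "ON",
    }
    print(check_binlog_settings(mysql8, (8, 0)))
    print(check_binlog_settings(mysql8, (5, 7)))
```

With the MySQL 8.0+ values from the diff, the first call reports no problems, while the same values checked against a 5.7 server flag the two GTID settings.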
 
