This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 1216a9e5f84 [doc](connector) update flink connector version (#1102)
1216a9e5f84 is described below

commit 1216a9e5f840ee8778553748900b801f8ee16b7a
Author: wudi <[email protected]>
AuthorDate: Thu Sep 12 18:45:48 2024 +0800

    [doc](connector) update flink connector version (#1102)
    
    # Versions
    
    - [ ] dev
    - [ ] 3.0
    - [ ] 2.1
    - [ ] 2.0
    
    # Languages
    
    - [ ] Chinese
    - [ ] English
---
 common_docs_zh/ecosystem/flink-doris-connector.md | 37 +++++++++++++----------
 ecosystem/flink-doris-connector.md                | 15 +++++----
 2 files changed, 28 insertions(+), 24 deletions(-)

diff --git a/common_docs_zh/ecosystem/flink-doris-connector.md b/common_docs_zh/ecosystem/flink-doris-connector.md
index d31fd16df48..ba7314e709e 100644
--- a/common_docs_zh/ecosystem/flink-doris-connector.md
+++ b/common_docs_zh/ecosystem/flink-doris-connector.md
@@ -48,6 +48,7 @@ under the License.
 | 1.4.0             | 1.15,1.16,1.17      | 1.0+   | 8   |- |
 | 1.5.2             | 1.15,1.16,1.17,1.18 | 1.0+ | 8 |- |
 | 1.6.2             | 1.15,1.16,1.17,1.18,1.19 | 1.0+ | 8 |- |
+| 24.0.0            | 1.15,1.16,1.17,1.18,1.19,1.20 | 1.0+ | 8 |- |
 
 ## Usage
 
@@ -60,7 +61,7 @@ under the License.
 <dependency>
   <groupId>org.apache.doris</groupId>
   <artifactId>flink-doris-connector-1.16</artifactId>
-  <version>1.6.2</version>
+  <version>24.0.0</version>
 </dependency>  
 ```
 
@@ -74,7 +75,7 @@ under the License.
 
 To build, run `sh build.sh` directly; see [here](https://github.com/apache/doris-flink-connector/blob/master/README.md) for details.
 
-After a successful build, the target jar is generated in the `dist` directory, e.g. `flink-doris-connector-1.5.0-SNAPSHOT.jar`.
+After a successful build, the target jar is generated in the `dist` directory, e.g. `flink-doris-connector-24.0.0-SNAPSHOT.jar`.
 Copy this file into the `classpath` of `Flink` to use `Flink-Doris-Connector`. For example, for `Flink` running in `Local` mode, put this file into the `lib/` folder; for `Flink` running in `Yarn` cluster mode, put it into the pre-deployment package.
 
 ## How to Use
@@ -246,7 +247,11 @@ DataStream<RowData> source = env.fromElements("")
 source.sinkTo(builder.build());
 ```
 
-**SchemaChange data stream (JsonDebeziumSchemaSerializer)**
+**CDC data stream (JsonDebeziumSchemaSerializer)**
+
+:::info Note
+The upstream data must conform to the Debezium data format.
+:::
 
 ```java
 // enable checkpoint
@@ -274,7 +279,7 @@ builder.setDorisReadOptions(DorisReadOptions.builder().build())
 env.fromSource(mySqlSource, WatermarkStrategy.noWatermarks(), "MySQL Source")
         .sinkTo(builder.build());
 ```
-Reference: [CDCSchemaChangeExample](https://github.com/apache/doris-flink-connector/blob/master/flink-doris-connector/src/test/java/org/apache/doris/flink/CDCSchemaChangeExample.java)
+For the complete code, refer to: [CDCSchemaChangeExample](https://github.com/apache/doris-flink-connector/blob/master/flink-doris-connector/src/test/java/org/apache/doris/flink/CDCSchemaChangeExample.java)
 
 ### Lookup Join
 
@@ -359,8 +364,8 @@ ON a.city = c.city
 | sink.use-cache              | false         | N        | Whether to use the memory cache for recovery on failure; when enabled, data from the Checkpoint period is retained in the cache. |
 | sink.enable.batch-mode      | false         | N        | Whether to write to Doris in batch mode; when enabled, the write timing does not depend on Checkpoint and is controlled by the sink.buffer-flush.max-rows/sink.buffer-flush.max-bytes/sink.buffer-flush.interval parameters.<br />With batch mode enabled, Exactly-once semantics is no longer guaranteed; the Uniq model can be used to achieve idempotence. |
 | sink.flush.queue-size       | 2             | N        | In batch mode, the size of the cache queue. |
-| sink.buffer-flush.max-rows  | 50000         | N        | In batch mode, the maximum number of rows written in a single batch. |
-| sink.buffer-flush.max-bytes | 10MB          | N        | In batch mode, the maximum number of bytes written in a single batch. |
+| sink.buffer-flush.max-rows  | 500000        | N        | In batch mode, the maximum number of rows written in a single batch. |
+| sink.buffer-flush.max-bytes | 100MB         | N        | In batch mode, the maximum number of bytes written in a single batch. |
 | sink.buffer-flush.interval  | 10s           | N        | In batch mode, the interval for asynchronously flushing the cache. |
 | sink.ignore.update-before   | true          | N        | Whether to ignore update-before events; ignored by default. |
 
@@ -514,7 +519,7 @@ insert into doris_sink select id,name,bank,age from cdc_mysql_source;
 ```shell
 <FLINK_HOME>bin/flink run \
     -c org.apache.doris.flink.tools.cdc.CdcTools \
-    lib/flink-doris-connector-1.16-1.4.0-SNAPSHOT.jar \
+    lib/flink-doris-connector-1.16-1.6.1.jar \
     <mysql-sync-database|oracle-sync-database|postgres-sync-database|sqlserver-sync-database|mongodb-sync-database> \
     --database <doris-database-name> \
     [--job-name <flink-job-name>] \
@@ -531,7 +536,6 @@ insert into doris_sink select id,name,bank,age from cdc_mysql_source;
 ```
 
 
-
 | Key                     | Comment |
 |-------------------------|---------|
 | --job-name              | Flink job name, optional |
@@ -557,7 +561,8 @@ insert into doris_sink select id,name,bank,age from cdc_mysql_source;
 | --create-table-only     | Whether to only synchronize the table schema |
 
 :::info Note
-When synchronizing, add the corresponding Flink CDC dependencies under the $FLINK_HOME/lib directory, e.g. flink-sql-connector-mysql-cdc-${version}.jar, flink-sql-connector-oracle-cdc-${version}.jar, flink-sql-connector-mongodb-cdc-${version}.jar
+1. When synchronizing, add the corresponding Flink CDC dependencies under the $FLINK_HOME/lib directory, e.g. flink-sql-connector-mysql-cdc-${version}.jar, flink-sql-connector-oracle-cdc-${version}.jar, flink-sql-connector-mongodb-cdc-${version}.jar
+2. Starting with Connector 24.0.0, the Flink CDC dependency must be version 3.1 or above.
 :::
 
 ### MySQL multi-table synchronization example
@@ -566,7 +571,7 @@ insert into doris_sink select id,name,bank,age from cdc_mysql_source;
     -Dexecution.checkpointing.interval=10s \
     -Dparallelism.default=1 \
     -c org.apache.doris.flink.tools.cdc.CdcTools \
-    lib/flink-doris-connector-1.16-1.4.0-SNAPSHOT.jar \
+    lib/flink-doris-connector-1.16-1.6.1.jar \
     mysql-sync-database \
     --database test_db \
     --mysql-conf hostname=127.0.0.1 \
@@ -590,7 +595,7 @@ insert into doris_sink select id,name,bank,age from cdc_mysql_source;
      -Dexecution.checkpointing.interval=10s \
      -Dparallelism.default=1 \
      -c org.apache.doris.flink.tools.cdc.CdcTools \
-     ./lib/flink-doris-connector-1.16-1.5.0-SNAPSHOT.jar \
+     ./lib/flink-doris-connector-1.16-1.6.1.jar \
      oracle-sync-database \
      --database test_db \
      --oracle-conf hostname=127.0.0.1 \
@@ -615,7 +620,7 @@ insert into doris_sink select id,name,bank,age from cdc_mysql_source;
      -Dexecution.checkpointing.interval=10s \
      -Dparallelism.default=1\
      -c org.apache.doris.flink.tools.cdc.CdcTools \
-     ./lib/flink-doris-connector-1.16-1.5.0-SNAPSHOT.jar \
+     ./lib/flink-doris-connector-1.16-1.6.1.jar \
      postgres-sync-database \
      --database db1\
      --postgres-conf hostname=127.0.0.1 \
@@ -642,7 +647,7 @@ insert into doris_sink select id,name,bank,age from cdc_mysql_source;
      -Dexecution.checkpointing.interval=10s \
      -Dparallelism.default=1 \
      -c org.apache.doris.flink.tools.cdc.CdcTools \
-     ./lib/flink-doris-connector-1.16-1.5.0-SNAPSHOT.jar \
+     ./lib/flink-doris-connector-1.16-1.6.1.jar \
      sqlserver-sync-database \
      --database db1\
      --sqlserver-conf hostname=127.0.0.1 \
@@ -667,7 +672,7 @@ insert into doris_sink select id,name,bank,age from cdc_mysql_source;
     -Dexecution.checkpointing.interval=10s \
     -Dparallelism.default=1 \
     -c org.apache.doris.flink.tools.cdc.CdcTools \
-    lib/flink-doris-connector-1.17-SNAPSHOT.jar \
+    lib/flink-doris-connector-1.16-1.6.1.jar \
     db2-sync-database \
     --database db2_test \
     --db2-conf hostname=127.0.0.1 \
@@ -694,7 +699,7 @@ insert into doris_sink select id,name,bank,age from cdc_mysql_source;
     -Dexecution.checkpointing.interval=10s \
     -Dparallelism.default=1 \
     -c org.apache.doris.flink.tools.cdc.CdcTools \
-    ./lib/flink-doris-connector-1.18-1.6.2-SNAPSHOT.jar \
+    ./lib/flink-doris-connector-1.18-1.6.1.jar \
     mongodb-sync-database \
     --database doris_db \
     --schema-change-mode debezium_structure \
@@ -860,4 +865,4 @@ When Flink imports data, if there is dirty data such as field format or length issues
 
 16. **When using whole-database synchronization of MySQL data to Doris, timestamp values differ from the source data by several hours**
 
-Whole-database synchronization defaults to timezone="UTC+8". If the synchronized data is not in that timezone, you can try setting the corresponding timezone, e.g. `--mysql-conf debezium.date.format.timestamp.zone="UTC+3"`.
+Whole-database synchronization defaults to timezone="UTC+8". If the synchronized data is not in that timezone, you can try setting the corresponding timezone, e.g. `--mysql-conf debezium.date.format.timestamp.zone="UTC+3"`.
\ No newline at end of file
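The timezone FAQ above comes down to a fixed-offset formatting mismatch. A minimal, self-contained Java sketch (the class and method names here are illustrative, not connector code) shows why the same instant rendered under the sync default of UTC+8 versus a UTC+3 source differs by five hours:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Illustrative only: demonstrates the offset mismatch behind the FAQ.
public class TimezoneDemo {
    // Format an instant under a fixed UTC offset, the way a sync job
    // configured with timezone="UTC+8" (the default) vs "UTC+3" would render it.
    static String format(Instant t, int utcOffsetHours) {
        return DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
                .format(t.atOffset(ZoneOffset.ofHours(utcOffsetHours)));
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2024-09-12T10:00:00Z");
        System.out.println(format(t, 8)); // 2024-09-12 18:00:00 (default UTC+8)
        System.out.println(format(t, 3)); // 2024-09-12 13:00:00 (source in UTC+3)
    }
}
```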
diff --git a/ecosystem/flink-doris-connector.md b/ecosystem/flink-doris-connector.md
index 8da3fd1b082..38735250543 100644
--- a/ecosystem/flink-doris-connector.md
+++ b/ecosystem/flink-doris-connector.md
@@ -46,6 +46,7 @@ under the License.
 | 1.4.0             | 1.15,1.16,1.17      | 1.0+   | 8   |- |
 | 1.5.2             | 1.15,1.16,1.17,1.18 | 1.0+ | 8 |- |
 | 1.6.2             | 1.15,1.16,1.17,1.18,1.19 | 1.0+ | 8 | - |
+| 24.0.0            | 1.15,1.16,1.17,1.18,1.19,1.20 | 1.0+ | 8 |- |
 
 ## USE
 
@@ -58,7 +59,7 @@ Add flink-doris-connector
 <dependency>
    <groupId>org.apache.doris</groupId>
    <artifactId>flink-doris-connector-1.16</artifactId>
-   <version>1.6.2</version>
+   <version>24.0.0</version>
 </dependency>
 ```
 
@@ -243,7 +244,7 @@ DataStream<RowData> source = env.fromElements("")
 source.sinkTo(builder.build());
 ```
 
-**SchemaChange data stream (JsonDebeziumSchemaSerializer)**
+**CDC data stream (JsonDebeziumSchemaSerializer)**
 
 ```java
 // enable checkpoint
@@ -351,8 +352,8 @@ ON a.city = c.city
 | sink.use-cache              | false         | N        | In case of an exception, whether to use the memory cache for recovery. When enabled, the data during the Checkpoint period will be retained in the cache. |
 | sink.enable.batch-mode      | false         | N        | Whether to use batch mode to write to Doris. When enabled, the write timing does not depend on Checkpoint and is controlled by the sink.buffer-flush.max-rows/sink.buffer-flush.max-bytes/sink.buffer-flush.interval parameters.<br />With batch mode enabled, Exactly-once semantics is no longer guaranteed; the Uniq model can be used to achieve idempotence. |
 | sink.flush.queue-size       | 2             | N        | In batch mode, the size of the cache queue. |
-| sink.buffer-flush.max-rows  | 50000         | N        | In batch mode, the maximum number of data rows written in a single batch. |
-| sink.buffer-flush.max-bytes | 10MB          | N        | In batch mode, the maximum number of bytes written in a single batch. |
+| sink.buffer-flush.max-rows  | 500000        | N        | In batch mode, the maximum number of data rows written in a single batch. |
+| sink.buffer-flush.max-bytes | 100MB         | N        | In batch mode, the maximum number of bytes written in a single batch. |
 | sink.buffer-flush.interval  | 10s           | N        | In batch mode, the interval for asynchronously refreshing the cache |
 | sink.ignore.update-before   | true          | N        | Whether to ignore the update-before event, ignored by default. |
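The batch-mode options above amount to a flush-when-any-threshold-trips policy with the new defaults (500000 rows, 100MB, 10s). A minimal sketch of that decision logic, assuming nothing beyond the documented option semantics (the class and method names are ours, not the connector's internals):

```java
import java.time.Duration;

// Hypothetical sketch of the batch-mode flush decision described by the
// sink.buffer-flush.* options; the real connector's internals differ.
public class FlushPolicy {
    static final long MAX_ROWS = 500_000;                    // sink.buffer-flush.max-rows
    static final long MAX_BYTES = 100L * 1024 * 1024;        // sink.buffer-flush.max-bytes (100MB)
    static final Duration INTERVAL = Duration.ofSeconds(10); // sink.buffer-flush.interval

    // Flush as soon as any one of the three thresholds is reached.
    static boolean shouldFlush(long bufferedRows, long bufferedBytes, Duration sinceLastFlush) {
        return bufferedRows >= MAX_ROWS
                || bufferedBytes >= MAX_BYTES
                || sinceLastFlush.compareTo(INTERVAL) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(shouldFlush(499_999, 1024, Duration.ofSeconds(1))); // false
        System.out.println(shouldFlush(500_000, 1024, Duration.ofSeconds(1))); // true
        System.out.println(shouldFlush(10, MAX_BYTES, Duration.ofSeconds(1))); // true
        System.out.println(shouldFlush(10, 1024, Duration.ofSeconds(10)));     // true
    }
}
```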
 
@@ -538,11 +539,8 @@ insert into doris_sink select id,name,bank,age from cdc_mysql_source;
 | --oracle-conf           | Oracle CDCSource configuration, e.g. --oracle-conf hostname=127.0.0.1. You can view all Oracle-CDC configurations [here](https://nightlies.apache.org/flink/flink-cdc-docs-release-3.0/docs/connectors/legacy-flink-cdc-sources/oracle-cdc/); hostname/username/password/database-name/schema-name are required. |
 | --postgres-conf         | Postgres CDCSource configuration, e.g. --postgres-conf hostname=127.0.0.1. You can view all Postgres-CDC configurations [here](https://nightlies.apache.org/flink/flink-cdc-docs-release-3.0/docs/connectors/legacy-flink-cdc-sources/postgres-cdc/); hostname/username/password/database-name/schema-name/slot.name are required. |
 | --sqlserver-conf        | SQLServer CDCSource configuration, e.g. --sqlserver-conf hostname=127.0.0.1. You can view all SQLServer-CDC configurations [here](https://nightlies.apache.org/flink/flink-cdc-docs-release-3.0/docs/connectors/legacy-flink-cdc-sources/sqlserver-cdc/); hostname/username/password/database-name/schema-name are required. |
-<<<<<<< HEAD
 | --db2-conf              | DB2 CDCSource configuration, e.g. --db2-conf hostname=127.0.0.1. You can view all DB2-CDC configurations [here](https://nightlies.apache.org/flink/flink-cdc-docs-release-3.1/docs/connectors/flink-sources/db2-cdc/); hostname/username/password/database-name/schema-name are required. |
-=======
 | --mongodb-conf          | MongoDB CDCSource configuration, e.g. --mongodb-conf hosts=127.0.0.1:27017. You can find all Mongo-CDC configurations [here](https://nightlies.apache.org/flink/flink-cdc-docs-release-3.0/docs/connectors/flink-sources/mongodb-cdc/); hosts/username/password/database are required. The --mongodb-conf schema.sample-percent configuration automatically samples MongoDB data for creating a table in Doris, with a default value of 0.2. |
->>>>>>> 59a26707 (Add a guide related to Mongo CDC to the flink-doris-connector documentation.)
 | --sink-conf             | All configurations of the Doris sink; the complete configuration items can be found [here](https://doris.apache.org/zh-CN/docs/dev/ecosystem/flink-doris-connector/#%E9%80%9A%E7%94%A8%E9%85%8D%E7%BD%AE%E9%A1%B9). |
 | --table-conf            | The configuration items of the Doris table (the exception is table-buckets, a non-properties attribute), that is, the content contained in properties. For example `--table-conf replication_num=1`, and the `--table-conf table-buckets="tbl1:10,tbl2:20,a.*:30,b.*:40,.*:50"` option specifies the number of buckets for different tables based on the order of regular expressions. If there is no match, the table is created with the default setting of BUCKETS AUTO. |
 | --ignore-default-value  | Turn off synchronization of default values from the MySQL table schema. Suitable for synchronizing MySQL data to Doris when a field has a default value but the actually inserted data is null. See [here](https://github.com/apache/doris-flink-connector/pull/152). |
@@ -554,7 +552,8 @@ insert into doris_sink select id,name,bank,age from cdc_mysql_source;
 | --create-table-only     | Whether only the table schema should be synchronized |
 
 :::info Note
-When synchronizing, you need to add the corresponding Flink CDC dependencies in the $FLINK_HOME/lib directory, such as flink-sql-connector-mysql-cdc-${version}.jar, flink-sql-connector-oracle-cdc-${version}.jar, flink-sql-connector-mongodb-cdc-${version}.jar
+1. When synchronizing, you need to add the corresponding Flink CDC dependencies in the $FLINK_HOME/lib directory, such as flink-sql-connector-mysql-cdc-${version}.jar, flink-sql-connector-oracle-cdc-${version}.jar, flink-sql-connector-mongodb-cdc-${version}.jar
+2. Starting with Connector 24.0.0, the Flink CDC dependency must be version 3.1 or above.
 :::
 
 ### MySQL synchronization example
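The `--table-conf table-buckets` option documented above matches table names against its patterns in order, with the first match winning and unmatched tables falling back to BUCKETS AUTO. A standalone sketch of that matching rule (the helper class and method names are ours, not the connector's):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Sketch of the ordered-regex matching described for
// --table-conf table-buckets="tbl1:10,tbl2:20,a.*:30,b.*:40,.*:50".
// Illustrative only; the connector's internals differ.
public class TableBuckets {
    static Integer bucketsFor(String table, LinkedHashMap<String, Integer> rules) {
        for (Map.Entry<String, Integer> e : rules.entrySet()) {
            if (Pattern.matches(e.getKey(), table)) {
                return e.getValue(); // first matching pattern wins
            }
        }
        return null; // no match: the table falls back to BUCKETS AUTO
    }

    public static void main(String[] args) {
        LinkedHashMap<String, Integer> rules = new LinkedHashMap<>();
        rules.put("tbl1", 10);
        rules.put("tbl2", 20);
        rules.put("a.*", 30);
        rules.put("b.*", 40);
        rules.put(".*", 50);
        System.out.println(bucketsFor("tbl1", rules));     // 10 (exact rule beats the .* catch-all)
        System.out.println(bucketsFor("a_orders", rules)); // 30
        System.out.println(bucketsFor("zzz", rules));      // 50
    }
}
```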

