This is an automated email from the ASF dual-hosted git repository.
yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git
The following commit(s) were added to refs/heads/master by this push:
new c93e5d9e89b [doc](flink-connector) update flink doc and options
(#27875)
c93e5d9e89b is described below
commit c93e5d9e89b71bc7f2cfc4ed84a25786f3e75fe6
Author: wudi <[email protected]>
AuthorDate: Fri Dec 1 17:40:08 2023 +0800
[doc](flink-connector) update flink doc and options (#27875)
---------
Co-authored-by: wudi <>
---
docs/en/docs/ecosystem/flink-doris-connector.md | 80 ++++++++++++---------
docs/zh-CN/docs/ecosystem/flink-doris-connector.md | 83 +++++++++++++---------
2 files changed, 96 insertions(+), 67 deletions(-)
diff --git a/docs/en/docs/ecosystem/flink-doris-connector.md
b/docs/en/docs/ecosystem/flink-doris-connector.md
index ccacc47a55a..0b211748cfe 100644
--- a/docs/en/docs/ecosystem/flink-doris-connector.md
+++ b/docs/en/docs/ecosystem/flink-doris-connector.md
@@ -44,6 +44,7 @@ under the License.
| 1.2.1 | 1.15 | 1.0+ | 8 | - |
| 1.3.0 | 1.16 | 1.0+ | 8 | - |
| 1.4.0 | 1.15,1.16,1.17 | 1.0+ | 8 |- |
+| 1.5.0 | 1.15,1.16,1.17,1.18 | 1.0+ | 8 |- |
## USE
@@ -309,16 +310,18 @@ ON a.city = c.city
### General configuration items
-| Key | Default Value | Required | Comment
|
-|----------------------------------|---------------|----------|---------------------------------------------------------------------------------------------------------------------------------------------------------|
-| fenodes | -- | Y | Doris FE http
address, multiple addresses are supported, separated by commas
|
+| Key | Default Value | Required | Comment
|
+| -------------------------------- | ------------- | -------- |
------------------------------------------------------------ |
+| fenodes | -- | Y | Doris FE http
address, multiple addresses are supported, separated by commas |
| benodes | -- | N | Doris BE http
address, multiple addresses are supported, separated by commas. refer to
[#187](https://github.com/apache/doris-flink-connector/pull/187) |
-| table.identifier | -- | Y | Doris table
name, such as: db.tbl
|
-| username | -- | Y | username to
access Doris
|
-| password | -- | Y | Password to
access Doris
|
-| doris.request.retries | 3 | N | Number of
retries to send requests to Doris
|
-| doris.request.connect.timeout.ms | 30000 | N | Connection
timeout for sending requests to Doris
|
-| doris.request.read.timeout.ms | 30000 | N | Read timeout
for sending requests to Doris
|
+| jdbc-url | -- | N | JDBC
connection information, such as: jdbc:mysql://127.0.0.1:9030 |
+| table.identifier | -- | Y | Doris table
name, such as: db.tbl |
+| username | -- | Y | Username to
access Doris |
+| password | -- | Y | Password to
access Doris |
+| auto-redirect | false | N | Whether to
redirect StreamLoad requests. When enabled, StreamLoad writes go through FE,
and BE information is no longer fetched explicitly. Enabling this parameter
also allows writing to SelectDB Cloud. |
+| doris.request.retries | 3 | N | Number of
retries to send requests to Doris |
+| doris.request.connect.timeout.ms | 30000 | N | Connection
timeout for sending requests to Doris |
+| doris.request.read.timeout.ms | 30000 | N | Read timeout
for sending requests to Doris |
### Source configuration item
@@ -335,21 +338,27 @@ ON a.city = c.city
### Sink configuration items
-| Key | Default Value | Required | Comment
|
-| ------------------ | ------------- | -------- |
------------------------------------------------------------ |
-| sink.label-prefix | -- | Y | The label prefix used by
Stream load import. In the 2pc scenario, global uniqueness is required to
ensure Flink's EOS semantics. |
-| sink.properties.* | -- | N | Import parameters for Stream
Load. <br/>For example: 'sink.properties.column_separator' = ', ' defines
column delimiters, 'sink.properties.escape_delimiters' = 'true' special
characters as delimiters, '\x01' will be converted to binary 0x01
<br/><br/>JSON format import<br/>'sink.properties.format' = 'json'
'sink.properties. read_json_by_line' = 'true'<br/>Detailed parameters refer to
[here](../data-operate/import/import-way/stream-load-ma [...]
-| sink.enable-delete | TRUE | N | Whether to enable delete.
This option requires the Doris table to enable the batch delete function (Doris
0.15+ version is enabled by default), and only supports the Unique model. |
-| sink.enable-2pc | TRUE | N | Whether to enable two-phase
commit (2pc), the default is true, to ensure Exactly-Once semantics. For
two-phase commit, please refer to
[here](../data-operate/import/import-way/stream-load-manual.md). |
-| sink.buffer-size | 1MB | N | The size of the write data
cache buffer, in bytes. It is not recommended to modify, the default
configuration is enough |
-| sink.buffer-count | 3 | N | The number of write data
buffers. It is not recommended to modify, the default configuration is enough |
-| sink.max-retries | 3 | N | Maximum number of retries
after Commit failure, default 3 |
+| Key | Default Value | Required | Comment
|
+| --------------------------- | ------------- | -------- |
------------------------------------------------------------ |
+| sink.label-prefix | -- | Y | The label prefix
used by Stream load import. In the 2pc scenario, global uniqueness is required
to ensure Flink's EOS semantics. |
+| sink.properties.* | -- | N | Import parameters
for Stream Load. <br/>For example: 'sink.properties.column_separator' = ', '
defines the column delimiter; 'sink.properties.escape_delimiters' = 'true'
allows special characters as delimiters, so '\x01' is converted to binary 0x01.
<br/><br/>JSON format import:<br/>'sink.properties.format' = 'json'
'sink.properties.read_json_by_line' = 'true'<br/>For detailed parameters, refer to
[here](../data-operate/import/import-way/strea [...]
+| sink.enable-delete | TRUE | N | Whether to enable
delete. This option requires the Doris table to enable the batch delete
function (Doris 0.15+ version is enabled by default), and only supports the
Unique model. |
+| sink.enable-2pc | TRUE | N | Whether to enable
two-phase commit (2pc), the default is true, to ensure Exactly-Once semantics.
For two-phase commit, please refer to
[here](../data-operate/import/import-way/stream-load-manual.md). |
+| sink.buffer-size | 1MB | N | The size of the
write data cache buffer, in bytes. Modifying it is not recommended; the default
configuration is sufficient. |
+| sink.buffer-count | 3 | N | The number of write
data buffers. Modifying it is not recommended; the default configuration is
sufficient. |
+| sink.max-retries | 3 | N | Maximum number of
retries after Commit failure, default 3 |
+| sink.use-cache | false | N | In case of an
exception, whether to use the memory cache for recovery. When enabled, the data
during the Checkpoint period will be retained in the cache. |
+| sink.enable.batch-mode | false | N | Whether to write to
Doris in batch mode. When enabled, the write timing no longer depends on
Checkpoint and is instead controlled by the
sink.buffer-flush.max-rows/sink.buffer-flush.max-bytes/sink.buffer-flush.interval
parameters. <br />When enabled, Exactly-Once semantics are not guaranteed; the
Unique model can be used to achieve idempotence. |
+| sink.flush.queue-size | 2 | N | In batch mode, the
size of the cache queue. |
+| sink.buffer-flush.max-rows | 50000 | N | In batch mode, the
maximum number of data rows written in a single batch. |
+| sink.buffer-flush.max-bytes | 10MB | N | In batch mode, the
maximum number of bytes written in a single batch. |
+| sink.buffer-flush.interval | 10s | N | In batch mode, the
interval for asynchronously flushing the cache. |
+| sink.ignore.update-before | true | N | Whether to ignore
the update-before event, ignored by default. |
### Lookup Join configuration item
| Key | Default Value | Required | Comment
|
| --------------------------------- | ------------- | -------- |
------------------------------------------------------------ |
-| jdbc-url | -- | Y | jdbc
connection information |
| lookup.cache.max-rows | -1 | N | The maximum
number of rows in the lookup cache, the default value is -1, and the cache is
not enabled |
| lookup.cache.ttl | 10s | N | The maximum
time of lookup cache, the default is 10s |
| lookup.max-retries | 1 | N | The number of
retries after a lookup query fails |
@@ -486,20 +495,25 @@ insert into doris_sink select id,name,bank,age from
cdc_mysql_source;
[--table-conf <doris-table-conf> [--table-conf <doris-table-conf> ...]]
```
-- **--job-name** Flink job name, not required.
-- **--database** Synchronize to the database name of Doris.
-- **--table-prefix** Doris table prefix name, for example --table-prefix ods_.
-- **--table-suffix** Same as above, the suffix name of the Doris table.
-- **--including-tables** MySQL tables that need to be synchronized, you can
use "|" to separate multiple tables, and support regular expressions. For
example --including-tables table1|tbl.* is to synchronize table1 and all tables
beginning with tbl.
-- **--excluding-tables** Tables that do not need to be synchronized, the usage
is the same as above.
-- **--mysql-conf** MySQL CDCSource configuration, eg --mysql-conf
hostname=127.0.0.1 , you can see all configuration MySQL-CDC in
[here](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/mysql-cdc.html),
where hostname/username/password/database-name is required.To synchronize
tables without primary keys, you must configure
`scan.incremental.snapshot.chunk.key-column` the option, and specify only one
non-null field. For example, `scan.incremental.snapshot.chunk.k [...]
-- **--oracle-conf** Oracle CDCSource configuration, for example --oracle-conf
hostname=127.0.0.1, you can view all configurations of Oracle-CDC in
[here](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/oracle-cdc.html),
where hostname/username/password/database-name/schema-name is required.
-- **--postgres-conf** Postgres CDCSource configuration,for example
--postgres-conf hostname=127.0.0.1 ,you can see all configuration of
Postgres-CDC in
[here](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/postgres-cdc.html),where
hostname/username/password/database-name/schema-name/slot.name is required.
-- **--sqlserver-conf** SQLServer CDCSource configuration,for example
--sqlserver-conf hostname=127.0.0.1 ,you can see all configuration of
SQLServer-CDC in
[here](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/sqlserver-cdc.html),where
hostname/username/password/database-name/schema-name is required.
-- **--sink-conf** All configurations of Doris Sink, you can view the complete
configuration items in
[here](https://doris.apache.org/zh-CN/docs/dev/ecosystem/flink-doris-connector/#%E9%80%9A%E7%94%A8%E9%85%8D%E7%BD%AE%E9%A1%B9).
-- **--table-conf** The configuration item of the Doris table, that is, the
content contained in properties. For example --table-conf replication_num=1
-- **--ignore-default-value** Turn off the default for synchronizing mysql
table structures. It is suitable for synchronizing mysql data to doris, the
field has a default value, but the actual inserted data is null. refer
to[#152](https://github.com/apache/doris-flink-connector/pull/152)
-- **--use-new-schema-change** The new schema change supports synchronous mysql
multi-column changes and default values. refer
to[#167](https://github.com/apache/doris-flink-connector/pull/167)
+| Key | Comment
|
+| ----------------------- |
------------------------------------------------------------ |
+| --job-name | Flink task name, optional
|
+| --database | Database name synchronized to Doris
|
+| --table-prefix | Doris table prefix name, such as --table-prefix
ods_. |
+| --table-suffix | Same as above, the suffix name of the Doris table.
|
+| --including-tables | MySQL tables that need to be synchronized. Use
"\|" to separate multiple tables; regular expressions are supported. For
example, --including-tables table1\|tbl.* synchronizes table1 and all tables
beginning with tbl. |
+| --excluding-tables | Tables that do not need to be synchronized; the
usage is the same as above. |
+| --mysql-conf | MySQL CDCSource configuration, for example
--mysql-conf hostname=127.0.0.1. All MySQL-CDC configuration options are listed
[here](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/mysql-cdc.html);
hostname/username/password/database-name are required. When the synchronized
tables include a table without a primary key,
`scan.incremental.snapshot.chunk.key-column` must be set, and only one field of
non-null type can be selecte [...]
+| --oracle-conf | Oracle CDCSource configuration, for example
--oracle-conf hostname=127.0.0.1. All Oracle-CDC configuration options are
listed
[here](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/oracle-cdc.html);
hostname/username/password/database-name/schema-name are required. |
+| --postgres-conf | Postgres CDCSource configuration, for example
--postgres-conf hostname=127.0.0.1. All Postgres-CDC configuration options are
listed
[here](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/postgres-cdc.html);
hostname/username/password/database-name/schema-name/slot.name are required. |
+| --sqlserver-conf | SQLServer CDCSource configuration, for example
--sqlserver-conf hostname=127.0.0.1. All SQLServer-CDC configuration options
are listed
[here](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/sqlserver-cdc.html);
hostname/username/password/database-name/schema-name are required. |
+| --sink-conf | All configurations of Doris Sink. The complete list
of configuration items is available
[here](https://doris.apache.org/zh-CN/docs/dev/ecosystem/flink-doris-connector/#%E9%80%9A%E7%94%A8%E9%85%8D%E7%BD%AE%E9%A1%B9).
|
+| --table-conf | The configuration items of the Doris table, that
is, the content contained in properties. For example --table-conf
replication_num=1 |
+| --ignore-default-value | Disables synchronization of default values from the
MySQL table structure. Suitable when synchronizing MySQL data to Doris where a
field has a default value but the actual inserted data is null. See
[here](https://github.com/apache/doris-flink-connector/pull/152) |
+| --use-new-schema-change | Whether to use the new schema change, which supports
synchronizing MySQL multi-column changes and default values. See
[here](https://github.com/apache/doris-flink-connector/pull/167) |
+| --single-sink | Whether to use a single Sink to synchronize all
tables. When enabled, tables newly created upstream are also recognized
automatically and the tables are created automatically. |
+| --multi-to-one-origin | The source table configuration when writing multiple
upstream tables into the same table, for example:
--multi-to-one-origin="a\_.\*\|b_.\*". See
[here](https://github.com/apache/doris-flink-connector/pull/208) |
+| --multi-to-one-target | Used together with multi-to-one-origin; the
configuration of the target table, for example: --multi-to-one-target="a\|b" |
>Note: When synchronizing, you need to add the corresponding Flink CDC
>dependencies in the $FLINK_HOME/lib directory, such as
>flink-sql-connector-mysql-cdc-${version}.jar,
>flink-sql-connector-oracle-cdc-${version}.jar
diff --git a/docs/zh-CN/docs/ecosystem/flink-doris-connector.md
b/docs/zh-CN/docs/ecosystem/flink-doris-connector.md
index b1d5338fda9..6222e4e8281 100644
--- a/docs/zh-CN/docs/ecosystem/flink-doris-connector.md
+++ b/docs/zh-CN/docs/ecosystem/flink-doris-connector.md
@@ -46,6 +46,7 @@ under the License.
| 1.2.1 | 1.15 | 1.0+ | 8 | - |
| 1.3.0 | 1.16 | 1.0+ | 8 | - |
| 1.4.0 | 1.15,1.16,1.17 | 1.0+ | 8 |- |
+| 1.5.0 | 1.15,1.16,1.17,1.18 | 1.0+ | 8 |- |
## 使用
@@ -312,16 +313,18 @@ ON a.city = c.city
### 通用配置项
-| Key | Default Value | Required | Comment
|
-|----------------------------------|---------------|----------|----------------------------------------------------------------------------------------------------|
-| fenodes | -- | Y | Doris FE http
地址, 支持多个地址,使用逗号分隔
|
+| Key | Default Value | Required | Comment
|
+| -------------------------------- | ------------- | -------- |
------------------------------------------------------------ |
+| fenodes | -- | Y | Doris FE http
地址, 支持多个地址,使用逗号分隔 |
| benodes | -- | N | Doris BE http
地址,
支持多个地址,使用逗号分隔,参考[#187](https://github.com/apache/doris-flink-connector/pull/187)
|
-| table.identifier | -- | Y | Doris
表名,如:db.tbl
|
-| username | -- | Y | 访问 Doris 的用户名
|
-| password | -- | Y | 访问 Doris 的密码
|
-| doris.request.retries | 3 | N | 向 Doris
发送请求的重试次数
|
-| doris.request.connect.timeout.ms | 30000 | N | 向 Doris
发送请求的连接超时时间
|
-| doris.request.read.timeout.ms | 30000 | N | 向 Doris
发送请求的读取超时时间
|
+| jdbc-url | -- | N | jdbc连接信息,如:
jdbc:mysql://127.0.0.1:9030 |
+| table.identifier | -- | Y | Doris
表名,如:db.tbl |
+| username | -- | Y | 访问 Doris 的用户名
|
+| password | -- | Y | 访问 Doris 的密码
|
+| auto-redirect | false | N |
是否重定向StreamLoad请求。开启后StreamLoad将通过FE写入,不再显式获取BE信息,同时也可通过开启该参数写入SelectDB Cloud |
+| doris.request.retries | 3 | N | 向 Doris
发送请求的重试次数 |
+| doris.request.connect.timeout.ms | 30000 | N | 向 Doris
发送请求的连接超时时间 |
+| doris.request.read.timeout.ms | 30000 | N | 向 Doris
发送请求的读取超时时间 |
### Source 配置项
@@ -338,21 +341,27 @@ ON a.city = c.city
### Sink 配置项
-| Key | Default Value | Required | Comment
|
-| ------------------ | ------------- | -------- |
------------------------------------------------------------ |
-| sink.label-prefix | -- | Y | Stream
load导入使用的label前缀。2pc场景下要求全局唯一 ,用来保证Flink的EOS语义。 |
-| sink.properties.* | -- | N | Stream Load 的导入参数。<br/>例如:
'sink.properties.column_separator' = ', ' 定义列分隔符,
'sink.properties.escape_delimiters' = 'true' 特殊字符作为分隔符,'\x01'会被转换为二进制的0x01
<br/><br/>JSON格式导入<br/>'sink.properties.format' = 'json'
'sink.properties.read_json_by_line' =
'true'<br/>详细参数参考[这里](../data-operate/import/import-way/stream-load-manual.md)。
|
-| sink.enable-delete | TRUE | N | 是否启用删除。此选项需要 Doris
表开启批量删除功能(Doris0.15+版本默认开启),只支持 Unique 模型。 |
-| sink.enable-2pc | TRUE | N |
是否开启两阶段提交(2pc),默认为true,保证Exactly-Once语义。关于两阶段提交可参考[这里](../data-operate/import/import-way/stream-load-manual.md)。
|
-| sink.buffer-size | 1MB | N |
写数据缓存buffer大小,单位字节。不建议修改,默认配置即可 |
-| sink.buffer-count | 3 | N | 写数据缓存buffer个数。不建议修改,默认配置即可
|
-| sink.max-retries | 3 | N | Commit失败后的最大重试次数,默认3次
|
+| Key | Default Value | Required | Comment
|
+| --------------------------- | ------------- | -------- |
------------------------------------------------------------ |
+| sink.label-prefix | -- | Y | Stream
load导入使用的label前缀。2pc场景下要求全局唯一 ,用来保证Flink的EOS语义。 |
+| sink.properties.* | -- | N | Stream Load
的导入参数。<br/>例如: 'sink.properties.column_separator' = ', ' 定义列分隔符,
'sink.properties.escape_delimiters' = 'true' 特殊字符作为分隔符,'\x01'会被转换为二进制的0x01
<br/><br/>JSON格式导入<br/>'sink.properties.format' = 'json'
'sink.properties.read_json_by_line' =
'true'<br/>详细参数参考[这里](../data-operate/import/import-way/stream-load-manual.md)。
|
+| sink.enable-delete | TRUE | N | 是否启用删除。此选项需要 Doris
表开启批量删除功能(Doris0.15+版本默认开启),只支持 Unique 模型。 |
+| sink.enable-2pc | TRUE | N |
是否开启两阶段提交(2pc),默认为true,保证Exactly-Once语义。关于两阶段提交可参考[这里](../data-operate/import/import-way/stream-load-manual.md)。
|
+| sink.buffer-size | 1MB | N |
写数据缓存buffer大小,单位字节。不建议修改,默认配置即可 |
+| sink.buffer-count | 3 | N |
写数据缓存buffer个数。不建议修改,默认配置即可 |
+| sink.max-retries | 3 | N |
Commit失败后的最大重试次数,默认3次 |
+| sink.use-cache | false | N |
异常时,是否使用内存缓存进行恢复,开启后缓存中会保留Checkpoint期间的数据 |
+| sink.enable.batch-mode | false | N |
是否使用攒批模式写入Doris,开启后写入时机不依赖Checkpoint,通过sink.buffer-flush.max-rows/sink.buffer-flush.max-bytes/sink.buffer-flush.interval
参数来控制写入时机。<br />同时开启后将不保证Exactly-once语义,可借助Uniq模型做到幂等 |
+| sink.flush.queue-size | 2 | N | 攒批模式下,缓存的队列大小。
|
+| sink.buffer-flush.max-rows | 50000 | N |
攒批模式下,单个批次最多写入的数据行数。 |
+| sink.buffer-flush.max-bytes | 10MB | N | 攒批模式下,单个批次最多写入的字节数。
|
+| sink.buffer-flush.interval | 10s | N | 攒批模式下,异步刷新缓存的间隔
|
+| sink.ignore.update-before | true | N |
是否忽略update-before事件,默认忽略。 |
### Lookup Join 配置项
| Key | Default Value | Required | Comment
|
| --------------------------------- | ------------- | -------- |
------------------------------------------ |
-| jdbc-url | -- | Y | jdbc连接信息
|
| lookup.cache.max-rows | -1 | N |
lookup缓存的最大行数,默认值-1,不开启缓存 |
| lookup.cache.ttl | 10s | N |
lookup缓存的最大时间,默认10s |
| lookup.max-retries | 1 | N |
lookup查询失败后的重试次数 |
@@ -487,21 +496,27 @@ insert into doris_sink select id,name,bank,age from
cdc_mysql_source;
[--table-conf <doris-table-conf> [--table-conf <doris-table-conf> ...]]
```
-- **--job-name** Flink任务名称, 非必需。
-- **--database** 同步到Doris的数据库名。
-- **--table-prefix** Doris表前缀名,例如 --table-prefix ods_。
-- **--table-suffix** 同上,Doris表的后缀名。
-- **--including-tables** 需要同步的MySQL表,可以使用"|" 分隔多个表,并支持正则表达式。
比如--including-tables table1|tbl.*就是同步table1和所有以tbl开头的表。
-- **--excluding-tables** 不需要同步的表,用法同上。
-- **--mysql-conf** MySQL CDCSource 配置,例如--mysql-conf hostname=127.0.0.1
,您可以在[这里](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/mysql-cdc.html)查看所有配置MySQL-CDC,其中hostname/username/password/database-name
是必需的。同步的库表中含有非主键表时,必须设置
`scan.incremental.snapshot.chunk.key-column`,且只能选择非空类型的一个字段。
-例如:`scan.incremental.snapshot.chunk.key-column=database.table:column,database.table1:column...`,不同的库表列之间用`,`隔开。
-- **--oracle-conf** Oracle CDCSource 配置,例如--oracle-conf
hostname=127.0.0.1,您可以在[这里](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/oracle-cdc.html)查看所有配置Oracle-CDC,其中hostname/username/password/database-name/schema-name
是必需的。
-- **--postgres-conf** Postgres CDCSource 配置,例如--postgres-conf
hostname=127.0.0.1
,您可以在[这里](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/postgres-cdc.html)查看所有配置Postgres-CDC,其中hostname/username/password/database-name/schema-name/slot.name
是必需的。
-- **--sqlserver-conf** SQLServer CDCSource 配置,例如--sqlserver-conf
hostname=127.0.0.1
,您可以在[这里](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/sqlserver-cdc.html)查看所有配置SQLServer-CDC,其中hostname/username/password/database-name/schema-name
是必需的。
-- **--sink-conf** Doris Sink
的所有配置,可以在[这里](https://doris.apache.org/zh-CN/docs/dev/ecosystem/flink-doris-connector/#%E9%80%9A%E7%94%A8%E9%85%8D%E7%BD%AE%E9%A1%B9)查看完整的配置项。
-- **--table-conf** Doris表的配置项,即properties中包含的内容。 例如 --table-conf
replication_num=1
-- **--ignore-default-value**
关闭同步mysql表结构的默认值。适用于同步mysql数据到doris时,字段有默认值,但实际插入数据为null情况。参考[#152](https://github.com/apache/doris-flink-connector/pull/152)
-- **--use-new-schema-change** 新的schema
change支持同步mysql多列变更、默认值。参考[#167](https://github.com/apache/doris-flink-connector/pull/167)
+
+
+| Key | Comment
|
+| ----------------------- |
------------------------------------------------------------ |
+| --job-name | Flink任务名称, 非必需
|
+| --database | 同步到Doris的数据库名
|
+| --table-prefix | Doris表前缀名,例如 --table-prefix ods_。
|
+| --table-suffix | 同上,Doris表的后缀名。
|
+| --including-tables | 需要同步的MySQL表,可以使用"\|" 分隔多个表,并支持正则表达式。
比如--including-tables table1\|tbl.*就是同步table1和所有以tbl开头的表。 |
+| --excluding-tables | 不需要同步的表,用法同上。 |
+| --mysql-conf | MySQL CDCSource 配置,例如--mysql-conf
hostname=127.0.0.1
,您可以在[这里](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/mysql-cdc.html)查看所有配置MySQL-CDC,其中hostname/username/password/database-name
是必需的。同步的库表中含有非主键表时,必须设置
`scan.incremental.snapshot.chunk.key-column`,且只能选择非空类型的一个字段。<br/>例如:`scan.incremental.snapshot.chunk.key-column=database.table:column,database.table1:column...`,不同的库表列之间用`,`隔开。
|
+| --oracle-conf | Oracle CDCSource 配置,例如--oracle-conf
hostname=127.0.0.1,您可以在[这里](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/oracle-cdc.html)查看所有配置Oracle-CDC,其中hostname/username/password/database-name/schema-name
是必需的。 |
+| --postgres-conf | Postgres CDCSource 配置,例如--postgres-conf
hostname=127.0.0.1
,您可以在[这里](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/postgres-cdc.html)查看所有配置Postgres-CDC,其中hostname/username/password/database-name/schema-name/slot.name
是必需的。 |
+| --sqlserver-conf | SQLServer CDCSource 配置,例如--sqlserver-conf
hostname=127.0.0.1
,您可以在[这里](https://ververica.github.io/flink-cdc-connectors/master/content/connectors/sqlserver-cdc.html)查看所有配置SQLServer-CDC,其中hostname/username/password/database-name/schema-name
是必需的。 |
+| --sink-conf | Doris Sink
的所有配置,可以在[这里](https://doris.apache.org/zh-CN/docs/dev/ecosystem/flink-doris-connector/#%E9%80%9A%E7%94%A8%E9%85%8D%E7%BD%AE%E9%A1%B9)查看完整的配置项。
|
+| --table-conf | Doris表的配置项,即properties中包含的内容。 例如 --table-conf
replication_num=1 |
+| --ignore-default-value |
关闭同步mysql表结构的默认值。适用于同步mysql数据到doris时,字段有默认值,但实际插入数据为null情况。参考[#152](https://github.com/apache/doris-flink-connector/pull/152)
|
+| --use-new-schema-change | 是否使用新的schema
change,支持同步mysql多列变更、默认值。参考[#167](https://github.com/apache/doris-flink-connector/pull/167)
|
+| --single-sink | 是否使用单个Sink同步所有表,开启后也可自动识别上游新创建的表,自动创建表。 |
+| --multi-to-one-origin |
将上游多张表写入同一张表时,源表的配置,比如:--multi-to-one-origin="a\_.\*\|b_.\*",
具体参考[这里](https://github.com/apache/doris-flink-connector/pull/208) |
+| --multi-to-one-target |
与multi-to-one-origin搭配使用,目标表的配置,比如:--multi-to-one-target="a\|b" |
>注:同步时需要在$FLINK_HOME/lib 目录下添加对应的Flink CDC依赖,比如
>flink-sql-connector-mysql-cdc-${version}.jar,flink-sql-connector-oracle-cdc-${version}.jar
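
As a minimal sketch of how the batch-mode sink options documented in this
change might be combined in a Flink SQL table definition. Assumptions not
taken from the commit: the connector identifier `doris`, an FE HTTP address of
127.0.0.1:8030, the `root` user with an empty password, the label prefix value,
and the column types of `doris_sink` (only the column names id, name, bank and
age appear in the docs). The flush values shown are simply the documented
defaults written out explicitly.

```sql
-- Hypothetical sink table; column types are assumed for illustration.
CREATE TABLE doris_sink (
    id   INT,
    name STRING,
    bank STRING,
    age  INT
) WITH (
    'connector' = 'doris',                    -- assumed connector identifier
    'fenodes' = '127.0.0.1:8030',             -- assumed FE HTTP address
    'table.identifier' = 'db.tbl',
    'username' = 'root',                      -- assumed credentials
    'password' = '',
    'sink.label-prefix' = 'doris_label',      -- assumed label prefix value
    -- Batch mode: write timing is driven by the flush thresholds below
    -- rather than by Checkpoint; Exactly-Once is not guaranteed.
    'sink.enable.batch-mode' = 'true',
    'sink.buffer-flush.max-rows' = '50000',
    'sink.buffer-flush.max-bytes' = '10MB',
    'sink.buffer-flush.interval' = '10s'
);
```

Because batch mode gives up Exactly-Once semantics, a sketch like this fits
best when the target Doris table uses the Unique model, so that replayed rows
overwrite earlier ones.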