This is an automated email from the ASF dual-hosted git repository.
wanghailin pushed a commit to branch dev
in repository https://gitbox.apache.org/repos/asf/seatunnel.git
The following commit(s) were added to refs/heads/dev by this push:
new e51076b71a [Improve][Docs] Add tuning guide for doris sink in streaming mode (#7732)
e51076b71a is described below
commit e51076b71a3afed8e08102d56f5047a1ce3cfa19
Author: dailai <[email protected]>
AuthorDate: Wed Sep 25 21:22:26 2024 +0800
[Improve][Docs] Add tuning guide for doris sink in streaming mode (#7732)
---
docs/en/connector-v2/sink/Doris.md | 7 +++++++
docs/zh/connector-v2/sink/Doris.md | 7 +++++++
2 files changed, 14 insertions(+)
diff --git a/docs/en/connector-v2/sink/Doris.md b/docs/en/connector-v2/sink/Doris.md
index 18915ac7b8..0e89087adf 100644
--- a/docs/en/connector-v2/sink/Doris.md
+++ b/docs/en/connector-v2/sink/Doris.md
@@ -150,6 +150,13 @@ You can use the following placeholders
The supported formats include CSV and JSON
+## Tuning Guide
+
+Appropriately increasing the values of `sink.buffer-size` and `doris.batch.size` can improve write performance. <br>
+In streaming mode, if `doris.batch.size` and `checkpoint.interval` are both set to large values, the last data to arrive may be delayed significantly (the delay can be as long as the checkpoint interval). <br>
+This is because the total amount of data arriving at the end may not exceed the threshold specified by `doris.batch.size`. Until the volume of received data exceeds this threshold, a commit can only be triggered by a checkpoint. You should therefore choose an appropriate `checkpoint.interval`. <br>
+In addition, if you enable 2PC with the property `sink.enable-2pc=true`, `sink.buffer-size` has no effect, so only a checkpoint can trigger a commit.
+
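The interaction between the batch threshold and the checkpoint interval described above can be illustrated with a small simulation. This is a toy model in Python, not SeaTunnel or Doris code: `ToySink`, `simulate`, the integer-second clock, and the row-count threshold are all assumptions made for illustration, and the 2PC branch simply encodes the statement that only a checkpoint can trigger a commit.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToySink:
    """Hypothetical sink that commits on a batch threshold or on a checkpoint."""
    batch_size: int             # stands in for `doris.batch.size` (counted in rows here)
    enable_2pc: bool = False    # stands in for `sink.enable-2pc`
    pending: List[int] = field(default_factory=list)   # arrival times of buffered rows
    commits: List[Tuple[int, str, int]] = field(default_factory=list)  # (time, trigger, rows)
    delays: List[int] = field(default_factory=list)    # per-row wait before commit

    def write(self, now: int) -> None:
        self.pending.append(now)
        # Assumption: with 2PC enabled, the size threshold never commits.
        if not self.enable_2pc and len(self.pending) >= self.batch_size:
            self._commit(now, "batch")

    def checkpoint(self, now: int) -> None:
        if self.pending:
            self._commit(now, "checkpoint")

    def _commit(self, now: int, trigger: str) -> None:
        self.delays.extend(now - arrived for arrived in self.pending)
        self.commits.append((now, trigger, len(self.pending)))
        self.pending.clear()


def simulate(arrivals: List[int], batch_size: int, checkpoint_interval: int,
             enable_2pc: bool = False) -> ToySink:
    """Replay row arrival times (in seconds) against the toy sink."""
    sink = ToySink(batch_size=batch_size, enable_2pc=enable_2pc)
    last = max(arrivals)
    # Run until the first checkpoint at or after the last arrival.
    end = ((last + checkpoint_interval - 1) // checkpoint_interval) * checkpoint_interval
    for now in range(1, end + 1):
        for _ in range(arrivals.count(now)):
            sink.write(now)
        if now % checkpoint_interval == 0:
            sink.checkpoint(now)
    return sink


if __name__ == "__main__":
    rows = [1, 2, 3, 4, 5]  # five rows, one per second
    for interval, two_pc in [(60, False), (10, False), (60, True)]:
        result = simulate(rows, batch_size=3, checkpoint_interval=interval, enable_2pc=two_pc)
        print(interval, two_pc, result.commits, max(result.delays))
```

In this model, with a threshold of 3 rows and a 60-second interval, the first three rows commit at second 3 while the last two wait for the checkpoint at second 60. Shortening the interval to 10 seconds reduces the worst wait from 56 to 6 seconds. With the 2PC flag set, all five rows wait for the checkpoint.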
## Task Example
### Simple:
diff --git a/docs/zh/connector-v2/sink/Doris.md b/docs/zh/connector-v2/sink/Doris.md
index b35cd63f4e..9262af987c 100644
--- a/docs/zh/connector-v2/sink/Doris.md
+++ b/docs/zh/connector-v2/sink/Doris.md
@@ -147,6 +147,13 @@ CREATE TABLE IF NOT EXISTS `${database}`.`${table_name}`
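The supported formats include CSV and JSON.
+## Tuning Guide
+
+Appropriately increasing the values of `sink.buffer-size` and `doris.batch.size` can improve write performance.<br>
+In streaming mode, if `doris.batch.size` and `checkpoint.interval` are both set to large values, the last data to arrive may be delayed significantly (the delay is the length of the checkpoint interval).<br>
+This is because the total amount of data arriving last may not exceed the threshold specified by `doris.batch.size`. Until the amount of received data exceeds this threshold, only a checkpoint triggers the commit. You therefore need to choose an appropriate checkpoint interval.<br>
+In addition, if you enable 2PC with the property `sink.enable-2pc=true`, `sink.buffer-size` no longer has any effect, and only a checkpoint can trigger the commit.
+
## Task Example
### Simple example: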