This is an automated email from the ASF dual-hosted git repository.
dataroaring pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new c1f3235192 [Opt](load) add load availability doc and fix some errors
(#1125)
c1f3235192 is described below
commit c1f3235192389c076fde02bdb0433a2c0a6495ca
Author: Xin Liao <[email protected]>
AuthorDate: Fri Sep 20 16:11:37 2024 +0800
[Opt](load) add load availability doc and fix some errors (#1125)
# Versions
- [x] dev
- [ ] 3.0
- [ ] 2.1
- [ ] 2.0
# Languages
- [x] Chinese
- [x] English
---
.../import/import-way/group-commit-manual.md | 2 +-
.../import/import-way/stream-load-manual.md | 14 +--
docs/data-operate/import/load-high-availability.md | 123 +++++++++++++++++++++
docs/data-operate/import/min-load-replica-num.md | 100 -----------------
.../import/import-way/group-commit-manual.md | 6 +-
.../import/import-way/mysql-load-manual.md | 2 +-
.../import/import-way/stream-load-manual.md | 14 +--
.../data-operate/import/load-data-convert.md | 2 +-
.../data-operate/import/load-high-availability.md | 122 ++++++++++++++++++++
.../data-operate/import/min-load-replica-num.md | 102 -----------------
sidebars.json | 2 +-
11 files changed, 266 insertions(+), 223 deletions(-)
diff --git a/docs/data-operate/import/import-way/group-commit-manual.md
b/docs/data-operate/import/import-way/group-commit-manual.md
index 53718d83fd..c879eaba98 100644
--- a/docs/data-operate/import/import-way/group-commit-manual.md
+++ b/docs/data-operate/import/import-way/group-commit-manual.md
@@ -275,7 +275,7 @@ curl --location-trusted -u {user}:{passwd} -T data.csv -H
"group_commit:sync_mod
# The returned label starts with 'group_commit' and is the label of the real load job
```
-See [Stream Load](stream-load-manual.md) for more detailed syntax used by
**Stream Load**.
+See [Stream Load](./stream-load-manual.md) for more detailed syntax used by
**Stream Load**.
### Http Stream
diff --git a/docs/data-operate/import/import-way/stream-load-manual.md
b/docs/data-operate/import/import-way/stream-load-manual.md
index 2ee45352bd..a5fc94fcb3 100644
--- a/docs/data-operate/import/import-way/stream-load-manual.md
+++ b/docs/data-operate/import/import-way/stream-load-manual.md
@@ -664,7 +664,7 @@ mysql> DESC table1;
The original table data is:
-```SQL
+```sql
+-------+--------+------+
| name | gender | age |
+-------+--------+------+
@@ -678,13 +678,13 @@ The original table data is:
loading data as:
-```SQL
+```sql
li,male,10
```
Since `function_column.sequence_col` is specified as `age`, and the `age`
value is larger than or equal to the existing column in the table, the original
table data is deleted. The table data becomes:
-```SQL
+```sql
+-------+--------+------+
| name | gender | age |
+-------+--------+------+
@@ -697,7 +697,7 @@ Since `function_column.sequence_col` is specified as `age`,
and the `age` value
loading data as:
-```SQL
+```sql
li,male,9
```
@@ -935,7 +935,7 @@ curl --location-trusted -u <doris_user>:<doris_password> \
When the imported data contains a map type, as in the following example:
-```SQL
+```sql
[
{"user_id":1,"namemap":{"Emily":101,"age":25}},
{"user_id":2,"namemap":{"Benjamin":102,"age":35}},
@@ -978,7 +978,7 @@ During the import process, when encountering Bitmap type
data, you can use to_bi
For example, with the following data:
-```SQL
+```sql
1|koga|17723
2|nijg|146285
3|lojn|347890
@@ -1017,7 +1017,7 @@ curl --location-trusted -u <doris_user>:<doris_password> \
You can use the hll_hash function to convert data into the hll type, as in the
following example:
-```SQL
+```sql
1001|koga
1002|nijg
1003|lojn
diff --git a/docs/data-operate/import/load-high-availability.md
b/docs/data-operate/import/load-high-availability.md
new file mode 100644
index 0000000000..a689f3d464
--- /dev/null
+++ b/docs/data-operate/import/load-high-availability.md
@@ -0,0 +1,123 @@
+---
+{
+ "title": "Load High Availability",
+ "language": "en-US"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Load High Availability
+
+## Overview
+
+Doris provides several mechanisms to keep data import highly available. This article describes Doris's default import behavior and the additional options for improving import availability, in particular the minimum write replica number feature.
+
+## Majority Write
+
+By default, Doris adopts a majority write strategy to ensure data reliability
and consistency:
+
+- An import is considered successful when the number of successfully written
replicas exceeds half of the total number of replicas.
+- For example, for a table with three replicas, at least two replicas must be
successfully written for the import to be considered successful.
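As a rough illustration of the majority rule above (not Doris source code), the sketch below computes how many replicas must be written and how many may fail for a given replica count. The function names are hypothetical, and using integer division to express "more than half" is an assumption.

```python
def majority_quorum(replication_num: int) -> int:
    """Smallest replica count that exceeds half of replication_num."""
    if replication_num < 1:
        raise ValueError("replication_num must be at least 1")
    # Integer division: 3 // 2 + 1 == 2 and 2 // 2 + 1 == 2.
    return replication_num // 2 + 1


def tolerated_failures(replication_num: int) -> int:
    """Replicas that may fail to write while the import still succeeds."""
    return replication_num - majority_quorum(replication_num)


if __name__ == "__main__":
    # Print the quorum and failure tolerance for 1 to 5 replicas.
    for n in range(1, 6):
        print(n, majority_quorum(n), tolerated_failures(n))
```

Under this rule a two-replica table tolerates no failed replica, which is the availability gap the minimum write replica number option addresses.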
+
+### How It Works
+
+1. Data Distribution: The import task first distributes data to all relevant
BE nodes.
+
+2. Parallel Writing: Each BE node processes data writing operations in
parallel.
+
+3. Write Confirmation: After completing the data write, each BE node sends a
confirmation to the FE.
+
+4. Majority Judgment: The FE counts the number of successfully written
replicas, and considers the import successful when a majority is reached.
+
+5. Transaction Commit: The FE commits the import transaction, making the data
visible externally.
+
+6. Asynchronous Replication: For replicas that were not successfully written,
the system will asynchronously replicate data in the background to ensure
eventual consistency across all replicas.
+
+The majority write strategy is Doris's balance between data reliability and
system availability. For scenarios with special requirements, Doris provides
other options such as the minimum write replica number to further enhance
system flexibility.
+
+## Minimum Write Replica Number
+
+While the majority write strategy ensures data reliability, it can reduce system availability in some scenarios. For example, for a table with two replicas, both replicas must be written successfully to complete the import, so no replica may be unavailable during the import process.
+
+To address this issue and improve import availability, Doris provides the minimum write replica number (`min_load_replica_num`) option.
+
+### Feature Description
+
+The minimum write replica number allows users to specify the minimum number of
replicas that need to be successfully written during data import. The import is
considered successful when the number of successfully written replicas is
greater than or equal to this value.
+
+### Use Cases
+
+- When some nodes are unavailable, but data import still needs to be
guaranteed.
+
+- When import speed matters and users are willing to trade some reliability for higher availability.
+
+### Configuration Methods
+
+#### 1. Single Table Configuration
+
+a. Set when creating a table:
+
+```sql
+CREATE TABLE example_table
+(
+id INT,
+name STRING
+)
+DUPLICATE KEY(id)
+DISTRIBUTED BY HASH(id) BUCKETS 10
+PROPERTIES
+(
+'replication_num' = '3',
+'min_load_replica_num' = '2'
+);
+```
+
+b. Modify an existing table:
+
+```sql
+ALTER TABLE example_table
+SET ( 'min_load_replica_num' = '2' );
+```
+
+#### 2. Global Configuration
+
+Set through the FE configuration item `min_load_replica_num`.
+
+- Valid values: greater than 0
+
+- Default value: -1 (indicating that the global minimum write replica number
is not enabled)
+
+Priority: Table property > Global configuration > Default majority rule
+
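+If the table property is not set or is invalid and the global configuration is valid, the minimum write replica number for the table is:
+`min(FE-configured min_load_replica_num, table's replica number / 2 + 1)`
+
+If neither the table property nor the global configuration is valid, the import still requires a majority of replicas, that is, `table's replica number / 2 + 1`.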
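The priority rule and formula above can be sketched as follows. This is an illustrative sketch, not Doris's implementation: the function names are hypothetical, `replica number/2` is assumed to be integer division, and the check that a table property is valid only when it is greater than 0 and no larger than the replica number is taken from the `min-load-replica-num.md` page that this commit removes.

```python
def majority(replication_num: int) -> int:
    # Default majority rule: more than half of the replicas.
    return replication_num // 2 + 1


def effective_min_load_replica_num(
    replication_num: int,
    table_prop: int = -1,  # table property min_load_replica_num, -1 = not enabled
    fe_conf: int = -1,     # FE config min_load_replica_num, -1 = not enabled
) -> int:
    """Resolve: table property > global configuration > majority rule."""
    # 1. A valid table property wins and ignores the global configuration.
    if 0 < table_prop <= replication_num:
        return table_prop
    # 2. A valid global configuration is capped by the majority count.
    if fe_conf > 0:
        return min(fe_conf, majority(replication_num))
    # 3. Otherwise fall back to the default majority rule.
    return majority(replication_num)


def is_load_successful(written_replicas: int, min_required: int) -> bool:
    # The import succeeds once enough replicas have been written.
    return written_replicas >= min_required


if __name__ == "__main__":
    required = effective_min_load_replica_num(2, table_prop=1)
    print(required, is_load_successful(1, required))
```

With two replicas and `min_load_replica_num` set to 1 on the table, a load succeeds with a single written replica, whereas the default majority rule would require both.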
+
+For viewing and modifying FE configuration items, please refer to the [FE
Configuration Document](../../admin-manual/config/fe-config.md).
+
+## Other High Availability Mechanisms
+
+In addition to the minimum write replica number option, Doris also adopts the
following mechanisms to improve import availability:
+
+1. Import Retry: Automatically retries import tasks that fail because of temporary faults.
+
+2. Load Balancing: Distributes import tasks across different BE nodes to avoid overloading a single node.
+
+3. Transaction Mechanism: Ensures data consistency and automatically rolls back on failure.
+
diff --git a/docs/data-operate/import/min-load-replica-num.md
b/docs/data-operate/import/min-load-replica-num.md
deleted file mode 100644
index ca995ac5cf..0000000000
--- a/docs/data-operate/import/min-load-replica-num.md
+++ /dev/null
@@ -1,100 +0,0 @@
----
-{
- "title": "Minimum Number of Replicas Loading",
- "language": "en"
-}
----
-
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements. See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership. The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied. See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-
-Importing data requires more than half of the replicas to be written
successfully. However, it is not flexible enough and may cause inconvenience in
some scenarios.
-
-For example, in the case of two replicas, to import data, both replicas need
to be written successfully. This means that no replica is allowed to be
unavailable during the data import process. This greatly affects the
availability of the cluster.
-
-In order to solve the above problems, Doris allows users to set the minimum
number of write replicas. For the task of importing data, when the number of
replicas it successfully writes is greater than or equal to the minimum number
of replicas written, the import is successful.
-
-## Usage
-
-### Min load replica num for single table
-
-You can set the table property `min_load_replica_num` for a single olap table.
The valid value of this property must be greater than 0 and not exceed
`replication_num`(the number of replicas of the table). Its default value is
-1, indicating that the property is not enabled.
-
-The `min_load_replica_num` of the table can be set when creating the table.
-
-```sql
-CREATE TABLE test_table1
-(
- k1 INT,
- k2 INT
-)
-DUPLICATE KEY(k1)
-DISTRIBUTED BY HASH(k1) BUCKETS 5
-PROPERTIES
-(
- 'replication_num' = '2',
- 'min_load_replica_num' = '1'
-);
-```
-
-For an existing table, you can use `ALTER TABLE` to modify its
`min_load_replica_num`.
-
-```sql
-ALTER TABLE test_table1
-SET ( 'min_load_replica_num' = '1');
-```
-
-You can use `SHOW CREATE TABLE` to view the table property
`min_load_replica_num`.
-
-```SQL
-SHOW CREATE TABLE test_table1;
-```
-
-The PROPERTIES of the output will contain `min_load_replica_num`. e.g.
-
-```text
-Create Table: CREATE TABLE `test_table1` (
- `k1` int(11) NULL,
- `k2` int(11) NULL
-) ENGINE=OLAP
-DUPLICATE KEY(`k1`)
-COMMENT 'OLAP'
-DISTRIBUTED BY HASH(`k1`) BUCKETS 5
-PROPERTIES (
-"replication_allocation" = "tag.location.default: 2",
-"min_load_replica_num" = "1",
-"storage_format" = "V2",
-"light_schema_change" = "true",
-"disable_auto_compaction" = "false",
-"enable_single_replica_compaction" = "false"
-);
-```
-
-### Global min load replica num for all tables
-
-You can set FE configuration item `min_load_replica_num` for all olap tables.
The valid value of this configuration item must be greater than 0. Its default
value is -1, which means that the global minimum number of load replicas is not
enabled.
-
-For a table, if the table property `min_load_replica_num` is valid (>0), then
the table will ignore the global configuration `min_load_replica_num`.
Otherwise, if the global configuration `min_load_replica_num` is valid (>0),
then the minimum number of load replicas for the table will be equal to
`min(FE.conf.min_load_replica_num, table.replication_num/2 + 1)`.
-
-For viewing and modification of FE configuration items, you can refer to
[here](../../admin-manual/config/fe-config.md).
-
-### Other cases
-
-If the table property `min_load_replica_num` is not enabled (<=0), and the
global configuration `min_load_replica_num` is not enabled(<=0), then the data
import still needs to be successfully written to the majority replica. At this
point, the minimum number of write replicas for the table is equal to
`table.replication_num/2 + 1`.
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/group-commit-manual.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/group-commit-manual.md
index 5d53a150b5..4ce64fa6e6 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/group-commit-manual.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/group-commit-manual.md
@@ -147,7 +147,7 @@ private static void groupCommitInsertBatch() throws
Exception {
}
```
-关于 **JDBC** 的更多用法,参考[使用 Insert 方式同步数据](../../import/insert-into-manual)。
+关于 **JDBC** 的更多用法,参考[使用 Insert 方式同步数据](./insert-into-manual.md)。
### INSERT INTO VALUES
@@ -279,7 +279,7 @@ private static void groupCommitInsertBatch() throws
Exception {
# 返回的 Label 是 group_commit 开头的,是真正消费数据的导入关联的 label
```
- 关于 Stream Load 使用的更多详细语法及最佳实践,请参阅 [Stream
Load](../../../data-operate/import/stream-load-manual)。
+ 关于 Stream Load 使用的更多详细语法及最佳实践,请参阅 [Stream Load](./stream-load-manual.md)。
### Http Stream
@@ -339,7 +339,7 @@ private static void groupCommitInsertBatch() throws
Exception {
# 返回的 Label 是 group_commit 开头的,是真正消费数据的导入关联的 label
```
- 关于 Http Stream 使用的更多详细语法及最佳实践,请参阅 [Stream
Load](../../../data-operate/import/stream-load-manual)。
+ 关于 Http Stream 使用的更多详细语法及最佳实践,请参阅 [Stream
Load](./stream-load-manual.md#tvf-在-stream-load-中的应用---http_stream-模式)。
## 自动提交条件
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/mysql-load-manual.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/mysql-load-manual.md
index c030d4348f..16d3a48fde 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/mysql-load-manual.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/mysql-load-manual.md
@@ -185,7 +185,7 @@ INTO TABLE [<db_name>.]<table_name>
| timezone | 指定本次导入所使用的时区。默认为东八区。该参数会影响所有导入涉及的和时区有关的函数结果。 |
| exec_mem_limit | 导入内存限制。默认为 2GB。单位为字节。 |
| trim_double_quotes | 布尔类型,默认值为 false,为 true 时表示裁剪掉导入文件每个字段最外层的双引号。 |
-| enclose | 指定包围符。当 csv
数据字段中含有行分隔符或列分隔符时,为防止意外截断,可指定单字节字符作为包围符起到保护作用。例如列分隔符为 ",",包围符为 "'",数据为
"a,'b,c'",则 "b,c" 会被解析为一个字段。 |
+| enclose | 指定包围符。当 CSV
数据字段中含有行分隔符或列分隔符时,为防止意外截断,可指定单字节字符作为包围符起到保护作用。例如列分隔符为 ",",包围符为 "'",数据为
"a,'b,c'",则 "b,c" 会被解析为一个字段。 |
| escape | 指定转义符。用于转义在字段中出现的与包围符相同的字符。例如数据为 "a,'b,'c'",包围符为 "'",希望
"b,'c 被作为一个字段解析,则需要指定单字节转义符,例如"\",将数据修改为 "a,'b,\'c'"。 |
## 导入举例
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md
index 89ade93029..218cfb793e 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md
@@ -670,7 +670,7 @@ mysql> DESC table1;
假设原表中数据为:
-```SQL
+```sql
+-------+--------+------+
| name | gender | age |
+-------+--------+------+
@@ -684,13 +684,13 @@ mysql> DESC table1;
导入数据为:
-```SQL
+```sql
li,male,10
```
由于指定了 function_column.sequence_col: age,并且 age 大于等于表中原有的列,原表数据被删除,表中数据变为:
-```SQL
+```sql
+-------+--------+------+
| name | gender | age |
+-------+--------+------+
@@ -703,7 +703,7 @@ li,male,10
导入数据为:
-```SQL
+```sql
li,male,9
```
@@ -941,7 +941,7 @@ curl --location-trusted -u <doris_user>:<doris_password> \
当导入数据中包含 map 类型,如以下的例子中:
-```SQL
+```sql
[
{"user_id":1,"namemap":{"Emily":101,"age":25}},
{"user_id":2,"namemap":{"Benjamin":102,"age":35}},
@@ -984,7 +984,7 @@ curl --location-trusted -u <doris_user>:<doris_password> \
如导入数据如下:
-```SQL
+```sql
1|koga|17723
2|nijg|146285
3|lojn|347890
@@ -1023,7 +1023,7 @@ curl --location-trusted -u <doris_user>:<doris_password> \
通过 hll_hash 函数可以将数据转换成 hll 类型,如下数据:
-```SQL
+```sql
1001|koga
1002|nijg
1003|lojn
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/load-data-convert.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/load-data-convert.md
index 127acd397b..a2015e04ee 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/load-data-convert.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/load-data-convert.md
@@ -1,6 +1,6 @@
---
{
- "title": "数据转化",
+ "title": "数据转换",
"language": "zh-CN"
}
---
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/load-high-availability.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/load-high-availability.md
new file mode 100644
index 0000000000..d4c4c447e1
--- /dev/null
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/load-high-availability.md
@@ -0,0 +1,122 @@
+---
+{
+ "title": "导入高可用性",
+ "language": "zh-CN"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# 导入高可用性
+
+## 概述
+
+Doris 在数据导入过程中提供了多种机制来确保高可用性。本文将详细介绍 Doris
的默认导入行为以及为提高导入可用性而提供的额外选项,特别是最小写入副本数功能。
+
+## 多数派写入
+
+默认情况下,Doris 采用多数派写入策略来确保数据的可靠性和一致性:
+
+- 当成功写入的副本数超过总副本数的一半时,导入被视为成功。
+- 例如,对于三副本的表,至少需要两个副本写入成功才算导入成功。
+
+### 工作原理
+
+1. 数据分发:导入任务首先将数据分发到所有相关的 BE 节点。
+
+2. 并行写入:各个 BE 节点并行处理数据写入操作。
+
+3. 写入确认:每个 BE 节点在完成数据写入后,会向 FE 发送确认信息。
+
+4. 多数派判断:FE 统计成功写入的副本数,当达到多数派时,认为导入成功。
+
+5. 事务提交:FE 提交导入事务,使数据对外可见。
+
+6. 异步复制:对于未成功写入的副本,系统会在后台异步进行数据复制,以确保最终所有副本的数据一致性。
+
+多数派写入策略是 Doris 在数据可靠性和系统可用性之间的一个平衡。对于有特殊需求的场景,Doris
提供了最小写入副本数等其他选项来进一步提高系统的灵活性。
+
+## 最小写入副本数
+
+多数派写入策略在保证数据可靠性的同时,也可能在某些场景下影响系统的可用性。例如,在两副本的情况下,必须两个副本都写入成功才能完成导入,这意味着在导入过程中不允许任何一个副本不可用。
+
+为了解决上述问题并提高导入的可用性,Doris 提供了最小写入副本数(Min Load Replica Num)选项。
+
+### 功能说明
+
+最小写入副本数允许用户指定导入数据时需要成功写入的最少副本数。当成功写入的副本数大于或等于这个值时,导入即视为成功。
+
+### 使用场景
+
+- 在部分节点不可用时,仍需要保证数据能够成功导入。
+
+- 对数据导入速度有较高要求,愿意在一定程度上牺牲可靠性来换取更高的可用性。
+
+### 配置方法
+
+#### 1. 单表配置
+
+a. 创建表时设置:
+
+```sql
+CREATE TABLE example_table
+(
+id INT,
+name STRING
+)
+DUPLICATE KEY(id)
+DISTRIBUTED BY HASH(id) BUCKETS 10
+PROPERTIES
+(
+'replication_num' = '3',
+'min_load_replica_num' = '2'
+);
+```
+
+b. 修改现有表:
+
+```sql
+ALTER TABLE example_table
+SET ( 'min_load_replica_num' = '2' );
+```
+
+#### 2. 全局配置
+通过 FE 配置项 `min_load_replica_num` 设置。
+
+- 有效值:大于 0
+
+- 默认值:-1(表示不开启全局最小写入副本数)
+
+优先级:表属性 > 全局配置 > 默认多数派规则
+
+如果表属性未设置或无效,且全局配置有效,则表的最小写入副本数为:
+`min(FE配置的min_load_replica_num,表的副本数/2 + 1)`
+
+关于 FE 配置项的查看和修改,请参考[FE 配置项文档](../../admin-manual/config/fe-config.md)。
+
+## 其他高可用性机制
+
+除了最小写入副本数选项,Doris 还采用了以下机制来提高导入的可用性:
+
+1. 导入重试:自动重试因临时故障导致的失败导入任务。
+
+2. 负载均衡:将导入任务分散到不同的 BE 节点,避免单点压力过大。
+
+3. 事务机制:确保数据的一致性,失败时自动回滚。
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/min-load-replica-num.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/min-load-replica-num.md
deleted file mode 100644
index 6cb17ccbb0..0000000000
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/min-load-replica-num.md
+++ /dev/null
@@ -1,102 +0,0 @@
----
-{
- "title": "最小写入副本数",
- "language": "zh-CN"
-}
----
-
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements. See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership. The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied. See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-# 最小写入副本数
-
-默认情况下,数据导入要求至少有超过半数的副本写入成功,导入才算成功。然而,这种方式不够灵活,在某些场景会带来不便。
-
-举个例子,对于两副本情况,按上面的多数派原则,要想导入数据,则需要这两个副本都写入成功。这意味着,在导入数据过程中,不允许任意一个副本不可用。这极大影响了集群的可用性。
-
-为了解决以上问题,Doris 允许用户设置最小写入副本数 (Min Load Replica
Num)。对导入数据任务,当它成功写入的副本数大于或等于最小写入副本数时,导入即成功。
-
-## 用法
-
-### 单个表的最小写入副本数
-
-可以对单个 olap 表,设置最小写入副本数,并用表属性`min_load_replica_num`来表示。该属性的有效值要求大于 0
且不超过表的副本数。其默认值为 -1,表示不启用该属性。
-
-可以在创建表时设置表的`min_load_replica_num`。
-
-```sql
-CREATE TABLE test_table1
-(
- k1 INT,
- k2 INT
-)
-DUPLICATE KEY(k1)
-DISTRIBUTED BY HASH(k1) BUCKETS 5
-PROPERTIES
-(
- 'replication_num' = '2',
- 'min_load_replica_num' = '1'
-);
-```
-
-对一个已存在的表,可以使用语句`ALTER TABLE`来修改它的`min_load_replica_num`。
-
-```sql
-ALTER TABLE test_table1
-SET ( 'min_load_replica_num' = '1');
-```
-
-可以使用语句`SHOW CREATE TABLE`来查看表的属性`min_load_replica_num`。
-
-```SQL
-SHOW CREATE TABLE test_table1;
-```
-
-输出结果的 PROPERTIES 中将包含`min_load_replica_num`。例如:
-
-```text
-Create Table: CREATE TABLE `test_table1` (
- `k1` int(11) NULL,
- `k2` int(11) NULL
-) ENGINE=OLAP
-DUPLICATE KEY(`k1`)
-COMMENT 'OLAP'
-DISTRIBUTED BY HASH(`k1`) BUCKETS 5
-PROPERTIES (
-"replication_allocation" = "tag.location.default: 2",
-"min_load_replica_num" = "1",
-"storage_format" = "V2",
-"light_schema_change" = "true",
-"disable_auto_compaction" = "false",
-"enable_single_replica_compaction" = "false"
-);
-```
-
-### 全局最小写入副本数
-
-可以对所有 olap 表,设置全局最小写入副本数,并用 FE 的配置项`min_load_replica_num`来表示。该配置项的有效值要求大于
0。其默认值为 -1,表示不开启全局最小写入副本数。
-
-对一个表,如果表属性`min_load_replica_num`有效(即大于
0),那么该表将会忽略全局配置`min_load_replica_num`。否则,如果全局配置`min_load_replica_num`有效(即大于
0),那么该表的最小写入副本数将等于`min(FE.conf.min_load_replica_num,table.replication_num/2 +
1)`。
-
-对于 FE 配置项的查看和修改,可以参考[这里](../../admin-manual/config/fe-config)。
-
-### 其余情况
-
-如果没有开启表属性`min_load_replica_num`(即小于或者等于
0),也没有设置全局配置`min_load_replica_num`(即小于或等于
0),那么数据的导入仍需多数派副本写入成功才算成功。此时,表的最小写入副本数等于`table.replicatition_num/2 + 1`。
-
diff --git a/sidebars.json b/sidebars.json
index ffeeb98e9c..59af3872dd 100644
--- a/sidebars.json
+++ b/sidebars.json
@@ -149,7 +149,7 @@
"data-operate/import/load-data-format",
"data-operate/import/error-data-handling",
"data-operate/import/load-data-convert",
- "data-operate/import/min-load-replica-num",
+ "data-operate/import/load-high-availability",
"data-operate/import/migrate-data-from-other-olap"
]
},