[incubator-doris] branch master updated: Fix docs sequence error (#2814)

zhaoc Mon, 20 Jan 2020 06:36:34 -0800

This is an automated email from the ASF dual-hosted git repository.

zhaoc pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git



The following commit(s) were added to refs/heads/master by this push:
     new acc8941  Fix docs sequence error (#2814)
acc8941 is described below

commit acc89411dcc5f42dde33b3884bdd34c66bc29771
Author: yangzhg <[email protected]>
AuthorDate: Mon Jan 20 22:35:40 2020 +0800

    Fix docs sequence error (#2814)
---
 .../cn/getting-started/best-practice.md            | 26 ++++++++++-----------
 .../en/getting-started/best-practice_EN.md         | 27 +++++++++++-----------
 2 files changed, 27 insertions(+), 26 deletions(-)

diff --git a/docs/documentation/cn/getting-started/best-practice.md b/docs/documentation/cn/getting-started/best-practice.md
index 96f624e..3b58699 100644
--- a/docs/documentation/cn/getting-started/best-practice.md
+++ b/docs/documentation/cn/getting-started/best-practice.md
@@ -25,7 +25,7 @@ under the License.
 
 Doris 数据模型上目前分为三类: AGGREGATE KEY, UNIQUE KEY, DUPLICATE KEY。三种模型中数据都是按KEY进行排序。
 
-1. AGGREGATE KEY
+1.1.1 AGGREGATE KEY
 
     AGGREGATE KEY相同时，新旧记录进行聚合，目前支持的聚合函数有SUM, MIN, MAX, REPLACE。
     
@@ -43,7 +43,7 @@ Doris 数据模型上目前分为三类: AGGREGATE KEY, UNIQUE KEY, DUPLICATE KE
     DISTRIBUTED BY HASH(siteid) BUCKETS 10;
     ```
     
-2. UNIQUE KEY
+1.1.2. UNIQUE KEY
 
     UNIQUE KEY 相同时，新记录覆盖旧记录。目前 UNIQUE KEY 实现上和 AGGREGATE KEY 的 REPLACE 聚合方法一样，二者本质上相同。适用于有更新需求的分析业务。
     
@@ -59,7 +59,7 @@ Doris 数据模型上目前分为三类: AGGREGATE KEY, UNIQUE KEY, DUPLICATE KE
     DISTRIBUTED BY HASH(orderid) BUCKETS 10;
     ```
     
-3. DUPLICATE KEY
+1.1.3. DUPLICATE KEY
 
     只指定排序列，相同的行不会合并。适用于数据无需提前聚合的分析业务。
     
@@ -88,11 +88,11 @@ Doris 数据模型上目前分为三类: AGGREGATE KEY, UNIQUE KEY, DUPLICATE KE
 
 使用过程中，建议用户尽量使用 Star Schema 区分维度表和指标表。频繁更新的维度表也可以放在 MySQL 外部表中。而如果只有少量更新, 可以直接放在 Doris 中。在 Doris 中存储维度表时，可对维度表设置更多的副本，提升 Join 的性能。
  
-### 1.4 分区和分桶
+### 1.3 分区和分桶
 
 Doris 支持两级分区存储, 第一层为 RANGE 分区(partition), 第二层为 HASH 分桶(bucket)。
 
-1. RANGE分区(partition)
+1.3.1. RANGE分区(partition)
 
     RANGE分区用于将数据划分成不同区间, 逻辑上可以理解为将原始表划分成了多个子表。业务上，多数用户会选择采用按时间进行partition, 让时间进行partition有以下好处：
     
@@ -100,14 +100,14 @@ Doris 支持两级分区存储, 第一层为 RANGE 分区(partition), 第二层
     * 可用上Doris分级存储(SSD + SATA)的功能
     * 按分区删除数据时，更加迅速
 
-2. HASH分桶(bucket)
+1.3.2. HASH分桶(bucket)
 
     根据hash值将数据划分成不同的 bucket。
     
     * 建议采用区分度大的列做分桶, 避免出现数据倾斜
     * 为方便数据恢复, 建议单个 bucket 的 size 不要太大, 保持在 10GB 以内, 所以建表或增加 partition 时请合理考虑 bucket 数目, 其中不同 partition 可指定不同的 buckets 数。
 
-### 1.5 稀疏索引和 Bloom Filter
+### 1.4 稀疏索引和 Bloom Filter
 
 Doris对数据进行有序存储, 在数据有序的基础上为其建立稀疏索引,索引粒度为 block(1024行)。
 
@@ -117,13 +117,13 @@ Doris对数据进行有序存储, 在数据有序的基础上为其建立稀疏
 * 这其中有一个特殊的地方,就是 varchar 类型的字段。varchar 类型字段只能作为稀疏索引的最后一个字段。索引会在 varchar 处截断, 因此 varchar 如果出现在前面，可能索引的长度可能不足 36 个字节。具体可以参阅 [数据模型、ROLLUP 及前缀索引](./data-model-rollup.md)。
 * 除稀疏索引之外, Doris还提供bloomfilter索引, bloomfilter索引对区分度比较大的列过滤效果明显。 如果考虑到varchar不能放在稀疏索引中, 可以建立bloomfilter索引。
 
-### 1.6 物化视图(rollup)
+### 1.5 物化视图(rollup)
 
 Rollup 本质上可以理解为原始表(Base Table)的一个物化索引。建立 Rollup 时可只选取 Base Table 中的部分列作为 Schema。Schema 中的字段顺序也可与 Base Table 不同。
 
 下列情形可以考虑建立 Rollup：
 
-1. Base Table 中数据聚合度不高。
+1.5.1. Base Table 中数据聚合度不高。
 
 这一般是因 Base Table 有区分度比较大的字段而导致。此时可以考虑选取部分列，建立 Rollup。
     
@@ -139,7 +139,7 @@ siteid 可能导致数据聚合度不高，如果业务方经常根据城市统
 ALTER TABLE site_visit ADD ROLLUP rollup_city(city, pv);
 ```
     
-2. Base Table 中的前缀索引无法命中
+1.5.2. Base Table 中的前缀索引无法命中
 
 这一般是 Base Table 的建表方式无法覆盖所有的查询模式。此时可以考虑调整列顺序，建立 Rollup。
 
@@ -159,7 +159,7 @@ ALTER TABLE session_data ADD ROLLUP rollup_brower(brower,province,ip,url) DUPLIC
 
 Doris中目前进行 Schema Change 的方式有三种：Sorted Schema Change，Direct Schema Change, Linked Schema Change。
 
-1. Sorted Schema Change
+2.1. Sorted Schema Change
 
     改变了列的排序方式，需对数据进行重新排序。例如删除排序列中的一列, 字段重排序。
     
@@ -167,13 +167,13 @@ Doris中目前进行 Schema Change 的方式有三种：Sorted Schema Change，D
     ALTER TABLE site_visit DROP COLUMN city;
     ```
     
-2. Direct Schema Change: 无需重新排序，但是需要对数据做一次转换。例如修改列的类型，在稀疏索引中加一列等。
+2.2. Direct Schema Change: 无需重新排序，但是需要对数据做一次转换。例如修改列的类型，在稀疏索引中加一列等。
 
     ```
     ALTER TABLE site_visit MODIFY COLUMN username varchar(64);
     ```
     
-3. Linked Schema Change: 无需转换数据，直接完成。例如加列操作。
+2.3. Linked Schema Change: 无需转换数据，直接完成。例如加列操作。
     
     ```
     ALTER TABLE site_visit ADD COLUMN click bigint SUM default '0';
diff --git a/docs/documentation/en/getting-started/best-practice_EN.md b/docs/documentation/en/getting-started/best-practice_EN.md
index 0ddc251..7001bce 100644
--- a/docs/documentation/en/getting-started/best-practice_EN.md
+++ b/docs/documentation/en/getting-started/best-practice_EN.md
@@ -26,7 +26,7 @@ under the License.
 
 Doris data model is currently divided into three categories: AGGREGATE KEY, UNIQUE KEY, DUPLICATE KEY. Data in all three models are sorted by KEY.
 
-1. AGGREGATE KEY
+1.1.1. AGGREGATE KEY
 
 When AGGREGATE KEY is the same, old and new records are aggregated. The aggregation functions currently supported are SUM, MIN, MAX, REPLACE.
 
@@ -44,7 +44,7 @@ AGGREGATE KEY(siteid, city, username)
 DISTRIBUTED BY HASH(siteid) BUCKETS 10;
 ```
 
-2. KEY UNIQUE
+1.1.2. KEY UNIQUE
 
 When UNIQUE KEY is the same, the new record covers the old record. At present, UNIQUE KEY implements the same RPLACE aggregation method as GGREGATE KEY, and they are essentially the same. Suitable for analytical business with updated requirements.
 
@@ -60,7 +60,7 @@ KEY (orderid) UNIT
 DISTRIBUTED BY HASH(orderid) BUCKETS 10;
 ```
 
-3. DUPLICATE KEY
+1.1.3. DUPLICATE KEY
 
 Only sort columns are specified, and the same rows are not merged. It is suitable for the analysis business where data need not be aggregated in advance.
 
@@ -89,11 +89,11 @@ In order to adapt to the front-end business, business side often does not distin
 
 In the process of using Star Schema, users are advised to use Star Schema to distinguish dimension tables from indicator tables as much as possible. Frequently updated dimension tables can also be placed in MySQL external tables. If there are only a few updates, they can be placed directly in Doris. When storing dimension tables in Doris, more copies of dimension tables can be set up to improve Join's performance.
 
-### 1.4 Partitions and Barrels
+### 1.3 Partitions and Barrels
 
 Doris supports two-level partitioned storage. The first layer is RANGE partition and the second layer is HASH bucket.
 
-1. RANGE分区(partition)
+1.3.1. RANGE分区(partition)
 
 The RANGE partition is used to divide data into different intervals, which can be logically understood as dividing the original table into multiple sub-tables. In business, most users will choose to partition on time, which has the following advantages:
 
@@ -101,14 +101,14 @@ The RANGE partition is used to divide data into different intervals, which can b
 * Availability of Doris Hierarchical Storage (SSD + SATA)
 * Delete data by partition more quickly
 
-2. HASH分桶(bucket)
+1.3.2. HASH分桶(bucket)
 
 The data is divided into different buckets according to the hash value.
 
 * It is suggested that columns with large differentiation should be used as buckets to avoid data skew.
 * In order to facilitate data recovery, it is suggested that the size of a single bucket should not be too large and should be kept within 10GB. Therefore, the number of buckets should be considered reasonably when building tables or increasing partitions, among which different partitions can specify different buckets.
 
-### 1.5 Sparse Index and Bloom Filter
+### 1.4 Sparse Index and Bloom Filter
 
 Doris stores the data in an orderly manner, and builds a sparse index for Doris on the basis of ordered data. The index granularity is block (1024 rows).
 
@@ -118,13 +118,13 @@ Sparse index chooses fixed length prefix in schema as index content, and Doris c
 * One particular feature of this is the varchar type field. The varchar type field can only be used as the last field of the sparse index. The index is truncated at varchar, so if varchar appears in front, the length of the index may be less than 36 bytes. Specifically, you can refer to [data model, ROLLUP and prefix index] (. / data-model-rollup. md).
 * In addition to sparse index, Doris also provides bloomfilter index. Bloomfilter index has obvious filtering effect on columns with high discrimination. If you consider that varchar cannot be placed in a sparse index, you can create a bloomfilter index.
 
-### 1.6 Physical and Chemical View (rollup)
+### 1.5 Physical and Chemical View (rollup)
 
 Rollup can essentially be understood as a physical index of the original table. When creating Rollup, only some columns in Base Table can be selected as Schema. The order of fields in Schema can also be different from that in Base Table.
 
 Rollup can be considered in the following cases:
 
-1. Base Table 中数据聚合度不高。
+1.5.1. Base Table 中数据聚合度不高。
 
 This is usually due to the fact that Base Table has more differentiated fields. At this point, you can consider selecting some columns and establishing Rollup.
 
@@ -140,7 +140,7 @@ Siteid may lead to a low degree of data aggregation. If business parties often b
 ALTER TABLE site_visit ADD ROLLUP rollup_city(city, pv);
 ```
 
-2. The prefix index in Base Table cannot be hit
+1.5.2. The prefix index in Base Table cannot be hit
 
 Generally, the way Base Table is constructed cannot cover all query modes. At this point, you can consider adjusting the column order and establishing Rollup.
 
@@ -160,7 +160,7 @@ ALTER TABLE session_data ADD ROLLUP rollup_brower(brower,province,ip,url) DUPLIC
 
 Doris中目前进行 Schema Change 的方式有三种：Sorted Schema Change，Direct Schema Change, Linked Schema Change。
 
-1. Sorted Schema Change
+2.1. Sorted Schema Change
 
 The sorting of columns has been changed and the data needs to be reordered. For example, delete a column in a sorted column and reorder the fields.
 
@@ -168,13 +168,14 @@ The sorting of columns has been changed and the data needs to be reordered. For
 ALTER TABLE site_visit DROP COLUMN city;
 ```
 
-2. Direct Schema Change: There is no need to reorder, but there is a need to convert the data. For example, modify the type of column, add a column to the sparse index, etc.
+2.2. Direct Schema Change: There is no need to reorder, but there is a need to convert the data. For example, modify
+ the type of column, add a column to the sparse index, etc.
 
 ```
 ALTER TABLE site_visit MODIFY COLUMN username varchar(64);
 ```
 
-3. Linked Schema Change: 无需转换数据，直接完成。例如加列操作。
+2.3. Linked Schema Change: 无需转换数据，直接完成。例如加列操作。
 
 ```
 ALTER TABLE site_visit ADD COLUMN click bigint SUM default '0';


