This is an automated email from the ASF dual-hosted git repository. hui pushed a commit to branch lmh/docUpdate in repository https://gitbox.apache.org/repos/asf/iotdb.git
commit fe1c229d1f6e9da411347fdaeac37d642b1f5be6 Author: Minghui Liu <[email protected]> AuthorDate: Wed Nov 2 16:59:29 2022 +0800 update Select-Into --- docs/UserGuide/Process-Data/Select-Into.md | 22 ++++++++++++---------- docs/zh/UserGuide/Process-Data/Select-Into.md | 24 +++++++++++++----------- 2 files changed, 25 insertions(+), 21 deletions(-) diff --git a/docs/UserGuide/Process-Data/Select-Into.md b/docs/UserGuide/Process-Data/Select-Into.md index 96e01bf783..4893543693 100644 --- a/docs/UserGuide/Process-Data/Select-Into.md +++ b/docs/UserGuide/Process-Data/Select-Into.md @@ -26,7 +26,7 @@ The `SELECT INTO` statement copies data from query result set into target time s The application scenarios are as follows: - **Implement IoTDB internal ETL**: ETL the original data and write a new time series. - **Query result storage**: Persistently store the query results, which acts like a materialized view. -- **Non-aligned time series to aligned time series**: Rewrite non-aligned time series into aligned time series. +- **Non-aligned time series to aligned time series**: Rewrite non-aligned time series into another aligned time series. ## SQL Syntax @@ -130,8 +130,6 @@ It costs 0.625s ``` This statement also writes the query results of the four time series under the `root.sg` storage group to the four specified time series under the `root.sg_copy` storage group. However, in ALIGN BY DEVICE, the number of `intoItem` must be the same as the number of queried devices, and each queried device corresponds to one `intoItem`. -It is easy to see that in the case of ALIGN BY DEVICE query, columns under the same device can only be written to the same target device. - > When aligning the query by device, the result set displayed by `CLI` has one > more column, the `source device` column indicating the queried device. - **Example 4** (aligned by device) @@ -156,13 +154,11 @@ In particular, We can use variable placeholders to describe the correspondence b - Suffix duplication character `::`: Copy the suffix (or measurement) of the query device, indicating that from this layer to the last layer (or measurement) of the device, the node name (or measurement) of the target device corresponds to the queried device The node name (or measurement) is the same. - Single-level node matcher `${i}`: Indicates that the current level node name of the target sequence is the same as the i-th level node name of the query sequence. For example, for the path `root.sg1.d1.s1`, `${1}` means `sg1`, `${2}` means `d1`, and `${3}` means `s1`. -Note: The variable placeholder **can only describe the correspondence between time series**, and cannot be used for functions, expressions, etc. - When using variable placeholders, there must be no ambiguity in the correspondence between `intoItem` and the columns of the query result set. The specific cases are classified as follows: #### ALIGN BY TIME (default) -> Note: If the query includes aggregation and expression calculation, the columns in the query result cannot correspond to a time series, so neither the target device nor the measurement can use variable placeholders. +> Note: The variable placeholder **can only describe the correspondence between time series**. If the query includes aggregation and expression calculation, the columns in the query result cannot correspond to a time series, so neither the target device nor the measurement can use variable placeholders. ##### (1) The target device does not use variable placeholders & the target measurement list uses variable placeholders @@ -176,13 +172,13 @@ When using variable placeholders, there must be no ambiguity in the corresponden ```sql select s1, s2 -into root.sg_copy.d1(::), rot.sg_copy.d2(s1), root.sg_copy.d1(${3}), root.sg_copy.d2(::), +into root.sg_copy.d1(::), root.sg_copy.d2(s1), root.sg_copy.d1(${3}), root.sg_copy.d2(::) from root.sg.d1, root.sg.d2; ```` This statement is equivalent to: ```sql select s1, s2 -into root.sg_copy.d1(s1), rot.sg_copy.d2(s1), root.sg_copy.d1(s2), root.sg_copy.d2(s2), +into root.sg_copy.d1(s1), root.sg_copy.d2(s1), root.sg_copy.d1(s2), root.sg_copy.d2(s2) from root.sg.d1, root.sg.d2; ```` As you can see, the statement is not very simplified in this case. @@ -213,7 +209,7 @@ Write the query results of all time series under `root.sg` to `root.sg_bk`, the #### ALIGN BY DEVICE -> Note: If the query includes aggregation and expression calculation, the columns in the query result cannot correspond to a specific physical quantity, so the target measurement cannot use variable placeholders. +> Note: The variable placeholder **can only describe the correspondence between time series**. If the query includes aggregation and expression calculation, the columns in the query result cannot correspond to a specific physical quantity, so the target measurement cannot use variable placeholders. ##### (1) The target device does not use variable placeholders & the target measurement list uses variable placeholders @@ -259,6 +255,12 @@ Write the query result of each time series in `root.sg` to the same device, and We can use the `ALIGNED` keyword to specify the target device for writing to be aligned, and each `intoItem` can be set independently. +**Example:** +```sql +select s1, s2 into root.sg_copy.d1(t1, t2), aligned root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; +``` +This statement specifies that `root.sg_copy.d1` is an unaligned device and `root.sg_copy.d2` is an aligned device. + ### Unsupported query clauses - `SLIMIT`, `SOFFSET`: The query columns are uncertain, so they are not supported. @@ -306,7 +308,7 @@ It costs 0.115s ``` ### Non-aligned time series to aligned time series -Rewrite non-aligned time series into aligned time series. +Rewrite non-aligned time series into another aligned time series. **Note:** It is recommended to use the `LIMIT & OFFSET` clause or the `WHERE` clause (time filter) to batch data to prevent excessive data volume in a single operation. diff --git a/docs/zh/UserGuide/Process-Data/Select-Into.md b/docs/zh/UserGuide/Process-Data/Select-Into.md index 3d0a57299a..2bac79cd8c 100644 --- a/docs/zh/UserGuide/Process-Data/Select-Into.md +++ b/docs/zh/UserGuide/Process-Data/Select-Into.md @@ -26,7 +26,7 @@ 应用场景如下: - **实现 IoTDB 内部 ETL**:对原始数据进行 ETL 处理后写入新序列。 - **查询结果存储**:将查询结果进行持久化存储,起到类似物化视图的作用。 -- **非对齐序列转对齐序列**:对齐序列从0.13版本开始支持,可以通过该功能将历史的非对齐序列重新写成对齐序列。 +- **非对齐序列转对齐序列**:对齐序列从0.13版本开始支持,可以通过该功能将非对齐序列的数据写入新的对齐序列中。 ## 语法定义 @@ -129,8 +129,6 @@ It costs 0.625s 该语句同样是将 `root.sg` 存储组下四条序列的查询结果写入到 `root.sg_copy` 存储组下指定的四条序列中。但在按设备对齐中,`intoItem` 的数量必须和查询的设备数量一致,每个查询设备对应一个 `intoItem`。 -容易看出,在按设备对齐查询的情况下,同一个设备下的列只能写入相同的目标设备。 - > 按设备对齐查询时,`CLI` 展示的结果集多出一列 `source device` 列表示查询的设备。 - **示例 4**(按设备对齐) @@ -156,13 +154,11 @@ It costs 0.532s - 后缀复制符 `::`:复制查询设备后缀(或物理量),表示从该层开始一直到设备的最后一层(或物理量),目标设备的节点名(或物理量名)与查询的设备对应的节点名(或物理量名)相同。 - 单层节点匹配符 `${i}`:表示目标序列当前层节点名与查询序列的第`i`层节点名相同。比如,对于路径`root.sg1.d1.s1`而言,`${1}`表示`sg1`,`${2}`表示`d1`,`${3}`表示`s1`。 -注意:变量占位符**只能描述序列与序列之间的对应关系**,不能用于函数、表达式等。 - 在使用变量占位符时,`intoItem`与查询结果集列的对应关系不能存在歧义,具体情况分类讨论如下: #### 按时间对齐(默认) -> 注:如果查询中包含聚合、表达式计算,此时查询结果中的列无法与某个序列对应,因此目标设备和目标物理量都不能使用变量占位符。 +> 注:变量占位符**只能描述序列与序列之间的对应关系**,如果查询中包含聚合、表达式计算,此时查询结果中的列无法与某个序列对应,因此目标设备和目标物理量都不能使用变量占位符。 ##### (1)目标设备不使用变量占位符 & 目标物理量列表使用变量占位符 @@ -176,13 +172,13 @@ It costs 0.532s ```sql select s1, s2 -into root.sg_copy.d1(::), rot.sg_copy.d2(s1), root.sg_copy.d1(${3}), root.sg_copy.d2(::), +into root.sg_copy.d1(::), root.sg_copy.d2(s1), root.sg_copy.d1(${3}), root.sg_copy.d2(::) from root.sg.d1, root.sg.d2; ``` 该语句等价于: ```sql select s1, s2 -into root.sg_copy.d1(s1), rot.sg_copy.d2(s1), root.sg_copy.d1(s2), root.sg_copy.d2(s2), +into root.sg_copy.d1(s1), root.sg_copy.d2(s1), root.sg_copy.d1(s2), root.sg_copy.d2(s2) from root.sg.d1, root.sg.d2; ``` 可以看到,在这种情况下,语句并不能得到很好地简化。 @@ -214,7 +210,7 @@ select * into root.sg_bk.::(::) from root.sg.**; #### 按设备对齐(使用 `ALIGN BY DEVICE`) -> 注:如果查询中包含聚合、表达式计算,此时查询结果中的列无法与某个物理量对应,因此目标物理量不能使用变量占位符。 +> 注:变量占位符**只能描述序列与序列之间的对应关系**,如果查询中包含聚合、表达式计算,此时查询结果中的列无法与某个物理量对应,因此目标物理量不能使用变量占位符。 ##### (1)目标设备不使用变量占位符 & 目标物理量列表使用变量占位符 @@ -260,10 +256,16 @@ select * into ::(backup_${4}) from root.sg.** align by device; 通过 `ALIGNED` 关键词可以指定写入的目标设备为对齐写入,每个 `intoItem` 可以独立设置。 +**示例:** +```sql +select s1, s2 into root.sg_copy.d1(t1, t2), aligned root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; +``` +该语句指定了 `root.sg_copy.d1` 是非对齐设备,`root.sg_copy.d2`是对齐设备。 + ### 不支持使用的查询子句 - `SLIMIT`、`SOFFSET`:查询出来的列不确定,功能不清晰,因此不支持。 -- `LAST`、`GROUP BY TAGS`、`DISABLE ALIGN`:表结构和写入结构不一致,因此不支持。 +- `LAST`查询、`GROUP BY TAGS`、`DISABLE ALIGN`:表结构和写入结构不一致,因此不支持。 ### 其他要注意的点 @@ -309,7 +311,7 @@ It costs 0.115s 以上语句将降采样查询的结果持久化存储到新序列中。 ### 非对齐序列转对齐序列 -对齐序列从 0.13 版本开始支持,可以通过该功能将历史的非对齐序列重新写成对齐序列。 +对齐序列从 0.13 版本开始支持,可以通过该功能将非对齐序列的数据写入新的对齐序列中。 **注意:** 建议配合使用 `LIMIT & OFFSET` 子句或 `WHERE` 子句(时间过滤条件)对数据进行分批,防止单次操作的数据量过大。
