This is an automated email from the ASF dual-hosted git repository. haonan pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/iotdb-docs.git
The following commit(s) were added to refs/heads/main by this push:
new b306e854 add extract function to table model (#953)
b306e854 is described below
commit b306e854a42e8b3b1639604fd917833f5f2a61da
Author: leto-b <[email protected]>
AuthorDate: Mon Jan 5 10:39:28 2026 +0800
add extract function to table model (#953)
---
.../Master/Table/SQL-Manual/Basis-Function.md | 103 ++++++++++++++++-----
.../latest-Table/SQL-Manual/Basis-Function.md | 55 ++++++++++-
.../Master/Table/SQL-Manual/Basis-Function.md | 57 +++++++++++-
.../latest-Table/SQL-Manual/Basis-Function.md | 52 +++++++++++
4 files changed, 238 insertions(+), 29 deletions(-)
diff --git a/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
b/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
index d8c96720..30380976 100644
--- a/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
+++ b/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
@@ -156,30 +156,30 @@ SELECT LEAST(temperature,humidity) FROM table2;
### 2.2 Supported Aggregate Functions
-| Function Name | Description
| Allowed Input Types
[...]
-|:-----------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------
[...]
-| COUNT | Counts the number of data points.
| All types
[...]
-| COUNT_IF | COUNT_IF(exp) counts the number of rows that
satisfy a specified boolean expression.
| `exp` must be
a boolean expression [...]
-| APPROX_COUNT_DISTINCT | The APPROX_COUNT_DISTINCT(x[, maxStandardError])
function provides an approximation of COUNT(DISTINCT x), returning the
estimated number of distinct input values.
| `x`: The
target column to be calcu [...]
-| APPROX_MOST_FREQUENT | The APPROX_MOST_FREQUENT(x, k, capacity) function is
used to approximately calculate the top k most frequent elements in a dataset.
It returns a JSON-formatted string where the keys are the element values and
the values are their corresponding approximate frequencies. (Available since
V2.0.5.1)
| `x` : The
column to be calculated, s [...]
-| SUM | Calculates the sum.
| INT32 INT64 FLOAT
DOUBLE [...]
-| AVG | Calculates the average.
| INT32 INT64 FLOAT
DOUBLE [...]
-| MAX | Finds the maximum value.
| All types
[...]
-| MIN | Finds the minimum value.
| All types
[...]
-| FIRST | Finds the value with the smallest timestamp that is
not NULL.
| All types
[...]
-| LAST | Finds the value with the largest timestamp that is
not NULL.
| All types
[...]
-| STDDEV | Alias for STDDEV_SAMP, calculates the sample
standard deviation.
| INT32 INT64
FLOAT DOUBLE [...]
-| STDDEV_POP | Calculates the population standard deviation.
| INT32 INT64 FLOAT
DOUBLE [...]
-| STDDEV_SAMP | Calculates the sample standard deviation.
| INT32 INT64 FLOAT
DOUBLE [...]
-| VARIANCE | Alias for VAR_SAMP, calculates the sample
variance.
| INT32 INT64
FLOAT DOUBLE [...]
-| VAR_POP | Calculates the population variance.
| INT32 INT64 FLOAT
DOUBLE [...]
-| VAR_SAMP | Calculates the sample variance.
| INT32 INT64 FLOAT
DOUBLE [...]
-| EXTREME | Finds the value with the largest absolute value. If
the largest absolute values of positive and negative values are equal, returns
the positive value.
| INT32 INT64 FLOAT
DOUBLE [...]
-| MODE | Finds the mode. Note: 1. There is a risk of memory
exception when the number of distinct values in the input sequence is too
large; 2. If all elements have the same frequency, i.e., there is no mode, a
random element is returned; 3. If there are multiple modes, a random mode is
returned; 4. NULL values are also counted in frequency, so even if not all
values in the input sequence are NULL, the final result may still be NULL. |
All types [...]
-| MAX_BY | MAX_BY(x, y) finds the value of x corresponding to
the maximum y in the binary input x and y. MAX_BY(time, x) returns the
timestamp when x is at its maximum.
| x and y
can be of any type [...]
-| MIN_BY | MIN_BY(x, y) finds the value of x corresponding to
the minimum y in the binary input x and y. MIN_BY(time, x) returns the
timestamp when x is at its minimum.
| x and y
can be of any type [...]
-| FIRST_BY | FIRST_BY(x, y) finds the value of x in the same row
when y is the first non-null value.
| x and y can be of
any type [...]
-| LAST_BY | LAST_BY(x, y) finds the value of x in the same row
when y is the last non-null value.
| x and y can be of
any type [...]
+| Function Name | Description
[...]
+|:-----------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[...]
+| COUNT | Counts the number of data points.
[...]
+| COUNT_IF | COUNT_IF(exp) counts the number of rows that
satisfy a specified boolean expression.
[...]
+| APPROX_COUNT_DISTINCT | The APPROX_COUNT_DISTINCT(x[, maxStandardError])
function provides an approximation of COUNT(DISTINCT x), returning the
estimated number of distinct input values.
[...]
+| APPROX_MOST_FREQUENT | The APPROX_MOST_FREQUENT(x, k, capacity) function is
used to approximately calculate the top k most frequent elements in a dataset.
It returns a JSON-formatted string where the keys are the element values and
the values are their corresponding approximate frequencies. (Available since
V2.0.5.1) | `x` : The column to be calculated, supporting all existing data
types in IoTDB;<br> `k`: The number of top-k most frequent values to
return;<br>`capacity`: The number of [...]
+| SUM | Calculates the sum.
[...]
+| AVG | Calculates the average.
[...]
+| MAX | Finds the maximum value.
[...]
+| MIN | Finds the minimum value.
[...]
+| FIRST | Finds the value with the smallest timestamp that is
not NULL.
[...]
+| LAST | Finds the value with the largest timestamp that is
not NULL.
[...]
+| STDDEV | Alias for STDDEV_SAMP, calculates the sample
standard deviation.
[...]
+| STDDEV_POP | Calculates the population standard deviation.
[...]
+| STDDEV_SAMP | Calculates the sample standard deviation.
[...]
+| VARIANCE | Alias for VAR_SAMP, calculates the sample
variance.
[...]
+| VAR_POP | Calculates the population variance.
[...]
+| VAR_SAMP | Calculates the sample variance.
[...]
+| EXTREME | Finds the value with the largest absolute value. If
the largest absolute values of positive and negative values are equal, returns
the positive value.
[...]
+| MODE | Finds the mode. Note: 1. There is a risk of memory
exception when the number of distinct values in the input sequence is too
large; 2. If all elements have the same frequency, i.e., there is no mode, a
random element is returned; 3. If there are multiple modes, a random mode is
returned; 4. NULL values are also counted in frequency, so even if not all
values in the input sequence are NULL, the final result may still be NULL.
[...]
+| MAX_BY | MAX_BY(x, y) finds the value of x corresponding to
the maximum y in the binary input x and y. MAX_BY(time, x) returns the
timestamp when x is at its maximum.
[...]
+| MIN_BY | MIN_BY(x, y) finds the value of x corresponding to
the minimum y in the binary input x and y. MIN_BY(time, x) returns the
timestamp when x is at its minimum.
[...]
+| FIRST_BY | FIRST_BY(x, y) finds the value of x in the same row
when y is the first non-null value.
[...]
+| LAST_BY | LAST_BY(x, y) finds the value of x in the same row
when y is the last non-null value.
[...]
### 2.3 Examples
@@ -686,7 +686,7 @@ FROM
table1;
```
-Result**:**
+Result:
```Plain
+-----------------------------+-----------------------------+
@@ -715,6 +715,59 @@ Total line number = 18
It costs 0.319s
```
+### 4.3 Extract Function
+
+This function is used to extract the value of a specific part of a date.
(Supported from version V2.0.6)
+
+#### 4.3.1 Syntax Definition
+
+```SQL
+EXTRACT (identifier FROM expression)
+```
+
+* Parameter Description
+ * **expression**: `TIMESTAMP` type or a time constant
+ * **identifier**: The valid ranges and corresponding return value types are
shown in the table below.
+
+ | Valid Range | Return Type | Return Range |
+ |----------------------|---------------|--------------------|
+ | `YEAR` | `INT64` | `/` |
+ | `QUARTER` | `INT64` | `1-4` |
+ | `MONTH` | `INT64` | `1-12` |
+ | `WEEK` | `INT64` | `1-53` |
+ | `DAY_OF_MONTH (DAY)` | `INT64` | `1-31` |
+ | `DAY_OF_WEEK (DOW)` | `INT64` | `1-7` |
+ | `DAY_OF_YEAR (DOY)` | `INT64` | `1-366` |
+ | `HOUR` | `INT64` | `0-23` |
+ | `MINUTE` | `INT64` | `0-59` |
+ | `SECOND` | `INT64` | `0-59` |
+ | `MS` | `INT64` | `0-999` |
+ | `US` | `INT64` | `0-999` |
+ | `NS` | `INT64` | `0-999` |
+
+
+#### 4.3.2 Usage Example
+
+Using table1 from the [Sample Data](../Reference/Sample-Data.md) as the source
data, query the average temperature for the first 12 hours of each day within a
certain period.
+
+```SQL
+IoTDB:database1> select format('%1$tY-%1$tm-%1$td',date_bin(1d,time)) as
fmtdate,avg(temperature) as avgtp from table1 where time >= 2024-11-26T00:00:00
and time <= 2024-11-30T23:59:59 and extract(hour from time) <= 12 group by
date_bin(1d,time) order by date_bin(1d,time)
++----------+-----+
+| fmtdate|avgtp|
++----------+-----+
+|2024-11-28| 86.0|
+|2024-11-29| 85.0|
+|2024-11-30| 90.0|
++----------+-----+
+Total line number = 3
+It costs 0.041s
+```
+
+Introduction to the `Format` function: [Format
Function](../SQL-Manual/Basis-Function.md#_7-2-format-function)
+
+Introduction to the `Date_bin` function: [Date_bin
Funtion](../SQL-Manual/Basis-Function.md#_4-2-date-bin-interval-timestamp-timestamp-timestamp)
+
+
## 5. Mathematical Functions and Operators
### 5.1 Mathematical Operators
diff --git a/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
b/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
index 65ba2014..30380976 100644
--- a/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
+++ b/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
@@ -686,7 +686,7 @@ FROM
table1;
```
-Result**:**
+Result:
```Plain
+-----------------------------+-----------------------------+
@@ -715,6 +715,59 @@ Total line number = 18
It costs 0.319s
```
+### 4.3 Extract Function
+
+This function is used to extract the value of a specific part of a date.
(Supported from version V2.0.6)
+
+#### 4.3.1 Syntax Definition
+
+```SQL
+EXTRACT (identifier FROM expression)
+```
+
+* Parameter Description
+ * **expression**: `TIMESTAMP` type or a time constant
+ * **identifier**: The valid ranges and corresponding return value types are
shown in the table below.
+
+ | Valid Range | Return Type | Return Range |
+ |----------------------|---------------|--------------------|
+ | `YEAR` | `INT64` | `/` |
+ | `QUARTER` | `INT64` | `1-4` |
+ | `MONTH` | `INT64` | `1-12` |
+ | `WEEK` | `INT64` | `1-53` |
+ | `DAY_OF_MONTH (DAY)` | `INT64` | `1-31` |
+ | `DAY_OF_WEEK (DOW)` | `INT64` | `1-7` |
+ | `DAY_OF_YEAR (DOY)` | `INT64` | `1-366` |
+ | `HOUR` | `INT64` | `0-23` |
+ | `MINUTE` | `INT64` | `0-59` |
+ | `SECOND` | `INT64` | `0-59` |
+ | `MS` | `INT64` | `0-999` |
+ | `US` | `INT64` | `0-999` |
+ | `NS` | `INT64` | `0-999` |
+
+
+#### 4.3.2 Usage Example
+
+Using table1 from the [Sample Data](../Reference/Sample-Data.md) as the source
data, query the average temperature for the first 12 hours of each day within a
certain period.
+
+```SQL
+IoTDB:database1> select format('%1$tY-%1$tm-%1$td',date_bin(1d,time)) as
fmtdate,avg(temperature) as avgtp from table1 where time >= 2024-11-26T00:00:00
and time <= 2024-11-30T23:59:59 and extract(hour from time) <= 12 group by
date_bin(1d,time) order by date_bin(1d,time)
++----------+-----+
+| fmtdate|avgtp|
++----------+-----+
+|2024-11-28| 86.0|
+|2024-11-29| 85.0|
+|2024-11-30| 90.0|
++----------+-----+
+Total line number = 3
+It costs 0.041s
+```
+
+Introduction to the `Format` function: [Format
Function](../SQL-Manual/Basis-Function.md#_7-2-format-function)
+
+Introduction to the `Date_bin` function: [Date_bin
Funtion](../SQL-Manual/Basis-Function.md#_4-2-date-bin-interval-timestamp-timestamp-timestamp)
+
+
## 5. Mathematical Functions and Operators
### 5.1 Mathematical Operators
diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
b/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
index 6ed554c8..e16267f4 100644
--- a/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
+++ b/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
@@ -159,8 +159,8 @@ SELECT LEAST(temperature,humidity) FROM table2;
|-----------------------|------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------|------------------|
| COUNT | 计算数据点数。
| 所有类型
| INT64 |
| COUNT_IF | COUNT_IF(exp) 用于统计满足指定布尔表达式的记录行数
| exp 必须是一个布尔类型的表达式,例如 count_if(temperature>20)
| INT64 |
-| APPROX_COUNT_DISTINCT | APPROX_COUNT_DISTINCT(x[,maxStandardError]) 函数提供
COUNT(DISTINCT x) 的近似值,返回不同输入值的近似个数。
| `x`:待计算列,支持所有类型;<br>
`maxStandardError`:指定该函数应产生的最大标准误差,取值范围[0.0040625, 0.26],未指定值时默认0.023。 | INT64
|
-| APPROX_MOST_FREQUENT | APPROX_MOST_FREQUENT(x, k, capacity)
函数用于近似计算数据集中出现频率最高的前 k 个元素。它返回一个JSON 格式的字符串,其中键是该元素的值,值是该元素对应的近似频率。(V 2.0.5.1
及以后版本支持) | `x`:待计算列,支持 IoTDB 现有所有的数据类型;<br> `k`:返回出现频率最高的 k
个值;<br> `capacity`:
用于计算的桶的数量,跟内存占用相关:其值越大误差越小,但占用内存更大,反之capacity值越小误差越大,但占用内存更小。 | STRING |
+| APPROX_COUNT_DISTINCT | APPROX_COUNT_DISTINCT(x[,maxStandardError]) 函数提供
COUNT(DISTINCT x) 的近似值,返回不同输入值的近似个数。 |
`x`:待计算列,支持所有类型;<br> `maxStandardError`:指定该函数应产生的最大标准误差,取值范围[0.0040625,
0.26],未指定值时默认0.023。 | INT64 |
+| APPROX_MOST_FREQUENT | APPROX_MOST_FREQUENT(x, k, capacity)
函数用于近似计算数据集中出现频率最高的前 k 个元素。它返回一个JSON 格式的字符串,其中键是该元素的值,值是该元素对应的近似频率。(V 2.0.5.1
及以后版本支持) | `x`:待计算列,支持 IoTDB 现有所有的数据类型;<br> `k`:返回出现频率最高的 k 个值;<br>
`capacity`: 用于计算的桶的数量,跟内存占用相关:其值越大误差越小,但占用内存更大,反之capacity值越小误差越大,但占用内存更小。 |
STRING |
| SUM | 求和。
| INT32 INT64 FLOAT DOUBLE
| DOUBLE |
| AVG | 求平均值。
| INT32 INT64 FLOAT DOUBLE
| DOUBLE |
| MAX | 求最大值。
| 所有类型
| 与输入类型一致 |
@@ -251,7 +251,6 @@ Total line number = 1
It costs 0.022s
```
-
#### 2.3.5 Approx_most_frequent
查询 `table1` 中 `temperature` 列出现频次最高的2个值
@@ -716,6 +715,58 @@ Total line number = 18
It costs 0.319s
```
+### 4.3 Extract 函数
+
+该函数用于提取日期对应部分的值。(V2.0.6 版本起支持)
+
+#### 4.3.1 语法定义
+
+```SQL
+EXTRACT (identifier FROM expression)
+```
+* 参数说明
+ * **expression**: `TIMESTAMP` 类型或时间常量
+ * **identifier** :取值范围及对应的返回值见下表
+
+ | 取值范围 | 返回值类型 | 返回值范围 |
+ | -------------------------- | ------------- | ------------- |
+ | `YEAR` | `INT64` | `/` |
+ | `QUARTER` | `INT64` | `1-4` |
+ | `MONTH` | `INT64` | `1-12` |
+ | `WEEK` | `INT64` | `1-53` |
+ | `DAY_OF_MONTH (DAY)` | `INT64` | `1-31` |
+ | `DAY_OF_WEEK (DOW)` | `INT64` | `1-7` |
+ | `DAY_OF_YEAR (DOY)` | `INT64` | `1-366` |
+ | `HOUR` | `INT64` | `0-23` |
+ | `MINUTE` | `INT64` | `0-59` |
+ | `SECOND` | `INT64` | `0-59` |
+ | `MS` | `INT64` | `0-999` |
+ | `US` | `INT64` | `0-999` |
+ | `NS` | `INT64` | `0-999` |
+
+
+#### 4.3.2 使用示例
+
+以[示例数据](../Reference/Sample-Data.md)中的 table1 为源数据,查询某段时间每天前12个小时的温度平均值
+
+```SQL
+IoTDB:database1> select format('%1$tY-%1$tm-%1$td',date_bin(1d,time)) as
fmtdate,avg(temperature) as avgtp from table1 where time >= 2024-11-26T00:00:00
and time <= 2024-11-30T23:59:59 and extract(hour from time) <= 12 group by
date_bin(1d,time) order by date_bin(1d,time)
++----------+-----+
+| fmtdate|avgtp|
++----------+-----+
+|2024-11-28| 86.0|
+|2024-11-29| 85.0|
+|2024-11-30| 90.0|
++----------+-----+
+Total line number = 3
+It costs 0.041s
+```
+
+`Format` 函数介绍:[Format 函数](../SQL-Manual/Basis-Function.md#_7-2-format-函数)
+
+`Date_bin` 函数介绍:[Date_bin
函数](../SQL-Manual/Basis-Function.md#_4-2-date-bin-interval-timestamp-timestamp-timestamp)
+
+
## 5. 数学函数和运算符
### 5.1 数学运算符
diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
b/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
index 219d6820..e16267f4 100644
--- a/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
+++ b/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
@@ -715,6 +715,58 @@ Total line number = 18
It costs 0.319s
```
+### 4.3 Extract 函数
+
+该函数用于提取日期对应部分的值。(V2.0.6 版本起支持)
+
+#### 4.3.1 语法定义
+
+```SQL
+EXTRACT (identifier FROM expression)
+```
+* 参数说明
+ * **expression**: `TIMESTAMP` 类型或时间常量
+ * **identifier** :取值范围及对应的返回值见下表
+
+ | 取值范围 | 返回值类型 | 返回值范围 |
+ | -------------------------- | ------------- | ------------- |
+ | `YEAR` | `INT64` | `/` |
+ | `QUARTER` | `INT64` | `1-4` |
+ | `MONTH` | `INT64` | `1-12` |
+ | `WEEK` | `INT64` | `1-53` |
+ | `DAY_OF_MONTH (DAY)` | `INT64` | `1-31` |
+ | `DAY_OF_WEEK (DOW)` | `INT64` | `1-7` |
+ | `DAY_OF_YEAR (DOY)` | `INT64` | `1-366` |
+ | `HOUR` | `INT64` | `0-23` |
+ | `MINUTE` | `INT64` | `0-59` |
+ | `SECOND` | `INT64` | `0-59` |
+ | `MS` | `INT64` | `0-999` |
+ | `US` | `INT64` | `0-999` |
+ | `NS` | `INT64` | `0-999` |
+
+
+#### 4.3.2 使用示例
+
+以[示例数据](../Reference/Sample-Data.md)中的 table1 为源数据,查询某段时间每天前12个小时的温度平均值
+
+```SQL
+IoTDB:database1> select format('%1$tY-%1$tm-%1$td',date_bin(1d,time)) as
fmtdate,avg(temperature) as avgtp from table1 where time >= 2024-11-26T00:00:00
and time <= 2024-11-30T23:59:59 and extract(hour from time) <= 12 group by
date_bin(1d,time) order by date_bin(1d,time)
++----------+-----+
+| fmtdate|avgtp|
++----------+-----+
+|2024-11-28| 86.0|
+|2024-11-29| 85.0|
+|2024-11-30| 90.0|
++----------+-----+
+Total line number = 3
+It costs 0.041s
+```
+
+`Format` 函数介绍:[Format 函数](../SQL-Manual/Basis-Function.md#_7-2-format-函数)
+
+`Date_bin` 函数介绍:[Date_bin
函数](../SQL-Manual/Basis-Function.md#_4-2-date-bin-interval-timestamp-timestamp-timestamp)
+
+
## 5. 数学函数和运算符
### 5.1 数学运算符
