(iotdb-docs) branch main updated: add approx_count_distinct aggregate function in table model (#754)

critas Tue, 03 Jun 2025 18:41:44 -0700

This is an automated email from the ASF dual-hosted git repository.

critas pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/iotdb-docs.git



The following commit(s) were added to refs/heads/main by this push:
     new e429339f add approx_count_distinct aggregate function in table model 
(#754)
e429339f is described below

commit e429339f8474598bc49f9da9cf8f84489cd529c8
Author: leto-b <[email protected]>
AuthorDate: Wed Jun 4 09:41:32 2025 +0800

    add approx_count_distinct aggregate function in table model (#754)
---
 .../Master/Table/SQL-Manual/Basis-Function.md      | 78 +++++++++++++--------
 .../latest-Table/SQL-Manual/Basis-Function.md      | 78 +++++++++++++--------
 .../Master/Table/SQL-Manual/Basis-Function.md      | 80 ++++++++++++++--------
 .../latest-Table/SQL-Manual/Basis-Function.md      | 80 ++++++++++++++--------
 4 files changed, 202 insertions(+), 114 deletions(-)

diff --git a/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md 
b/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
index 2a5dfcc2..44a0b67b 100644
--- a/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
+++ b/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
@@ -156,28 +156,29 @@ SELECT LEAST(temperature,humidity) FROM table2;
 
 ### 2.2 Supported Aggregate Functions                            
 
-| Function Name | Description                                                  
                                                                                
                                                                                
                                                                                
                                                                                
                                                | Allowed Input Types           
              [...]
-|:--------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------
 [...]
-| COUNT         | Counts the number of data points.                            
                                                                                
                                                                                
                                                                                
                                                                                
                                                | All types                     
              [...]
-| COUNT_IF      | COUNT_IF(exp) counts the number of rows that satisfy a 
specified boolean expression.                                                   
                                                                                
                                                                                
                                                                                
                                                      | `exp` must be a boolean 
expression,(e.g. `c [...]
-| SUM           | Calculates the sum.                                          
                                                                                
                                                                                
                                                                                
                                                                                
                                                | INT32 INT64 FLOAT DOUBLE      
              [...]
-| AVG           | Calculates the average.                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                | INT32 INT64 FLOAT DOUBLE      
              [...]
-| MAX           | Finds the maximum value.                                     
                                                                                
                                                                                
                                                                                
                                                                                
                                                | All types                     
              [...]
-| MIN           | Finds the minimum value.                                     
                                                                                
                                                                                
                                                                                
                                                                                
                                                | All types                     
              [...]
-| FIRST         | Finds the value with the smallest timestamp that is not 
NULL.                                                                           
                                                                                
                                                                                
                                                                                
                                                     | All types                
                   [...]
-| LAST          | Finds the value with the largest timestamp that is not NULL. 
                                                                                
                                                                                
                                                                                
                                                                                
                                                | All types                     
              [...]
-| STDDEV        | Alias for STDDEV_SAMP,  calculates the sample standard 
deviation.                                                                      
                                                                                
                                                                                
                                                                                
                                                      | INT32 INT64 FLOAT 
DOUBLE                    [...]
-| STDDEV_POP    | Calculates the population standard deviation.                
                                                                                
                                                                                
                                                                                
                                                                                
                                                | INT32 INT64 FLOAT DOUBLE      
              [...]
-| STDDEV_SAMP   | Calculates the sample standard deviation.                    
                                                                                
                                                                                
                                                                                
                                                                                
                                                | INT32 INT64 FLOAT DOUBLE      
              [...]
-| VARIANCE      | Alias for VAR_SAMP,  calculates the sample variance.         
                                                                                
                                                                                
                                                                                
                                                                                
                                                | INT32 INT64 FLOAT DOUBLE      
              [...]
-| VAR_POP       | Calculates the population variance.                          
                                                                                
                                                                                
                                                                                
                                                                                
                                                | INT32 INT64 FLOAT DOUBLE      
              [...]
-| VAR_SAMP      | Calculates the sample variance.                              
                                                                                
                                                                                
                                                                                
                                                                                
                                                | INT32 INT64 FLOAT DOUBLE      
              [...]
-| EXTREME       | Finds the value with the largest absolute value. If the 
largest absolute values of positive and negative values are equal, returns the 
positive value.                                                                 
                                                                                
                                                                                
                                                      | INT32 INT64 FLOAT 
DOUBLE                    [...]
-| MODE          | Finds the mode. Note: 1. There is a risk of memory exception 
when the number of distinct values in the input sequence is too large; 2. If 
all elements have the same frequency, i.e., there is no mode, a random element 
is returned; 3. If there are multiple modes, a random mode is returned; 4. NULL 
values are also counted in frequency, so even if not all values in the input 
sequence are NULL, the final result may still be NULL. | All types              
                     [...]
-| MAX_BY        | MAX_BY(x, y) finds the value of x corresponding to the 
maximum y in the binary input x and y. MAX_BY(time, x) returns the timestamp 
when x is at its maximum.                                                       
                                                                                
                                                                                
                                                         | x and y can be of 
any type                  [...]
-| MIN_BY        | MIN_BY(x, y) finds the value of x corresponding to the 
minimum y in the binary input x and y. MIN_BY(time, x) returns the timestamp 
when x is at its minimum.                                                       
                                                                                
                                                                                
                                                         | x and y can be of 
any type                  [...]
-| FIRST_BY      | FIRST_BY(x, y) finds the value of x in the same row when y 
is the first non-null value.                                                    
                                                                                
                                                                                
                                                                                
                                                  | x and y can be of any type  
                [...]
-| LAST_BY       | LAST_BY(x, y) finds the value of x in the same row when y is 
the last non-null value.                                                        
                                                                                
                                                                                
                                                                                
                                                | x and y can be of any type    
              [...]
+| Function Name          | Description                                         
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+|:-----------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 [...]
+| COUNT                  | Counts the number of data points.                   
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| COUNT_IF               | COUNT_IF(exp) counts the number of rows that 
satisfy a specified boolean expression.                                         
                                                                                
                                                                                
                                                                                
                                                                                
                     [...]
+| APPROX_COUNT_DISTINCT  | The APPROX_COUNT_DISTINCT(x[, maxStandardError]) 
function provides an approximation of COUNT(DISTINCT x), returning the 
estimated number of distinct input values.                                      
                                                                                
                                                                                
                                                                                
                          [...]
+| SUM                    | Calculates the sum.                                 
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| AVG                    | Calculates the average.                             
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| MAX                    | Finds the maximum value.                            
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| MIN                    | Finds the minimum value.                            
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| FIRST                  | Finds the value with the smallest timestamp that is 
not NULL.                                                                       
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| LAST                   | Finds the value with the largest timestamp that is 
not NULL.                                                                       
                                                                                
                                                                                
                                                                                
                                                                                
               [...]
+| STDDEV                 | Alias for STDDEV_SAMP,  calculates the sample 
standard deviation.                                                             
                                                                                
                                                                                
                                                                                
                                                                                
                    [...]
+| STDDEV_POP             | Calculates the population standard deviation.       
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| STDDEV_SAMP            | Calculates the sample standard deviation.           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| VARIANCE               | Alias for VAR_SAMP,  calculates the sample 
variance.                                                                       
                                                                                
                                                                                
                                                                                
                                                                                
                       [...]
+| VAR_POP                | Calculates the population variance.                 
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| VAR_SAMP               | Calculates the sample variance.                     
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| EXTREME                | Finds the value with the largest absolute value. If 
the largest absolute values of positive and negative values are equal, returns 
the positive value.                                                             
                                                                                
                                                                                
                                                                                
               [...]
+| MODE                   | Finds the mode. Note: 1. There is a risk of memory 
exception when the number of distinct values in the input sequence is too 
large; 2. If all elements have the same frequency, i.e., there is no mode, a 
random element is returned; 3. If there are multiple modes, a random mode is 
returned; 4. NULL values are also counted in frequency, so even if not all 
values in the input sequence are NULL, the final result may still be NULL.      
                                [...]
+| MAX_BY                 | MAX_BY(x, y) finds the value of x corresponding to 
the maximum y in the binary input x and y. MAX_BY(time, x) returns the 
timestamp when x is at its maximum.                                             
                                                                                
                                                                                
                                                                                
                        [...]
+| MIN_BY                 | MIN_BY(x, y) finds the value of x corresponding to 
the minimum y in the binary input x and y. MIN_BY(time, x) returns the 
timestamp when x is at its minimum.                                             
                                                                                
                                                                                
                                                                                
                        [...]
+| FIRST_BY               | FIRST_BY(x, y) finds the value of x in the same row 
when y is the first non-null value.                                             
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| LAST_BY                | LAST_BY(x, y) finds the value of x in the same row 
when y is the last non-null value.                                              
                                                                                
                                                                                
                                                                                
                                                                                
               [...]
 
 
 ### 2.3 Examples
@@ -229,8 +230,29 @@ Total line number = 1
 It costs 0.047s
 ```
 
+#### 2.3.4 Approx_count_distinct
 
-#### 2.3.4 First
+Retrieve the number of distinct values in the `temperature` column from 
`table1`.
+
+```sql
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, 
APPROX_COUNT_DISTINCT(temperature) as approx FROM table1;
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, 
APPROX_COUNT_DISTINCT(temperature,0.006) as approx FROM table1;
+```
+
+The execution result is as follows:
+
+```sql
++------+------+
+|origin|approx|
++------+------+
+|     3|     3|
++------+------+
+Total line number = 1
+It costs 0.022s
+```
+
+
+#### 2.3.5 First
 
 Finds the values with the smallest timestamp that are not NULL in the 
`temperature` and `humidity` columns.
 
@@ -250,7 +272,7 @@ Total line number = 1
 It costs 0.170s
 ```
 
-#### 2.3.5 Last
+#### 2.3.6 Last
 
 Finds the values with the largest timestamp that are not NULL in the 
`temperature` and `humidity` columns.
 
@@ -270,7 +292,7 @@ Total line number = 1
 It costs 0.211s
 ```
 
-#### 2.3.6 First_by
+#### 2.3.7 First_by
 
 Finds the `time` value of the row with the smallest timestamp that is not NULL 
in the `temperature` column, and the `humidity` value of the row with the 
smallest timestamp that is not NULL in the `temperature` column.
 
@@ -290,7 +312,7 @@ Total line number = 1
 It costs 0.269s
 ```
 
-#### 2.3.7 Last_by
+#### 2.3.8 Last_by
 
 Queries the `time` value of the row with the largest timestamp that is not 
NULL in the `temperature` column, and the `humidity` value of the row with the 
largest timestamp that is not NULL in the `temperature` column.
 
@@ -310,7 +332,7 @@ Total line number = 1
 It costs 0.070s
 ```
 
-#### 2.3.8 Max_by
+#### 2.3.9 Max_by
 
 Queries the `time` value of the row where the `temperature` column is at its 
maximum, and the `humidity` value of the row where the `temperature` column is 
at its maximum.
 
@@ -330,7 +352,7 @@ Total line number = 1
 It costs 0.172s
 ```
 
-#### 2.3.9 Min_by
+#### 2.3.10 Min_by
 
 Queries the `time` value of the row where the `temperature` column is at its 
minimum, and the `humidity` value of the row where the `temperature` column is 
at its minimum.
 
diff --git a/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md 
b/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
index 2a5dfcc2..44a0b67b 100644
--- a/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
+++ b/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
@@ -156,28 +156,29 @@ SELECT LEAST(temperature,humidity) FROM table2;
 
 ### 2.2 Supported Aggregate Functions                            
 
-| Function Name | Description                                                  
                                                                                
                                                                                
                                                                                
                                                                                
                                                | Allowed Input Types           
              [...]
-|:--------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------
 [...]
-| COUNT         | Counts the number of data points.                            
                                                                                
                                                                                
                                                                                
                                                                                
                                                | All types                     
              [...]
-| COUNT_IF      | COUNT_IF(exp) counts the number of rows that satisfy a 
specified boolean expression.                                                   
                                                                                
                                                                                
                                                                                
                                                      | `exp` must be a boolean 
expression,(e.g. `c [...]
-| SUM           | Calculates the sum.                                          
                                                                                
                                                                                
                                                                                
                                                                                
                                                | INT32 INT64 FLOAT DOUBLE      
              [...]
-| AVG           | Calculates the average.                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                | INT32 INT64 FLOAT DOUBLE      
              [...]
-| MAX           | Finds the maximum value.                                     
                                                                                
                                                                                
                                                                                
                                                                                
                                                | All types                     
              [...]
-| MIN           | Finds the minimum value.                                     
                                                                                
                                                                                
                                                                                
                                                                                
                                                | All types                     
              [...]
-| FIRST         | Finds the value with the smallest timestamp that is not 
NULL.                                                                           
                                                                                
                                                                                
                                                                                
                                                     | All types                
                   [...]
-| LAST          | Finds the value with the largest timestamp that is not NULL. 
                                                                                
                                                                                
                                                                                
                                                                                
                                                | All types                     
              [...]
-| STDDEV        | Alias for STDDEV_SAMP,  calculates the sample standard 
deviation.                                                                      
                                                                                
                                                                                
                                                                                
                                                      | INT32 INT64 FLOAT 
DOUBLE                    [...]
-| STDDEV_POP    | Calculates the population standard deviation.                
                                                                                
                                                                                
                                                                                
                                                                                
                                                | INT32 INT64 FLOAT DOUBLE      
              [...]
-| STDDEV_SAMP   | Calculates the sample standard deviation.                    
                                                                                
                                                                                
                                                                                
                                                                                
                                                | INT32 INT64 FLOAT DOUBLE      
              [...]
-| VARIANCE      | Alias for VAR_SAMP,  calculates the sample variance.         
                                                                                
                                                                                
                                                                                
                                                                                
                                                | INT32 INT64 FLOAT DOUBLE      
              [...]
-| VAR_POP       | Calculates the population variance.                          
                                                                                
                                                                                
                                                                                
                                                                                
                                                | INT32 INT64 FLOAT DOUBLE      
              [...]
-| VAR_SAMP      | Calculates the sample variance.                              
                                                                                
                                                                                
                                                                                
                                                                                
                                                | INT32 INT64 FLOAT DOUBLE      
              [...]
-| EXTREME       | Finds the value with the largest absolute value. If the 
largest absolute values of positive and negative values are equal, returns the 
positive value.                                                                 
                                                                                
                                                                                
                                                      | INT32 INT64 FLOAT 
DOUBLE                    [...]
-| MODE          | Finds the mode. Note: 1. There is a risk of memory exception 
when the number of distinct values in the input sequence is too large; 2. If 
all elements have the same frequency, i.e., there is no mode, a random element 
is returned; 3. If there are multiple modes, a random mode is returned; 4. NULL 
values are also counted in frequency, so even if not all values in the input 
sequence are NULL, the final result may still be NULL. | All types              
                     [...]
-| MAX_BY        | MAX_BY(x, y) finds the value of x corresponding to the 
maximum y in the binary input x and y. MAX_BY(time, x) returns the timestamp 
when x is at its maximum.                                                       
                                                                                
                                                                                
                                                         | x and y can be of 
any type                  [...]
-| MIN_BY        | MIN_BY(x, y) finds the value of x corresponding to the 
minimum y in the binary input x and y. MIN_BY(time, x) returns the timestamp 
when x is at its minimum.                                                       
                                                                                
                                                                                
                                                         | x and y can be of 
any type                  [...]
-| FIRST_BY      | FIRST_BY(x, y) finds the value of x in the same row when y 
is the first non-null value.                                                    
                                                                                
                                                                                
                                                                                
                                                  | x and y can be of any type  
                [...]
-| LAST_BY       | LAST_BY(x, y) finds the value of x in the same row when y is 
the last non-null value.                                                        
                                                                                
                                                                                
                                                                                
                                                | x and y can be of any type    
              [...]
+| Function Name          | Description                                         
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+|:-----------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 [...]
+| COUNT                  | Counts the number of data points.                   
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| COUNT_IF               | COUNT_IF(exp) counts the number of rows that 
satisfy a specified boolean expression.                                         
                                                                                
                                                                                
                                                                                
                                                                                
                     [...]
+| APPROX_COUNT_DISTINCT  | The APPROX_COUNT_DISTINCT(x[, maxStandardError]) 
function provides an approximation of COUNT(DISTINCT x), returning the 
estimated number of distinct input values.                                      
                                                                                
                                                                                
                                                                                
                          [...]
+| SUM                    | Calculates the sum.                                 
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| AVG                    | Calculates the average.                             
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| MAX                    | Finds the maximum value.                            
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| MIN                    | Finds the minimum value.                            
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| FIRST                  | Finds the value with the smallest timestamp that is 
not NULL.                                                                       
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| LAST                   | Finds the value with the largest timestamp that is 
not NULL.                                                                       
                                                                                
                                                                                
                                                                                
                                                                                
               [...]
+| STDDEV                 | Alias for STDDEV_SAMP,  calculates the sample 
standard deviation.                                                             
                                                                                
                                                                                
                                                                                
                                                                                
                    [...]
+| STDDEV_POP             | Calculates the population standard deviation.       
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| STDDEV_SAMP            | Calculates the sample standard deviation.           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| VARIANCE               | Alias for VAR_SAMP,  calculates the sample 
variance.                                                                       
                                                                                
                                                                                
                                                                                
                                                                                
                       [...]
+| VAR_POP                | Calculates the population variance.                 
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| VAR_SAMP               | Calculates the sample variance.                     
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| EXTREME                | Finds the value with the largest absolute value. If 
the largest absolute values of positive and negative values are equal, returns 
the positive value.                                                             
                                                                                
                                                                                
                                                                                
               [...]
+| MODE                   | Finds the mode. Note: 1. There is a risk of memory 
exception when the number of distinct values in the input sequence is too 
large; 2. If all elements have the same frequency, i.e., there is no mode, a 
random element is returned; 3. If there are multiple modes, a random mode is 
returned; 4. NULL values are also counted in frequency, so even if not all 
values in the input sequence are NULL, the final result may still be NULL.      
                                [...]
+| MAX_BY                 | MAX_BY(x, y) finds the value of x corresponding to 
the maximum y in the binary input x and y. MAX_BY(time, x) returns the 
timestamp when x is at its maximum.                                             
                                                                                
                                                                                
                                                                                
                        [...]
+| MIN_BY                 | MIN_BY(x, y) finds the value of x corresponding to 
the minimum y in the binary input x and y. MIN_BY(time, x) returns the 
timestamp when x is at its minimum.                                             
                                                                                
                                                                                
                                                                                
                        [...]
+| FIRST_BY               | FIRST_BY(x, y) finds the value of x in the same row 
when y is the first non-null value.                                             
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+| LAST_BY                | LAST_BY(x, y) finds the value of x in the same row 
when y is the last non-null value.                                              
                                                                                
                                                                                
                                                                                
                                                                                
               [...]
 
 
 ### 2.3 Examples
@@ -229,8 +230,29 @@ Total line number = 1
 It costs 0.047s
 ```
 
+#### 2.3.4 Approx_count_distinct
 
-#### 2.3.4 First
+Retrieve the number of distinct values in the `temperature` column from 
`table1`.
+
+```sql
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, 
APPROX_COUNT_DISTINCT(temperature) as approx FROM table1;
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, 
APPROX_COUNT_DISTINCT(temperature,0.006) as approx FROM table1;
+```
+
+The execution result is as follows:
+
+```sql
++------+------+
+|origin|approx|
++------+------+
+|     3|     3|
++------+------+
+Total line number = 1
+It costs 0.022s
+```
+
+
+#### 2.3.5 First
 
 Finds the values with the smallest timestamp that are not NULL in the 
`temperature` and `humidity` columns.
 
@@ -250,7 +272,7 @@ Total line number = 1
 It costs 0.170s
 ```
 
-#### 2.3.5 Last
+#### 2.3.6 Last
 
 Finds the values with the largest timestamp that are not NULL in the 
`temperature` and `humidity` columns.
 
@@ -270,7 +292,7 @@ Total line number = 1
 It costs 0.211s
 ```
 
-#### 2.3.6 First_by
+#### 2.3.7 First_by
 
 Finds the `time` value of the row with the smallest timestamp that is not NULL 
in the `temperature` column, and the `humidity` value of the row with the 
smallest timestamp that is not NULL in the `temperature` column.
 
@@ -290,7 +312,7 @@ Total line number = 1
 It costs 0.269s
 ```
 
-#### 2.3.7 Last_by
+#### 2.3.8 Last_by
 
 Queries the `time` value of the row with the largest timestamp that is not 
NULL in the `temperature` column, and the `humidity` value of the row with the 
largest timestamp that is not NULL in the `temperature` column.
 
@@ -310,7 +332,7 @@ Total line number = 1
 It costs 0.070s
 ```
 
-#### 2.3.8 Max_by
+#### 2.3.9 Max_by
 
 Queries the `time` value of the row where the `temperature` column is at its 
maximum, and the `humidity` value of the row where the `temperature` column is 
at its maximum.
 
@@ -330,7 +352,7 @@ Total line number = 1
 It costs 0.172s
 ```
 
-#### 2.3.9 Min_by
+#### 2.3.10 Min_by
 
 Queries the `time` value of the row where the `temperature` column is at its 
minimum, and the `humidity` value of the row where the `temperature` column is 
at its minimum.
 
diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md 
b/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
index 33ca019b..528d738e 100644
--- a/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
+++ b/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md
@@ -155,28 +155,29 @@ SELECT LEAST(temperature,humidity) FROM table2;
 
 ### 2.2 支持的聚合函数                          
 
-| 函数名      | 功能描述                                                     | 
允许的输入类型                                       | 输出类型             |
-| ----------- | ------------------------------------------------------------ 
|-----------------------------------------------|------------------|
-| COUNT       | 计算数据点数。                                               | 所有类型   
                                       | INT64            |
-| COUNT_IF    | COUNT_IF(exp) 用于统计满足指定布尔表达式的记录行数  | exp 必须是一个布尔类型的表达式，例如 
count_if(temperature>20) | INT64            |
-| SUM         | 求和。                                                       | 
INT32 INT64 FLOAT DOUBLE                      | DOUBLE           |
-| AVG         | 求平均值。                                                   | 
INT32 INT64 FLOAT DOUBLE                      | DOUBLE           |
-| MAX         | 求最大值。                                                   | 所有类型 
                                         | 与输入类型一致          |
-| MIN         | 求最小值。                                                   | 所有类型 
                                         | 与输入类型一致          |
-| FIRST       | 求时间戳最小且不为 NULL 的值。                               | 所有类型        
                                  | 与输入类型一致          |
-| LAST        | 求时间戳最大且不为 NULL 的值。                               | 所有类型        
                                  | 与输入类型一致          |
-| STDDEV      | STDDEV_SAMP 的别名，求样本标准差。                           | INT32 
INT64 FLOAT DOUBLE                      | DOUBLE           |
-| STDDEV_POP  | 求总体标准差。                                               | INT32 
INT64 FLOAT DOUBLE                      | DOUBLE           |
-| STDDEV_SAMP | 求样本标准差。                                               | INT32 
INT64 FLOAT DOUBLE                      | DOUBLE           |
-| VARIANCE    | VAR_SAMP 的别名，求样本方差。                                | INT32 
INT64 FLOAT DOUBLE                      | DOUBLE           |
-| VAR_POP     | 求总体方差。                                                 | INT32 
INT64 FLOAT DOUBLE                      | DOUBLE           |
-| VAR_SAMP    | 求样本方差。                                                 | INT32 
INT64 FLOAT DOUBLE                      | DOUBLE           |
-| EXTREME     | 求具有最大绝对值的值。如果正值和负值的最大绝对值相等，则返回正值。 | INT32 INT64 FLOAT DOUBLE   
                   | 与输入类型一致          |
-| MODE        | 求众数。注意： 1.输入序列的不同值个数过多时会有内存异常风险; 
2.如果所有元素出现的频次相同，即没有众数，则随机返回一个元素; 3.如果有多个众数，则随机返回一个众数； 4. NULL 
值也会被统计频次，所以即使输入序列的值不全为 NULL，最终结果也可能为 NULL。 | 所有类型                               
           | 与输入类型一致          |
-| MAX_BY      | MAX_BY(x, y) 求二元输入 x 和 y 在 y 最大时对应的 x 的值。MAX_BY(time, x) 返回 x 
取最大值时对应的时间戳。 | x 和 y 可以是任意类型                                 | 与第一个输入 x 的数据类型一致 
|
-| MIN_BY      | MIN_BY(x, y) 求二元输入 x 和 y 在 y 最小时对应的 x 的值。MIN_BY(time, x) 返回 x 
取最小值时对应的时间戳。 | x 和 y 可以是任意类型                                 | 与第一个输入 x 的数据类型一致 
|
-| FIRST_BY    | FIRST_BY(x, y) 求当 y 为第一个不为 NULL 的值时，同一行里对应的 x 值。 | x 和 y 
可以是任意类型                                 | 与第一个输入 x 的数据类型一致 |
-| LAST_BY     | LAST_BY(x, y) 求当 y 为最后一个不为 NULL 的值时，同一行里对应的 x 值。 | x 和 y 
可以是任意类型                                 | 与第一个输入 x 的数据类型一致 |
+| 函数名                   | 功能描述                                                 
                                                                                
    | 允许的输入类型                                                                   
              | 输出类型             |
+|-----------------------|------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------|------------------|
+| COUNT                 | 计算数据点数。                                              
                                                                                
    | 所有类型                                                                      
              | INT64            |
+| COUNT_IF              | COUNT_IF(exp) 用于统计满足指定布尔表达式的记录行数                     
                                                                                
    | exp 必须是一个布尔类型的表达式，例如 count_if(temperature>20)                             
              | INT64            |
+| APPROX_COUNT_DISTINCT | APPROX_COUNT_DISTINCT(x[,maxStandardError]) 函数提供 
COUNT(DISTINCT x) 的近似值，返回不同输入值的近似个数。                                 | 
x：待计算列，支持所有类型；<br> maxStandardError：指定该函数应产生的最大标准误差，取值范围[0.0040625, 
0.26]，未指定值时默认0.023。 | INT64            |
+| SUM                   | 求和。                                                  
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| AVG                   | 求平均值。                                                
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| MAX                   | 求最大值。                                                
                                                                                
    | 所有类型                                                                      
              | 与输入类型一致          |
+| MIN                   | 求最小值。                                                
                                                                                
    | 所有类型                                                                      
              | 与输入类型一致          |
+| FIRST                 | 求时间戳最小且不为 NULL 的值。                                   
                                                                                
    | 所有类型                                                                      
              | 与输入类型一致          |
+| LAST                  | 求时间戳最大且不为 NULL 的值。                                   
                                                                                
    | 所有类型                                                                      
              | 与输入类型一致          |
+| STDDEV                | STDDEV_SAMP 的别名，求样本标准差。                              
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| STDDEV_POP            | 求总体标准差。                                              
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| STDDEV_SAMP           | 求样本标准差。                                              
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| VARIANCE              | VAR_SAMP 的别名，求样本方差。                                  
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| VAR_POP               | 求总体方差。                                               
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| VAR_SAMP              | 求样本方差。                                               
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| EXTREME               | 求具有最大绝对值的值。如果正值和负值的最大绝对值相等，则返回正值。                    
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | 与输入类型一致          |
+| MODE                  | 求众数。注意： 1.输入序列的不同值个数过多时会有内存异常风险; 
2.如果所有元素出现的频次相同，即没有众数，则随机返回一个元素; 3.如果有多个众数，则随机返回一个众数； 4. NULL 
值也会被统计频次，所以即使输入序列的值不全为 NULL，最终结果也可能为 NULL。 | 所有类型                               
                                                     | 与输入类型一致          |
+| MAX_BY                | MAX_BY(x, y) 求二元输入 x 和 y 在 y 最大时对应的 x 
的值。MAX_BY(time, x) 返回 x 取最大值时对应的时间戳。                                            
                   | x 和 y 可以是任意类型                                              
                             | 与第一个输入 x 的数据类型一致 |
+| MIN_BY                | MIN_BY(x, y) 求二元输入 x 和 y 在 y 最小时对应的 x 
的值。MIN_BY(time, x) 返回 x 取最小值时对应的时间戳。                                            
                   | x 和 y 可以是任意类型                                              
                             | 与第一个输入 x 的数据类型一致 |
+| FIRST_BY              | FIRST_BY(x, y) 求当 y 为第一个不为 NULL 的值时，同一行里对应的 x 值。     
                                                                                
    | x 和 y 可以是任意类型                                                             
              | 与第一个输入 x 的数据类型一致 |
+| LAST_BY               | LAST_BY(x, y) 求当 y 为最后一个不为 NULL 的值时，同一行里对应的 x 值。     
                                                                                
    | x 和 y 可以是任意类型                                                             
              | 与第一个输入 x 的数据类型一致 |
 
 
 ### 2.3 示例
@@ -213,7 +214,7 @@ It costs 0.834s
 统计 `table2` 中 到达时间 `arrival_time` 不是 `null` 的记录行数。
 
 ```sql
-select count_if(arrival_time is not null) from table2;
+IoTDB> select count_if(arrival_time is not null) from table2;
 ```
 
 执行结果如下：
@@ -228,8 +229,29 @@ Total line number = 1
 It costs 0.047s
 ```
 
+#### 2.3.4 Approx_count_distinct
 
-#### 2.3.4 First
+查询 `table1` 中 `temperature` 列不同值的个数。
+
+```sql
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, 
APPROX_COUNT_DISTINCT(temperature) as approx FROM table1;
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, 
APPROX_COUNT_DISTINCT(temperature,0.006) as approx FROM table1;
+```
+
+执行结果如下：
+
+```sql
++------+------+
+|origin|approx|
++------+------+
+|     3|     3|
++------+------+
+Total line number = 1
+It costs 0.022s
+```
+
+
+#### 2.3.5 First
 
 查询`temperature`列、`humidity`列时间戳最小且不为 NULL 的值。
 
@@ -249,7 +271,7 @@ Total line number = 1
 It costs 0.170s
 ```
 
-#### 2.3.5 Last
+#### 2.3.6 Last
 
 查询`temperature`列、`humidity`列时间戳最大且不为 NULL 的值。
 
@@ -269,7 +291,7 @@ Total line number = 1
 It costs 0.211s
 ```
 
-#### 2.3.6 First_by
+#### 2.3.7 First_by
 
 查询 `temperature` 列中非 NULL 且时间戳最小的行的 `time` 值，以及 `temperature` 列中非 NULL 
且时间戳最小的行的 `humidity` 值。
 
@@ -289,7 +311,7 @@ Total line number = 1
 It costs 0.269s
 ```
 
-#### 2.3.7 Last_by
+#### 2.3.8 Last_by
 
 查询`temperature` 列中非 NULL 且时间戳最大的行的 `time` 值，以及 `temperature` 列中非 NULL 
且时间戳最大的行的 `humidity` 值。
 
@@ -309,7 +331,7 @@ Total line number = 1
 It costs 0.070s
 ```
 
-#### 2.3.8 Max_by
+#### 2.3.9 Max_by
 
 查询`temperature` 列中最大值所在行的 `time` 值，以及`temperature` 列中最大值所在行的 `humidity` 值。
 
@@ -329,7 +351,7 @@ Total line number = 1
 It costs 0.172s
 ```
 
-#### 2.3.9 Min_by
+#### 2.3.10 Min_by
 
 查询`temperature` 列中最小值所在行的 `time` 值，以及`temperature` 列中最小值所在行的 `humidity` 值。
 
diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md 
b/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
index 33ca019b..528d738e 100644
--- a/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
+++ b/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md
@@ -155,28 +155,29 @@ SELECT LEAST(temperature,humidity) FROM table2;
 
 ### 2.2 支持的聚合函数                          
 
-| 函数名      | 功能描述                                                     | 
允许的输入类型                                       | 输出类型             |
-| ----------- | ------------------------------------------------------------ 
|-----------------------------------------------|------------------|
-| COUNT       | 计算数据点数。                                               | 所有类型   
                                       | INT64            |
-| COUNT_IF    | COUNT_IF(exp) 用于统计满足指定布尔表达式的记录行数  | exp 必须是一个布尔类型的表达式，例如 
count_if(temperature>20) | INT64            |
-| SUM         | 求和。                                                       | 
INT32 INT64 FLOAT DOUBLE                      | DOUBLE           |
-| AVG         | 求平均值。                                                   | 
INT32 INT64 FLOAT DOUBLE                      | DOUBLE           |
-| MAX         | 求最大值。                                                   | 所有类型 
                                         | 与输入类型一致          |
-| MIN         | 求最小值。                                                   | 所有类型 
                                         | 与输入类型一致          |
-| FIRST       | 求时间戳最小且不为 NULL 的值。                               | 所有类型        
                                  | 与输入类型一致          |
-| LAST        | 求时间戳最大且不为 NULL 的值。                               | 所有类型        
                                  | 与输入类型一致          |
-| STDDEV      | STDDEV_SAMP 的别名，求样本标准差。                           | INT32 
INT64 FLOAT DOUBLE                      | DOUBLE           |
-| STDDEV_POP  | 求总体标准差。                                               | INT32 
INT64 FLOAT DOUBLE                      | DOUBLE           |
-| STDDEV_SAMP | 求样本标准差。                                               | INT32 
INT64 FLOAT DOUBLE                      | DOUBLE           |
-| VARIANCE    | VAR_SAMP 的别名，求样本方差。                                | INT32 
INT64 FLOAT DOUBLE                      | DOUBLE           |
-| VAR_POP     | 求总体方差。                                                 | INT32 
INT64 FLOAT DOUBLE                      | DOUBLE           |
-| VAR_SAMP    | 求样本方差。                                                 | INT32 
INT64 FLOAT DOUBLE                      | DOUBLE           |
-| EXTREME     | 求具有最大绝对值的值。如果正值和负值的最大绝对值相等，则返回正值。 | INT32 INT64 FLOAT DOUBLE   
                   | 与输入类型一致          |
-| MODE        | 求众数。注意： 1.输入序列的不同值个数过多时会有内存异常风险; 
2.如果所有元素出现的频次相同，即没有众数，则随机返回一个元素; 3.如果有多个众数，则随机返回一个众数； 4. NULL 
值也会被统计频次，所以即使输入序列的值不全为 NULL，最终结果也可能为 NULL。 | 所有类型                               
           | 与输入类型一致          |
-| MAX_BY      | MAX_BY(x, y) 求二元输入 x 和 y 在 y 最大时对应的 x 的值。MAX_BY(time, x) 返回 x 
取最大值时对应的时间戳。 | x 和 y 可以是任意类型                                 | 与第一个输入 x 的数据类型一致 
|
-| MIN_BY      | MIN_BY(x, y) 求二元输入 x 和 y 在 y 最小时对应的 x 的值。MIN_BY(time, x) 返回 x 
取最小值时对应的时间戳。 | x 和 y 可以是任意类型                                 | 与第一个输入 x 的数据类型一致 
|
-| FIRST_BY    | FIRST_BY(x, y) 求当 y 为第一个不为 NULL 的值时，同一行里对应的 x 值。 | x 和 y 
可以是任意类型                                 | 与第一个输入 x 的数据类型一致 |
-| LAST_BY     | LAST_BY(x, y) 求当 y 为最后一个不为 NULL 的值时，同一行里对应的 x 值。 | x 和 y 
可以是任意类型                                 | 与第一个输入 x 的数据类型一致 |
+| 函数名                   | 功能描述                                                 
                                                                                
    | 允许的输入类型                                                                   
              | 输出类型             |
+|-----------------------|------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------|------------------|
+| COUNT                 | 计算数据点数。                                              
                                                                                
    | 所有类型                                                                      
              | INT64            |
+| COUNT_IF              | COUNT_IF(exp) 用于统计满足指定布尔表达式的记录行数                     
                                                                                
    | exp 必须是一个布尔类型的表达式，例如 count_if(temperature>20)                             
              | INT64            |
+| APPROX_COUNT_DISTINCT | APPROX_COUNT_DISTINCT(x[,maxStandardError]) 函数提供 
COUNT(DISTINCT x) 的近似值，返回不同输入值的近似个数。                                 | 
x：待计算列，支持所有类型；<br> maxStandardError：指定该函数应产生的最大标准误差，取值范围[0.0040625, 
0.26]，未指定值时默认0.023。 | INT64            |
+| SUM                   | 求和。                                                  
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| AVG                   | 求平均值。                                                
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| MAX                   | 求最大值。                                                
                                                                                
    | 所有类型                                                                      
              | 与输入类型一致          |
+| MIN                   | 求最小值。                                                
                                                                                
    | 所有类型                                                                      
              | 与输入类型一致          |
+| FIRST                 | 求时间戳最小且不为 NULL 的值。                                   
                                                                                
    | 所有类型                                                                      
              | 与输入类型一致          |
+| LAST                  | 求时间戳最大且不为 NULL 的值。                                   
                                                                                
    | 所有类型                                                                      
              | 与输入类型一致          |
+| STDDEV                | STDDEV_SAMP 的别名，求样本标准差。                              
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| STDDEV_POP            | 求总体标准差。                                              
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| STDDEV_SAMP           | 求样本标准差。                                              
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| VARIANCE              | VAR_SAMP 的别名，求样本方差。                                  
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| VAR_POP               | 求总体方差。                                               
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| VAR_SAMP              | 求样本方差。                                               
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | DOUBLE           |
+| EXTREME               | 求具有最大绝对值的值。如果正值和负值的最大绝对值相等，则返回正值。                    
                                                                                
    | INT32 INT64 FLOAT DOUBLE                                                  
              | 与输入类型一致          |
+| MODE                  | 求众数。注意： 1.输入序列的不同值个数过多时会有内存异常风险; 
2.如果所有元素出现的频次相同，即没有众数，则随机返回一个元素; 3.如果有多个众数，则随机返回一个众数； 4. NULL 
值也会被统计频次，所以即使输入序列的值不全为 NULL，最终结果也可能为 NULL。 | 所有类型                               
                                                     | 与输入类型一致          |
+| MAX_BY                | MAX_BY(x, y) 求二元输入 x 和 y 在 y 最大时对应的 x 
的值。MAX_BY(time, x) 返回 x 取最大值时对应的时间戳。                                            
                   | x 和 y 可以是任意类型                                              
                             | 与第一个输入 x 的数据类型一致 |
+| MIN_BY                | MIN_BY(x, y) 求二元输入 x 和 y 在 y 最小时对应的 x 
的值。MIN_BY(time, x) 返回 x 取最小值时对应的时间戳。                                            
                   | x 和 y 可以是任意类型                                              
                             | 与第一个输入 x 的数据类型一致 |
+| FIRST_BY              | FIRST_BY(x, y) 求当 y 为第一个不为 NULL 的值时，同一行里对应的 x 值。     
                                                                                
    | x 和 y 可以是任意类型                                                             
              | 与第一个输入 x 的数据类型一致 |
+| LAST_BY               | LAST_BY(x, y) 求当 y 为最后一个不为 NULL 的值时，同一行里对应的 x 值。     
                                                                                
    | x 和 y 可以是任意类型                                                             
              | 与第一个输入 x 的数据类型一致 |
 
 
 ### 2.3 示例
@@ -213,7 +214,7 @@ It costs 0.834s
 统计 `table2` 中 到达时间 `arrival_time` 不是 `null` 的记录行数。
 
 ```sql
-select count_if(arrival_time is not null) from table2;
+IoTDB> select count_if(arrival_time is not null) from table2;
 ```
 
 执行结果如下：
@@ -228,8 +229,29 @@ Total line number = 1
 It costs 0.047s
 ```
 
+#### 2.3.4 Approx_count_distinct
 
-#### 2.3.4 First
+查询 `table1` 中 `temperature` 列不同值的个数。
+
+```sql
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, 
APPROX_COUNT_DISTINCT(temperature) as approx FROM table1;
+IoTDB> SELECT COUNT(DISTINCT temperature) as origin, 
APPROX_COUNT_DISTINCT(temperature,0.006) as approx FROM table1;
+```
+
+执行结果如下：
+
+```sql
++------+------+
+|origin|approx|
++------+------+
+|     3|     3|
++------+------+
+Total line number = 1
+It costs 0.022s
+```
+
+
+#### 2.3.5 First
 
 查询`temperature`列、`humidity`列时间戳最小且不为 NULL 的值。
 
@@ -249,7 +271,7 @@ Total line number = 1
 It costs 0.170s
 ```
 
-#### 2.3.5 Last
+#### 2.3.6 Last
 
 查询`temperature`列、`humidity`列时间戳最大且不为 NULL 的值。
 
@@ -269,7 +291,7 @@ Total line number = 1
 It costs 0.211s
 ```
 
-#### 2.3.6 First_by
+#### 2.3.7 First_by
 
 查询 `temperature` 列中非 NULL 且时间戳最小的行的 `time` 值，以及 `temperature` 列中非 NULL 
且时间戳最小的行的 `humidity` 值。
 
@@ -289,7 +311,7 @@ Total line number = 1
 It costs 0.269s
 ```
 
-#### 2.3.7 Last_by
+#### 2.3.8 Last_by
 
 查询`temperature` 列中非 NULL 且时间戳最大的行的 `time` 值，以及 `temperature` 列中非 NULL 
且时间戳最大的行的 `humidity` 值。
 
@@ -309,7 +331,7 @@ Total line number = 1
 It costs 0.070s
 ```
 
-#### 2.3.8 Max_by
+#### 2.3.9 Max_by
 
 查询`temperature` 列中最大值所在行的 `time` 值，以及`temperature` 列中最大值所在行的 `humidity` 值。
 
@@ -329,7 +351,7 @@ Total line number = 1
 It costs 0.172s
 ```
 
-#### 2.3.9 Min_by
+#### 2.3.10 Min_by
 
 查询`temperature` 列中最小值所在行的 `time` 值，以及`temperature` 列中最小值所在行的 `humidity` 值。

(iotdb-docs) branch main updated: add approx_count_distinct aggregate function in table model (#754)

Reply via email to