This is an automated email from the ASF dual-hosted git repository.
yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new 1c931006981 update corr/count/covar series doc (#2712)
1c931006981 is described below
commit 1c931006981307d1b67326b9244fe89d7817a6f5
Author: Pxl <[email protected]>
AuthorDate: Thu Aug 7 16:25:04 2025 +0800
update corr/count/covar series doc (#2712)
## Versions
- [ ] dev
- [ ] 3.0
- [ ] 2.1
- [ ] 2.0
## Languages
- [ ] Chinese
- [ ] English
## Docs Checklist
- [ ] Checked by AI
- [ ] Test Cases Built
---
.../aggregate-functions/corr-welford.md | 31 +++++++++--
.../sql-functions/aggregate-functions/corr.md | 62 ++++++++++++++--------
.../aggregate-functions/count-by-enum.md | 19 +++----
.../sql-functions/aggregate-functions/count.md | 50 +++++++++++------
.../aggregate-functions/covar-samp.md | 58 ++++++++++++++++----
.../sql-functions/aggregate-functions/covar.md | 57 +++++++++++++++-----
.../aggregate-functions/corr-welford.md | 56 ++++++++++++-------
.../sql-functions/aggregate-functions/corr.md | 54 ++++++++++++-------
.../aggregate-functions/count-by-enum.md | 15 +++---
.../sql-functions/aggregate-functions/count.md | 50 +++++++++++------
.../aggregate-functions/covar-samp.md | 54 +++++++++++++++----
.../sql-functions/aggregate-functions/covar.md | 56 +++++++++++++++----
12 files changed, 400 insertions(+), 162 deletions(-)
diff --git a/docs/sql-manual/sql-functions/aggregate-functions/corr-welford.md
b/docs/sql-manual/sql-functions/aggregate-functions/corr-welford.md
index 77cbac9d0ff..fe968d0b17e 100755
--- a/docs/sql-manual/sql-functions/aggregate-functions/corr-welford.md
+++ b/docs/sql-manual/sql-functions/aggregate-functions/corr-welford.md
@@ -7,7 +7,7 @@
## Description
-Calculate the Pearson coefficient of two random variables using the
[Welford](https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm)
algorithm, which can effectively reduce calculation errors.
+Calculates the Pearson correlation coefficient between two random variables
using the
[Welford](https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm)
algorithm, which effectively reduces calculation errors.
## Syntax
@@ -17,20 +17,41 @@ CORR_WELFORD(<expr1>, <expr2>)
## Parameters
-| Parameter | Description |
+| Parameters | Description |
| -- | -- |
| `<expr1>` | Double expression (column) |
| `<expr2>` | Double expression (column) |
## Return Value
-The return value is of type DOUBLE, the covariance of expr1 and expr2, except
the product of the standard deviation of expr1 and expr2, special case:
+Returns a DOUBLE type value, which is the covariance of expr1 and expr2
divided by the product of their standard deviations. Special cases:
-- If the standard deviation of expr1 or expr2 is 0, 0 will be returned.
-- If a column of expr1 or expr2 is NULL, the row data will not be counted in
the final result.
+- If the standard deviation of expr1 or expr2 is 0, returns 0.
+- If expr1 or expr2 contains NULL values, those rows are excluded from the
final result.
## Example
+```sql
+-- setup
+create table test_corr(
+ id int,
+ k1 double,
+ k2 double
+) distributed by hash (id) buckets 1
+properties ("replication_num"="1");
+
+insert into test_corr values
+ (1, 20, 22),
+ (1, 10, 20),
+ (2, 36, 21),
+ (2, 30, 22),
+ (2, 25, 20),
+ (3, 25, NULL),
+ (4, 25, 21),
+ (4, 25, 22),
+ (4, 25, 20);
+```
+
```sql
select * from test_corr;
```
diff --git a/docs/sql-manual/sql-functions/aggregate-functions/corr.md
b/docs/sql-manual/sql-functions/aggregate-functions/corr.md
index ec6a4744412..0d1bb4c2d7f 100644
--- a/docs/sql-manual/sql-functions/aggregate-functions/corr.md
+++ b/docs/sql-manual/sql-functions/aggregate-functions/corr.md
@@ -7,7 +7,7 @@
## Description
-Calculate the Pearson coefficient of two random variables.
+Calculates the Pearson correlation coefficient between two random variables.
## Syntax
@@ -17,38 +17,40 @@ CORR(<expr1>, <expr2>)
## Parameters
-| Parameter | Description |
+| Parameters | Description |
| -- | -- |
-| `<expr1>` | Numeric expression (column) |
-| `<expr2>` | Numeric expression (column) |
+| `<expr1>` | Expression for calculation. Supported type is Double. |
+| `<expr2>` | Expression for calculation. Supported type is Double. |
## Return Value
-The return value is of type DOUBLE, the covariance of expr1 and expr2, except
the product of the standard deviation of expr1 and expr2, special case:
+Returns a DOUBLE type value, which is the covariance of expr1 and expr2
divided by the product of their standard deviations. Special cases:
-- If the standard deviation of expr1 or expr2 is 0, 0 will be returned.
-- If a column of expr1 or expr2 is NULL, the row data will not be counted in
the final result.
+- If the standard deviation of expr1 or expr2 is 0, returns 0.
+- If expr1 or expr2 contains NULL values, those rows are excluded from the
calculation.
+- If there is no valid data in the group, returns NULL.
## Example
```sql
-select * from test_corr;
-```
+-- setup
+create table test_corr(
+ id int,
+ k1 double,
+ k2 double
+) distributed by hash (id) buckets 1
+properties ("replication_num"="1");
-```text
-+------+------+------+
-| id | k1 | k2 |
-+------+------+------+
-| 1 | 20 | 22 |
-| 1 | 10 | 20 |
-| 2 | 36 | 21 |
-| 2 | 30 | 22 |
-| 2 | 25 | 20 |
-| 3 | 25 | NULL |
-| 4 | 25 | 21 |
-| 4 | 25 | 22 |
-| 4 | 25 | 20 |
-+------+------+------+
+insert into test_corr values
+ (1, 20, 22),
+ (1, 10, 20),
+ (2, 36, 21),
+ (2, 30, 22),
+ (2, 25, 20),
+ (3, 25, NULL),
+ (4, 25, 21),
+ (4, 25, 22),
+ (4, 25, 20);
```
```sql
@@ -65,3 +67,17 @@ select id,corr(k1,k2) from test_corr group by id;
| 2 | 0.4539206495016019 |
+------+--------------------+
```
+
+```sql
+select corr(k1,k2) from test_corr where id=999;
+```
+
+When the query result is empty, returns NULL.
+
+```text
++-------------+
+| corr(k1,k2) |
++-------------+
+| NULL |
++-------------+
+```
diff --git a/docs/sql-manual/sql-functions/aggregate-functions/count-by-enum.md
b/docs/sql-manual/sql-functions/aggregate-functions/count-by-enum.md
index 0b0dc746ec8..bbbecb63667 100644
--- a/docs/sql-manual/sql-functions/aggregate-functions/count-by-enum.md
+++ b/docs/sql-manual/sql-functions/aggregate-functions/count-by-enum.md
@@ -7,7 +7,7 @@
## Description
-Treat the data in the column as enumeration values and count the number of
each enumeration value. Returns the number of enumeration values for each
column, as well as the number of non-null values and the number of null values.
+Treat the data in the column as enumeration values and count the number of
each enumeration value. Returns the number of enumeration values for each
column, as well as the number of non-NULL values and the number of NULL values.
## Syntax
@@ -19,11 +19,12 @@ COUNT_BY_ENUM(<expr1>, <expr2>, ... , <exprN>)
| Parameter | Description |
| -- | -- |
-| `<expr1>` | Fill in at least one input. Column whose value is of type STRING
|
+| `<expr1>` | At least one input is required, supports up to 1024 inputs.
Supported type is String. |
-## Returned value
+## Return Value
-Returns a JSONArray string.
+Returns results in JSONArray format.
+Return type is String.
For example:
```json
@@ -56,11 +57,11 @@ For example:
"all": 200
}]
```
-Description: The return value is a JSON array string and the order of the
internal objects is the order of the input parameters.
-* cbe: count of non-null values based on enumeration values
-* notnull: number of non-null values.
-* null: number of null values
-* all: total number, including both null and non-null values.
+Description: The return value is a JSON array string, and the order of
internal objects follows the order of input parameters.
+* cbe: Statistical results of non-NULL values based on enumeration values
+* notnull: Count of non-NULL values
+* null: Count of NULL values
+* all: Total count, including both NULL and non-NULL values.
## Example
diff --git a/docs/sql-manual/sql-functions/aggregate-functions/count.md
b/docs/sql-manual/sql-functions/aggregate-functions/count.md
index 42e9964ba09..071a1a66902 100644
--- a/docs/sql-manual/sql-functions/aggregate-functions/count.md
+++ b/docs/sql-manual/sql-functions/aggregate-functions/count.md
@@ -7,7 +7,7 @@
## Description
-Returns the number of non-NULL records in the specified column, or the total
number of records.
+Returns the number of non-NULL records for the specified column, or the total
number of records.
## Syntax
@@ -21,29 +21,45 @@ COUNT(<expr>)
| Parameter | Description |
| -- | -- |
-| `<expr>` | Conditional expression (column name) |
+| `<expr>` | If an expression is specified, counts the number of non-NULL
records; otherwise, counts the total number of rows. |
## Return Value
-The return value is of numeric type. If expr is NULL, there will be no
parameter statistics.
+The return type is Bigint. If expr is NULL, it is not counted.
## Example
```sql
-select * from test_count;
-```
-
-```text
-+------+------+------+
-| id | name | sex |
-+------+------+------+
-| 1 | 1 | 1 |
-| 2 | 2 | 1 |
-| 3 | 3 | 1 |
-| 4 | 0 | 1 |
-| 4 | 4 | 1 |
-| 5 | NULL | 1 |
-+------+------+------+
+-- setup
+create table test_count(
+ id int,
+ name varchar(20),
+ sex int
+) distributed by hash(id) buckets 1
+properties ("replication_num"="1");
+
+insert into test_count values
+ (1, '1', 1),
+ (2, '2', 1),
+ (3, '3', 1),
+ (4, '0', 1),
+ (4, '4', 1),
+ (5, NULL, 1);
+
+create table test_insert(
+ id int,
+ name varchar(20),
+ sex int
+) distributed by hash(id) buckets 1
+properties ("replication_num"="1");
+
+insert into test_insert values
+ (1, '1', 1),
+ (2, '2', 1),
+ (3, '3', 1),
+ (4, '0', 1),
+ (4, '4', 1),
+ (5, NULL, 1);
```
```sql
diff --git a/docs/sql-manual/sql-functions/aggregate-functions/covar-samp.md
b/docs/sql-manual/sql-functions/aggregate-functions/covar-samp.md
index 04955a785f5..b918752f73a 100644
--- a/docs/sql-manual/sql-functions/aggregate-functions/covar-samp.md
+++ b/docs/sql-manual/sql-functions/aggregate-functions/covar-samp.md
@@ -7,7 +7,7 @@
## Description
-Computes the sample covariance between two numeric variables.
+Calculates the sample covariance between two variables. If either input
variable is NULL, that row is not included in the calculation.
## Syntax
@@ -19,25 +19,61 @@ COVAR_SAMP(<expr1>, <expr2>)
| Parameter | Description |
| -- | -- |
-| `<expr1>` | Numeric expression or column |
-| `<expr2>` | Numeric expression or column |
+| `<expr1>` | One of the expressions to calculate, supported type is Double. |
+| `<expr2>` | One of the expressions to calculate, supported type is Double. |
## Return Value
-Returns the sample covariance of expr1 and expr2, special case:
-
-- If a column of expr1 or expr2 is NULL, the row data will not be counted in
the final result.
+Returns the sample covariance of expr1 and expr2, with return type Double.
+If there is no valid data in the group, returns NULL.
## Example
+```sql
+-- setup
+create table baseall(
+ id int,
+ x double,
+ y double
+) distributed by hash(id) buckets 1
+properties ("replication_num"="1");
+
+insert into baseall values
+ (1, 1.0, 2.0),
+ (2, 2.0, 3.0),
+ (3, 3.0, 4.0),
+ (4, 4.0, NULL),
+ (5, NULL, 5.0);
```
+
+```sql
select covar_samp(x,y) from baseall;
```
```text
-+---------------------+
-| covar_samp(x, y) |
-+---------------------+
-| 0.89442719099991586 |
-+---------------------+
++-----------------+
+| covar_samp(x,y) |
++-----------------+
+| 1 |
++-----------------+
+```
+
+```sql
+select id, covar_samp(x, y) from baseall group by id;
+```
+
+```text
++------+------------------+
+| id | covar_samp(x, y) |
++------+------------------+
+| 1 | 0 |
+| 2 | 0 |
+| 3 | 0 |
+| 4 | NULL |
+| 5 | NULL |
++------+------------------+
+```
+| 4 | NULL |
+| 5 | NULL |
++------+------------------+
```
diff --git a/docs/sql-manual/sql-functions/aggregate-functions/covar.md
b/docs/sql-manual/sql-functions/aggregate-functions/covar.md
index 24f95ab54a9..fad59738dfd 100644
--- a/docs/sql-manual/sql-functions/aggregate-functions/covar.md
+++ b/docs/sql-manual/sql-functions/aggregate-functions/covar.md
@@ -1,13 +1,13 @@
---
{
- "title": "COVAR,COVAR_POP",
+ "title": "COVAR",
"language": "en"
}
---
## Description
-Calculate the covariance between two numeric variables.
+Calculates the sample covariance between two variables. If either input
variable is NULL, that row is not included in the calculation.
## Alias
@@ -17,31 +17,64 @@ Calculate the covariance between two numeric variables.
```sql
COVAR(<expr1>, <expr2>)
+COVAR_POP(<expr1>, <expr2>)
```
## Parameters
| Parameter | Description |
| -- | -- |
-| `<expr1>` | Numeric expression or column |
-| `<expr2>` | Numeric expression or column |
+| `<expr1>` | One of the expressions to calculate, supported type is Double. |
+| `<expr2>` | One of the expressions to calculate, supported type is Double. |
## Return Value
-Returns the covariance value of expr1 and expr2, special case:
-
-- If a column of expr1 or expr2 is NULL, the row data will not be counted in
the final result.
+Returns the sample covariance of expr1 and expr2, with return type Double.
+If there is no valid data in the group, returns NULL.
## Example
+```sql
+-- setup
+create table baseall(
+ id int,
+ x double,
+ y double
+) distributed by hash(id) buckets 1
+properties ("replication_num"="1");
+
+insert into baseall values
+ (1, 1.0, 2.0),
+ (2, 2.0, 3.0),
+ (3, 3.0, 4.0),
+ (4, 4.0, NULL),
+ (5, NULL, 5.0);
```
+
+```sql
select covar(x,y) from baseall;
```
```text
-+---------------------+
-| covar(x, y) |
-+---------------------+
-| 0.89442719099991586 |
-+---------------------+
++-------------------+
+| covar(x,y) |
++-------------------+
+| 0.666666666666667 |
++-------------------+
+```
+
+```sql
+select id, covar(x, y) from baseall group by id;
+```
+
+```text
++------+-------------+
+| id | covar(x, y) |
++------+-------------+
+| 1 | 0 |
+| 2 | 0 |
+| 3 | 0 |
+| 4 | NULL |
+| 5 | NULL |
++------+-------------+
```
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/corr-welford.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/corr-welford.md
index 38f0b1c1ace..85e9184ba9e 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/corr-welford.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/corr-welford.md
@@ -19,8 +19,8 @@ CORR_WELFORD(<expr1>, <expr2>)
| 参数 | 说明 |
| -- | -- |
-| `<expr1>` | Double 表达式(列) |
-| `<expr2>` | Double 表达式(列) |
+| `<expr1>` | 用于计算的表达式之一,支持类型为 Double 。 |
+| `<expr2>` | 用于计算的表达式之一,支持类型为 Double 。 |
## 返回值
@@ -28,27 +28,29 @@ CORR_WELFORD(<expr1>, <expr2>)
- 如果 expr1 或 expr2 的标准差为 0, 将返回 0。
- 如果 expr1 或者 expr2 某一列为 NULL 时,该行数据不会被统计到最终结果中。
+- 如果组内没有有效数据,返回 NULL 。
## 举例
```sql
-select * from test_corr;
-```
+-- setup
+create table test_corr(
+ id int,
+ k1 double,
+ k2 double
+) distributed by hash (id) buckets 1
+properties ("replication_num"="1");
-```text
-+------+------+------+
-| id | k1 | k2 |
-+------+------+------+
-| 1 | 20 | 22 |
-| 1 | 10 | 20 |
-| 2 | 36 | 21 |
-| 2 | 30 | 22 |
-| 2 | 25 | 20 |
-| 3 | 25 | NULL |
-| 4 | 25 | 21 |
-| 4 | 25 | 22 |
-| 4 | 25 | 20 |
-+------+------+------+
+insert into test_corr values
+ (1, 20, 22),
+ (1, 10, 20),
+ (2, 36, 21),
+ (2, 30, 22),
+ (2, 25, 20),
+ (3, 25, NULL),
+ (4, 25, 21),
+ (4, 25, 22),
+ (4, 25, 20);
```
```sql
@@ -59,9 +61,23 @@ select id,corr_welford(k1,k2) from test_corr group by id;
+------+---------------------+
| id | corr_welford(k1,k2) |
+------+---------------------+
+| 1 | 1 |
| 2 | 0.4539206495016017 |
-| 4 | 0 |
| 3 | NULL |
-| 1 | 1 |
+| 4 | 0 |
+------+---------------------+
```
+
+```sql
+select corr_welford(k1,k2) from test_corr where id=999;
+```
+
+组内没有有效数据。
+
+```text
++---------------------+
+| corr_welford(k1,k2) |
++---------------------+
+| NULL |
++---------------------+
+```
\ No newline at end of file
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/corr.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/corr.md
index 389580a2472..3981be42c2b 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/corr.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/corr.md
@@ -7,7 +7,7 @@
## 描述
-计算两个随机变量的皮尔逊系数
+计算两个随机变量的皮尔逊系数。
## 语法
@@ -19,8 +19,9 @@ CORR(<expr1>, <expr2>)
| 参数 | 说明 |
| -- | -- |
-| `<expr1>` | 数值型表达式(列) |
-| `<expr2>` | 数值型表达式(列) |
+| `<expr1>` | 用于计算的表达式之一,支持类型为 Double 。 |
+| `<expr2>` | 用于计算的表达式之一,支持类型为 Double 。 |
+
## 返回值
@@ -28,27 +29,29 @@ CORR(<expr1>, <expr2>)
- 如果 expr1 或 expr2 的标准差为 0, 将返回 0。
- 如果 expr1 或者 expr2 某一列为 NULL 时,该行数据不会被统计到最终结果中。
+- 如果组内没有有效数据,返回 NULL 。
## 举例
```sql
-select * from test_corr;
-```
+-- setup
+create table test_corr(
+ id int,
+ k1 double,
+ k2 double
+) distributed by hash (id) buckets 1
+properties ("replication_num"="1");
-```text
-+------+------+------+
-| id | k1 | k2 |
-+------+------+------+
-| 1 | 20 | 22 |
-| 1 | 10 | 20 |
-| 2 | 36 | 21 |
-| 2 | 30 | 22 |
-| 2 | 25 | 20 |
-| 3 | 25 | NULL |
-| 4 | 25 | 21 |
-| 4 | 25 | 22 |
-| 4 | 25 | 20 |
-+------+------+------+
+insert into test_corr values
+ (1, 20, 22),
+ (1, 10, 20),
+ (2, 36, 21),
+ (2, 30, 22),
+ (2, 25, 20),
+ (3, 25, NULL),
+ (4, 25, 21),
+ (4, 25, 22),
+ (4, 25, 20);
```
```sql
@@ -66,3 +69,16 @@ select id,corr(k1,k2) from test_corr group by id;
+------+--------------------+
```
+```sql
+select corr_welford(k1,k2) from test_corr where id=999;
+```
+
+组内没有有效数据。
+
+```text
++-------------+
+| corr(k1,k2) |
++-------------+
+| NULL |
++-------------+
+```
\ No newline at end of file
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/count-by-enum.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/count-by-enum.md
index 363124a3fae..d2037da7df4 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/count-by-enum.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/count-by-enum.md
@@ -7,7 +7,7 @@
## 描述
-将列中数据看作枚举值,统计每个枚举值的个数。返回各个列枚举值的个数,以及非 null 值的个数与 null 值的个数。
+将列中数据看作枚举值,统计每个枚举值的个数。返回各个列枚举值的个数,以及非 NULL 值的个数与 NULL 值的个数。
## 语法
@@ -19,11 +19,12 @@ COUNT_BY_ENUM(<expr1>, <expr2>, ... , <exprN>)
| 参数 | 说明 |
| -- | -- |
-| `<expr1>` | 至少填写一个输入。值为字符串(STRING)类型的列 |
+| `<expr1>` | 至少填写一个输入,至多支持 1024 个输入,支持类型为 String。 |
## 返回值
-返回一个 JSONArray 字符串。
+返回 JSONArray 格式的结果。
+返回类型为 String。
例如:
```json
@@ -57,10 +58,10 @@ COUNT_BY_ENUM(<expr1>, <expr2>, ... , <exprN>)
}]
```
说明:返回值为一个 JSON array 字符串,内部对象的顺序是输入参数的顺序。
-* cbe:根据枚举值统计非 null 值的统计结果
-* notnull:非 null 的个数
-* null:null 值个数
-* all:总数,包括 null 值与非 null 值
+* cbe:根据枚举值统计非 NULL 值的统计结果
+* notnull:非 NULL 的个数
+* null:NULL 值个数
+* all:总数,包括 NULL 值与非 NULL 值
## 举例
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/count.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/count.md
index c947c93851d..736cc4a7ef5 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/count.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/count.md
@@ -7,7 +7,7 @@
## 描述
-返回指定列的非 NULL 记录数,或者记录总数
+返回指定列的非 NULL 记录数,或者记录总数。
## 语法
@@ -21,29 +21,45 @@ COUNT(<expr>)
| 参数 | 说明 |
| -- | -- |
-| `<expr>` | 条件表达式(列名) |
+| `<expr>` | 如果填写表达式,则计算非 NULL 的记录数,否则计算总行数。 |
## 返回值
-返回值为数值类型。如果 expr 为 NULL,则不参数统计
+返回值的类型为 Bigint。如果 expr 为 NULL,则不参数统计。
## 举例
```sql
-select * from test_count;
-```
-
-```text
-+------+------+------+
-| id | name | sex |
-+------+------+------+
-| 1 | 1 | 1 |
-| 2 | 2 | 1 |
-| 3 | 3 | 1 |
-| 4 | 0 | 1 |
-| 4 | 4 | 1 |
-| 5 | NULL | 1 |
-+------+------+------+
+-- setup
+create table test_count(
+ id int,
+ name varchar(20),
+ sex int
+) distributed by hash(id) buckets 1
+properties ("replication_num"="1");
+
+insert into test_count values
+ (1, '1', 1),
+ (2, '2', 1),
+ (3, '3', 1),
+ (4, '0', 1),
+ (4, '4', 1),
+ (5, NULL, 1);
+
+create table test_insert(
+ id int,
+ name varchar(20),
+ sex int
+) distributed by hash(id) buckets 1
+properties ("replication_num"="1");
+
+insert into test_insert values
+ (1, '1', 1),
+ (2, '2', 1),
+ (3, '3', 1),
+ (4, '0', 1),
+ (4, '4', 1),
+ (5, NULL, 1);
```
```sql
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/covar-samp.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/covar-samp.md
index da046acbb9e..8d219adf7d9 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/covar-samp.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/covar-samp.md
@@ -7,7 +7,7 @@
## 描述
-计算两个数值型变量之间的样本协方差
+计算两个变量之间的样本协方差,如果输入变量存在 NULL,则该行不计入统计数据。
## 语法
@@ -19,25 +19,57 @@ COVAR_SAMP(<expr1>, <expr2>)
| 参数 | 说明 |
| -- | -- |
-| `<expr1>` | 数值型表达式或列 |
-| `<expr2>` | 数值型表达式或列 |
+| `<expr1>` | 用于计算的表达式之一,支持类型为 Double 。 |
+| `<expr2>` | 用于计算的表达式之一,支持类型为 Double 。 |
## 返回值
-返回 expr1 和 expr2 的样本协方差,特殊情况:
-
-- 如果 expr1 或者 expr2 某一列为 NULL 时,该行数据不会被统计到最终结果中。
+返回 expr1 和 expr2 的样本协方差,返回类型为 Double。
+如果组内没有有效数据,返回 NULL 。
## 举例
+```sql
+-- setup
+create table baseall(
+ id int,
+ x double,
+ y double
+) distributed by hash(id) buckets 1
+properties ("replication_num"="1");
+
+insert into baseall values
+ (1, 1.0, 2.0),
+ (2, 2.0, 3.0),
+ (3, 3.0, 4.0),
+ (4, 4.0, NULL),
+ (5, NULL, 5.0);
```
+
+```sql
select covar_samp(x,y) from baseall;
```
```text
-+---------------------+
-| covar_samp(x, y) |
-+---------------------+
-| 0.89442719099991586 |
-+---------------------+
++-----------------+
+| covar_samp(x,y) |
++-----------------+
+| 1 |
++-----------------+
+```
+
+```sql
+select id, covar_samp(x, y) from baseall group by id;
+```
+
+```text
++------+------------------+
+| id | covar_samp(x, y) |
++------+------------------+
+| 1 | 0 |
+| 2 | 0 |
+| 3 | 0 |
+| 4 | NULL |
+| 5 | NULL |
++------+------------------+
```
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/covar.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/covar.md
index d914027eab4..951a80a77c6 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/covar.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/covar.md
@@ -7,7 +7,8 @@
## 描述
-计算两个数值型变量之间的协方差
+计算两个变量之间的样本协方差,如果输入变量存在 NULL,则该行不计入统计数据。
+
## 别名
@@ -17,32 +18,65 @@
```sql
COVAR(<expr1>, <expr2>)
+COVAR_POP(<expr1>, <expr2>)
```
## 参数
| 参数 | 说明 |
| -- | -- |
-| `<expr1>` | 数值型表达式或列 |
-| `<expr2>` | 数值型表达式或列 |
+| `<expr1>` | 用于计算的表达式之一,支持类型为 Double 。 |
+| `<expr2>` | 用于计算的表达式之一,支持类型为 Double 。 |
## 返回值
-返回 expr1 和 expr2 的协方差值,特殊情况:
-
-- 如果 expr1 或者 expr2 某一列为 NULL 时,该行数据不会被统计到最终结果中。
+返回 expr1 和 expr2 的样本协方差,返回类型为 Double。
+如果组内没有有效数据,返回 NULL 。
## 举例
+```sql
+-- setup
+create table baseall(
+ id int,
+ x double,
+ y double
+) distributed by hash(id) buckets 1
+properties ("replication_num"="1");
+
+insert into baseall values
+ (1, 1.0, 2.0),
+ (2, 2.0, 3.0),
+ (3, 3.0, 4.0),
+ (4, 4.0, NULL),
+ (5, NULL, 5.0);
```
+
+```sql
select covar(x,y) from baseall;
```
```text
-+---------------------+
-| covar(x, y) |
-+---------------------+
-| 0.89442719099991586 |
-+---------------------+
++-------------------+
+| covar(x,y) |
++-------------------+
+| 0.666666666666667 |
++-------------------+
```
+```sql
+select id, covar(x, y) from baseall group by id;
+```
+
+```text
+mysql> select id, covar(x, y) from baseall group by id;
++------+-------------+
+| id | covar(x, y) |
++------+-------------+
+| 1 | 0 |
+| 2 | 0 |
+| 3 | 0 |
+| 4 | NULL |
+| 5 | NULL |
++------+-------------+
+```
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]