This is an automated email from the ASF dual-hosted git repository.
zhangstar333 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new fdb5d108921 [doc](function)update some doc about retention function
(#2625)
fdb5d108921 is described below
commit fdb5d10892177ff8f6b3c3780cb93976a554547c
Author: zhangstar333 <[email protected]>
AuthorDate: Fri Aug 15 17:33:20 2025 +0800
[doc](function)update some doc about retention function (#2625)
update some doc about retention function
---
.../sql-functions/aggregate-functions/retention.md | 89 ++++++++++++++++++++--
.../docusaurus-plugin-content-docs/current.json | 2 +-
.../sql-functions/aggregate-functions/retention.md | 85 ++++++++++++++++++++-
.../sql-functions/aggregate-functions/retention.md | 88 +++++++++++++++++++--
.../sql-functions/aggregate-functions/retention.md | 85 ++++++++++++++++++++-
5 files changed, 330 insertions(+), 19 deletions(-)
diff --git a/docs/sql-manual/sql-functions/aggregate-functions/retention.md
b/docs/sql-manual/sql-functions/aggregate-functions/retention.md
index 9d6d8f4082a..df0af61e7a8 100644
--- a/docs/sql-manual/sql-functions/aggregate-functions/retention.md
+++ b/docs/sql-manual/sql-functions/aggregate-functions/retention.md
@@ -32,10 +32,15 @@ RETENTION(<event_1> [, <event_2>, ... , <event_n>]);
An array of 1 and 0 with a maximum length of 32, where the final output array
length matches the input parameter length.
If no data is involved in the aggregation, a NULL value will be returned.
+When multiple columns are involved in a calculation, if any column contains a
NULL value, the current row with the NULL value will not participate in the
aggregate calculation and will be directly discarded.
+
+You can use the IFNULL function on the calculation column to handle NULL
values. For details, refer to the subsequent examples.
+
## Examples
+1. Create sample table and Insert sample data
+
```sql
--- Create sample table
CREATE TABLE retention_test(
`uid` int COMMENT 'user id',
`date` datetime COMMENT 'date time'
@@ -45,7 +50,6 @@ PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
);
--- Insert sample data
INSERT into retention_test values
(0, '2022-10-12'),
(0, '2022-10-13'),
@@ -53,8 +57,11 @@ INSERT into retention_test values
(1, '2022-10-12'),
(1, '2022-10-13'),
(2, '2022-10-12');
+```
+
+2. Calculate user retention
--- Calculate user retention
+```sql
SELECT
uid,
RETENTION(date = '2022-10-12') AS r,
@@ -75,8 +82,38 @@ ORDER BY uid ASC;
+------+------+--------+-----------+
```
+3. Handling NULL values in special cases, recreating the table and inserting
data
+
+```sql
+CREATE TABLE retention_test2(
+ `uid` int,
+ `flag` boolean,
+ `flag2` boolean
+) DUPLICATE KEY(uid)
+DISTRIBUTED BY HASH(uid) BUCKETS AUTO
+PROPERTIES (
+ "replication_allocation" = "tag.location.default: 1"
+);
+
+INSERT into retention_test2 values (0, false, false), (1, true, NULL);
+
+SELECT * from retention_test2;
+```
+
+```text
++------+------+-------+
+| uid | flag | flag2 |
++------+------+-------+
+| 0 | 1 | NULL |
+| 1 | 0 | 0 |
++------+------+-------+
+```
+
+
+4. When performing calculations on an empty table, no data participates in
aggregation, and NULL values are returned.
+
```sql
-SELECT RETENTION(date = '2022-10-12') AS r FROM retention_test where uid is
NULL;
+SELECT RETENTION(date = '2022-10-12') AS r FROM retention_test2 where uid is
NULL;
```
```text
@@ -85,4 +122,46 @@ SELECT RETENTION(date = '2022-10-12') AS r FROM
retention_test where uid is NULL
+------+
| NULL |
+------+
-```
\ No newline at end of file
+```
+
+5. Only the flag column is involved in the calculation. Since flag is true
when uid = 0, it returns 1.
+
+```sql
+select retention(flag) from retention_test2;
+```
+
+```text
++-----------------+
+| retention(flag) |
++-----------------+
+| [1] |
++-----------------+
+```
+
+6. When the columns flag and flag2 are involved in the calculation, the row
with uid = 0 is excluded from the aggregate computation because flag2 is NULL.
Only the row with uid = 1 participates in the aggregation, resulting in a
return value of 0.
+
+```sql
+select retention(flag,flag2) from retention_test2;
+```
+
+```text
++-----------------------+
+| retention(flag,flag2) |
++-----------------------+
+| [0, 0] |
++-----------------------+
+```
+
+7. To resolve NULL value issues, you can use the IFNULL function to convert
NULL to false, ensuring that both rows with uid = 0 and uid = 1 are included in
the aggregate calculation.
+
+```sql
+select retention(flag,IFNULL(flag2,false)) from retention_test2;;
+```
+
+```text
++-------------------------------------+
+| retention(flag,IFNULL(flag2,false)) |
++-------------------------------------+
+| [1, 0] |
++-------------------------------------+
+```
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current.json
b/i18n/zh-CN/docusaurus-plugin-content-docs/current.json
index 79c542ac6d3..85252898a68 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current.json
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current.json
@@ -572,7 +572,7 @@
"description": "The label for category IP Functions in sidebar docs"
},
"sidebar.docs.category.Bitmap Functions": {
- "message": "BITMAPs 函数",
+ "message": "BITMAP 函数",
"description": "The label for category Bitmap Functions in sidebar docs"
},
"sidebar.docs.category.HLL Functions": {
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/retention.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/retention.md
index 8bd00dfd625..576e28e5135 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/retention.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/aggregate-functions/retention.md
@@ -30,11 +30,14 @@ RETENTION(<event_1> [, <event_2>, ... , <event_n>]);
- 0: 条件不满足。
由 1 和 0 组成的最大长度为 32 位的数组,最终输出数组的长度与输入参数长度相同。
如果在没有任何数据参与聚合的情况下,会返回NULL值
+当有多个列参与计算时,如果任意一列出现了NULL值,则NULL值的当前行不会参与聚合计算,被直接丢弃
+可以在计算列上加IF NULL函数处理NULL值,详情见后续示例。
## 举例
+1. 创建示例表, 插入示例数据
+
```sql
--- 创建示例表
CREATE TABLE retention_test(
`uid` int COMMENT 'user id',
`date` datetime COMMENT 'date time'
@@ -44,7 +47,6 @@ PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
);
--- 插入示例数据
INSERT into retention_test values
(0, '2022-10-12'),
(0, '2022-10-13'),
@@ -52,8 +54,11 @@ INSERT into retention_test values
(1, '2022-10-12'),
(1, '2022-10-13'),
(2, '2022-10-12');
+```
+
+2. 正常计算用户留存
--- 计算用户留存
+```sql
SELECT
uid,
RETENTION(date = '2022-10-12') AS r,
@@ -74,8 +79,38 @@ ORDER BY uid ASC;
+------+------+--------+-----------+
```
+3. 特殊情况NULL值处理,重新建表以及插入数据
+
```sql
-SELECT RETENTION(date = '2022-10-12') AS r FROM retention_test where uid is
NULL;
+CREATE TABLE retention_test2(
+ `uid` int,
+ `flag` boolean,
+ `flag2` boolean
+) DUPLICATE KEY(uid)
+DISTRIBUTED BY HASH(uid) BUCKETS AUTO
+PROPERTIES (
+ "replication_allocation" = "tag.location.default: 1"
+);
+
+INSERT into retention_test2 values (0, false, false), (1, true, NULL);
+
+SELECT * from retention_test2;
+```
+
+```text
++------+------+-------+
+| uid | flag | flag2 |
++------+------+-------+
+| 0 | 1 | NULL |
+| 1 | 0 | 0 |
++------+------+-------+
+```
+
+
+4. 空表计算时,没有任何数据参与聚合,返回 NULL 值
+
+```sql
+SELECT RETENTION(date = '2022-10-12') AS r FROM retention_test2 where uid is
NULL;
```
```text
@@ -85,3 +120,45 @@ SELECT RETENTION(date = '2022-10-12') AS r FROM
retention_test where uid is NULL
| NULL |
+------+
```
+
+5. 仅flag一列参与计算,由于 uid = 0 时, flag 为真,返回 1
+
+```sql
+select retention(flag) from retention_test2;
+```
+
+```text
++-----------------+
+| retention(flag) |
++-----------------+
+| [1] |
++-----------------+
+```
+
+6. 当flag,flag2 两列参与计算时,uid = 0 的行,由于flag2 为NULL值,所以这行未参与聚合计算, 仅uid = 1
参与聚合计算,返回结果为0
+
+```sql
+select retention(flag,flag2) from retention_test2;
+```
+
+```text
++-----------------------+
+| retention(flag,flag2) |
++-----------------------+
+| [0, 0] |
++-----------------------+
+```
+
+7. 如果需要解决NULL值问题,可以用IFNULL 函数将NULL转换成false,这样 uid = 0,1 两行都会参与聚合计算
+
+```sql
+select retention(flag,IFNULL(flag2,false)) from retention_test2;;
+```
+
+```text
++-------------------------------------+
+| retention(flag,IFNULL(flag2,false)) |
++-------------------------------------+
+| [1, 0] |
++-------------------------------------+
+```
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-functions/aggregate-functions/retention.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-functions/aggregate-functions/retention.md
index 293151e5bf1..f463c9b98e6 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-functions/aggregate-functions/retention.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-functions/aggregate-functions/retention.md
@@ -13,6 +13,9 @@
简单来讲,返回值数组第 1
位表示`event_1`的真假,第二位表示`event_1`真假与`event_2`真假相与,第三位表示`event_1`真假与`event_3`真假相与,等等。如果`event_1`为假,则返回全是
0 的数组。
+当有多个列参与计算时,如果任意一列出现了NULL值,则NULL值的当前行不会参与聚合计算,被直接丢弃
+可以在计算列上加IF NULL函数处理NULL值,详情见后续示例。
+
## 语法
```sql
@@ -34,8 +37,9 @@ RETENTION(<event_1> [, <event_2>, ... , <event_n>]);
## 举例
+1. 创建示例表, 插入示例数据
+
```sql
--- 创建示例表
CREATE TABLE retention_test(
`uid` int COMMENT 'user id',
`date` datetime COMMENT 'date time'
@@ -45,7 +49,6 @@ PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
);
--- 插入示例数据
INSERT into retention_test values
(0, '2022-10-12'),
(0, '2022-10-13'),
@@ -53,8 +56,11 @@ INSERT into retention_test values
(1, '2022-10-12'),
(1, '2022-10-13'),
(2, '2022-10-12');
+```
--- 计算用户留存
+2. 正常计算用户留存
+
+```sql
SELECT
uid,
RETENTION(date = '2022-10-12') AS r,
@@ -75,8 +81,38 @@ ORDER BY uid ASC;
+------+------+--------+-----------+
```
+3. 特殊情况NULL值处理,重新建表以及插入数据
+
+```sql
+CREATE TABLE retention_test2(
+ `uid` int,
+ `flag` boolean,
+ `flag2` boolean
+) DUPLICATE KEY(uid)
+DISTRIBUTED BY HASH(uid) BUCKETS AUTO
+PROPERTIES (
+ "replication_allocation" = "tag.location.default: 1"
+);
+
+INSERT into retention_test2 values (0, false, false), (1, true, NULL);
+
+SELECT * from retention_test2;
+```
+
+```text
++------+------+-------+
+| uid | flag | flag2 |
++------+------+-------+
+| 0 | 1 | NULL |
+| 1 | 0 | 0 |
++------+------+-------+
+```
+
+
+4. 空表计算时,没有任何数据参与聚合,返回 NULL 值
+
```sql
-SELECT RETENTION(date = '2022-10-12') AS r FROM retention_test where uid is
NULL;
+SELECT RETENTION(date = '2022-10-12') AS r FROM retention_test2 where uid is
NULL;
```
```text
@@ -85,4 +121,46 @@ SELECT RETENTION(date = '2022-10-12') AS r FROM
retention_test where uid is NULL
+------+
| NULL |
+------+
-```
\ No newline at end of file
+```
+
+5. 仅flag一列参与计算,由于 uid = 0 时, flag 为真,返回 1
+
+```sql
+select retention(flag) from retention_test2;
+```
+
+```text
++-----------------+
+| retention(flag) |
++-----------------+
+| [1] |
++-----------------+
+```
+
+6. 当flag,flag2 两列参与计算时,uid = 0 的行,由于flag2 为NULL值,所以这行未参与聚合计算, 仅uid = 1
参与聚合计算,返回结果为0
+
+```sql
+select retention(flag,flag2) from retention_test2;
+```
+
+```text
++-----------------------+
+| retention(flag,flag2) |
++-----------------------+
+| [0, 0] |
++-----------------------+
+```
+
+7. 如果需要解决NULL值问题,可以用IFNULL 函数将NULL转换成false,这样 uid = 0,1 两行都会参与聚合计算
+
+```sql
+select retention(flag,IFNULL(flag2,false)) from retention_test2;;
+```
+
+```text
++-------------------------------------+
+| retention(flag,IFNULL(flag2,false)) |
++-------------------------------------+
+| [1, 0] |
++-------------------------------------+
+```
diff --git
a/versioned_docs/version-3.0/sql-manual/sql-functions/aggregate-functions/retention.md
b/versioned_docs/version-3.0/sql-manual/sql-functions/aggregate-functions/retention.md
index 8c0f2f22126..11171222c66 100644
---
a/versioned_docs/version-3.0/sql-manual/sql-functions/aggregate-functions/retention.md
+++
b/versioned_docs/version-3.0/sql-manual/sql-functions/aggregate-functions/retention.md
@@ -32,11 +32,15 @@ RETENTION(<event_1> [, <event_2>, ... , <event_n>]);
An array of 1 and 0 with a maximum length of 32, where the final output array
length matches the input parameter length.
If no data is involved in the aggregation, a NULL value will be returned.
+When multiple columns are involved in a calculation, if any column contains a
NULL value, the current row with the NULL value will not participate in the
aggregate calculation and will be directly discarded.
+
+You can use the IFNULL function on the calculation column to handle NULL
values. For details, refer to the subsequent examples.
## Examples
+1. Create sample table and Insert sample data
+
```sql
--- Create sample table
CREATE TABLE retention_test(
`uid` int COMMENT 'user id',
`date` datetime COMMENT 'date time'
@@ -46,7 +50,6 @@ PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
);
--- Insert sample data
INSERT into retention_test values
(0, '2022-10-12'),
(0, '2022-10-13'),
@@ -54,8 +57,11 @@ INSERT into retention_test values
(1, '2022-10-12'),
(1, '2022-10-13'),
(2, '2022-10-12');
+```
--- Calculate user retention
+2. Calculate user retention
+
+```sql
SELECT
uid,
RETENTION(date = '2022-10-12') AS r,
@@ -76,9 +82,38 @@ ORDER BY uid ASC;
+------+------+--------+-----------+
```
+3. Handling NULL values in special cases, recreating the table and inserting
data
```sql
-SELECT RETENTION(date = '2022-10-12') AS r FROM retention_test where uid is
NULL;
+CREATE TABLE retention_test2(
+ `uid` int,
+ `flag` boolean,
+ `flag2` boolean
+) DUPLICATE KEY(uid)
+DISTRIBUTED BY HASH(uid) BUCKETS AUTO
+PROPERTIES (
+ "replication_allocation" = "tag.location.default: 1"
+);
+
+INSERT into retention_test2 values (0, false, false), (1, true, NULL);
+
+SELECT * from retention_test2;
+```
+
+```text
++------+------+-------+
+| uid | flag | flag2 |
++------+------+-------+
+| 0 | 1 | NULL |
+| 1 | 0 | 0 |
++------+------+-------+
+```
+
+
+4. When performing calculations on an empty table, no data participates in
aggregation, and NULL values are returned.
+
+```sql
+SELECT RETENTION(date = '2022-10-12') AS r FROM retention_test2 where uid is
NULL;
```
```text
@@ -87,4 +122,46 @@ SELECT RETENTION(date = '2022-10-12') AS r FROM
retention_test where uid is NULL
+------+
| NULL |
+------+
+```
+
+5. Only the flag column is involved in the calculation. Since flag is true
when uid = 0, it returns 1.
+
+```sql
+select retention(flag) from retention_test2;
+```
+
+```text
++-----------------+
+| retention(flag) |
++-----------------+
+| [1] |
++-----------------+
+```
+
+6. When the columns flag and flag2 are involved in the calculation, the row
with uid = 0 is excluded from the aggregate computation because flag2 is NULL.
Only the row with uid = 1 participates in the aggregation, resulting in a
return value of 0.
+
+```sql
+select retention(flag,flag2) from retention_test2;
+```
+
+```text
++-----------------------+
+| retention(flag,flag2) |
++-----------------------+
+| [0, 0] |
++-----------------------+
+```
+
+7. To resolve NULL value issues, you can use the IFNULL function to convert
NULL to false, ensuring that both rows with uid = 0 and uid = 1 are included in
the aggregate calculation.
+
+```sql
+select retention(flag,IFNULL(flag2,false)) from retention_test2;;
+```
+
+```text
++-------------------------------------+
+| retention(flag,IFNULL(flag2,false)) |
++-------------------------------------+
+| [1, 0] |
++-------------------------------------+
```
\ No newline at end of file
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]