This is an automated email from the ASF dual-hosted git repository.

luzhijing pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 5776fb0fa4 [Doc](function) Support Hive HLL UDFs (#578)
5776fb0fa4 is described below

commit 5776fb0fa4ebfa710e2288151a35b6f79b17eb63
Author: Chester <[email protected]>
AuthorDate: Tue Apr 23 10:11:17 2024 +0800

    [Doc](function) Support Hive HLL UDFs (#578)
---
 docs/ecosystem/hive-hll-udf.md                     | 252 +++++++++++++++++++++
 .../sql-functions/hll-functions/hll-from-base64.md |  16 ++
 .../current/ecosystem/hive-hll-udf.md              | 251 ++++++++++++++++++++
 .../sql-functions/hll-functions/hll-from-base64.md |  15 ++
 sidebars.json                                      |   1 +
 5 files changed, 535 insertions(+)

diff --git a/docs/ecosystem/hive-hll-udf.md b/docs/ecosystem/hive-hll-udf.md
new file mode 100644
index 0000000000..c90f3db6f0
--- /dev/null
+++ b/docs/ecosystem/hive-hll-udf.md
@@ -0,0 +1,252 @@
+---
+{
+    "title": "Hive HLL UDF",
+    "language": "en"
+}
+---
+
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Hive HLL UDF
+
+The Hive HLL UDF provides a set of UDFs for HLL operations in Hive tables. The HLL it produces is identical to Doris HLL and can be imported into Doris through Spark HLL Load. For more information about HLL, please refer to [Approximate Deduplication Using HLL](../advanced/using-hll.md).
+
+Function introduction:
+
+1. UDAF
+
+    - to_hll: an aggregate function that returns a Doris HLL column, similar to the to_bitmap function
+
+    - hll_union: an aggregate function that computes the union of groups, returning a Doris HLL column, similar to the bitmap_union function
+
+2. UDF
+
+    - hll_cardinality: returns the number of distinct elements added to the HLL, similar to the bitmap_count function
+
+Main purposes:
+
+1. Reduce data import time into Doris by removing the need for dictionary construction and HLL pre-aggregation
+2. Save Hive storage by compressing data with HLL, which reduces storage costs significantly compared to Bitmap statistics
+3. Provide flexible HLL operations (union, cardinality statistics) in Hive, and allow the resulting HLL to be imported directly into Doris
+
+Note:
+HLL statistics are approximate, with an error rate of roughly 1% to 2%.
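+
+For a quick sense of how these functions compose (a minimal sketch assuming the hive_table defined in the Usage section below, with an integer uuid column):
+
+```sql
+-- count distinct uuids per key, entirely inside Hive
+select k1, hll_cardinality(to_hll(uuid)) from hive_table group by k1;
+```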
+
+
+## Usage
+
+### Create a Hive table and insert test data
+
+```sql
+-- Create a test database, e.g., hive_test
+use hive_test;
+
+-- Create a Hive HLL table
+CREATE TABLE IF NOT EXISTS `hive_hll_table`(
+  `k1`   int       COMMENT '',
+  `k2`   String    COMMENT '',
+  `k3`   String    COMMENT '',
+  `uuid` binary    COMMENT 'hll'
+) comment 'comment';
+
+-- Create a normal Hive table and insert test data
+CREATE TABLE IF NOT EXISTS `hive_table`(
+    `k1`   int       COMMENT '',
+    `k2`   String    COMMENT '',
+    `k3`   String    COMMENT '',
+    `uuid` int       COMMENT ''
+) comment 'comment';
+
+insert into hive_table select 1, 'a', 'b', 12345;
+insert into hive_table select 1, 'a', 'c', 12345;
+insert into hive_table select 2, 'b', 'c', 23456;
+insert into hive_table select 3, 'c', 'd', 34567;
+```
+
+### Use the Hive HLL UDF
+
+The Hive HLL UDF needs to be used in Hive/Spark. First, compile the FE to obtain the hive-udf.jar file.
+Compilation preparation: if you have already compiled the ldb source code, you can compile the FE directly; if not, you need to install thrift manually. Refer to [Setting Up Dev Env for FE - IntelliJ IDEA](/community/developer-guide/fe-idea-dev.md) for compilation and installation.
+
+```shell
+# Clone the Doris source code
+git clone https://github.com/apache/doris.git
+cd doris
+git submodule update --init --recursive
+
+# Install thrift first (skip if already installed)
+# Enter the FE directory
+cd fe
+
+# Run the Maven packaging command (all FE submodules will be packaged)
+mvn package -Dmaven.test.skip=true
+# Or package only the hive-udf module
+mvn package -pl hive-udf -am -Dmaven.test.skip=true
+
+# The packaged hive-udf.jar is generated in the hive-udf/target directory
+# Upload it to HDFS, e.g., to the root directory
+hdfs dfs -put hive-udf/target/hive-udf.jar /
+```
+
+Then, enter Hive and execute the following SQL statements:
+
+```sql
+-- Load the hive hll udf jar package, modify the hostname and port according to your actual situation
+add jar hdfs://hostname:port/hive-udf.jar;
+
+-- Create UDAF functions
+create temporary function to_hll as 'org.apache.doris.udf.ToHllUDAF' USING JAR 'hdfs://hostname:port/hive-udf.jar';
+create temporary function hll_union as 'org.apache.doris.udf.HllUnionUDAF' USING JAR 'hdfs://hostname:port/hive-udf.jar';
+
+
+-- Create UDF functions
+create temporary function hll_cardinality as 'org.apache.doris.udf.HllCardinalityUDF' USING JAR 'hdfs://hostname:port/hive-udf.jar';
+
+
+-- Example: Use the to_hll UDAF to aggregate and generate HLL, and write it to the Hive HLL table
+insert into hive_hll_table
+select 
+    k1,
+    k2,
+    k3,
+    to_hll(uuid) as uuid
+from 
+    hive_table
+group by 
+    k1,
+    k2,
+    k3;
+
+-- Example: Use hll_cardinality to calculate the number of elements in the HLL
+select k1, k2, k3, hll_cardinality(uuid) from hive_hll_table;
++-----+-----+-----+------+
+| k1  | k2  | k3  | _c3  |
++-----+-----+-----+------+
+| 1   | a   | b   | 1    |
+| 1   | a   | c   | 1    |
+| 2   | b   | c   | 1    |
+| 3   | c   | d   | 1    |
++-----+-----+-----+------+
+
+-- Example: Use hll_union to calculate the union of groups, returning 3 rows
+select k1, hll_union(uuid) from hive_hll_table group by k1;
+
+-- Example: you can also merge first and then continue aggregating
+select k3, hll_cardinality(hll_union(uuid)) from hive_hll_table group by k3;
++-----+------+
+| k3  | _c1  |
++-----+------+
+| b   | 1    |
+| c   | 2    |
+| d   | 1    |
++-----+------+
+```
+
+## Importing Hive HLL to Doris
+
+<version dev>
+
+### Method 1: Catalog (Recommended)
+
+</version>
+
+Create the Hive table in TEXT format. For the binary type, Hive stores it as a base64-encoded string in this format, so you can use a Hive Catalog to import the HLL data directly into Doris with the [hll_from_base64](../sql-manual/sql-functions/hll-functions/hll-from-base64.md) function.
+
+Here is a complete example:
+
+1. Create a Hive table
+
+```sql
+CREATE TABLE IF NOT EXISTS `hive_hll_table`(
+`k1`   int       COMMENT '',
+`k2`   String    COMMENT '',
+`k3`   String    COMMENT '',
+`uuid` binary    COMMENT 'hll'
+) stored as textfile;
+
+-- then reuse the previous steps to insert data from a normal table into it using the to_hll function
+```
+
+2. [Create a Doris catalog](../lakehouse/multi-catalog/hive)
+
+```sql
+CREATE CATALOG hive PROPERTIES (
+    'type'='hms',
+    'hive.metastore.uris' = 'thrift://127.0.0.1:9083'
+);
+```
+
+3. Create a Doris internal table
+
+```sql
+CREATE TABLE IF NOT EXISTS `doris_test`.`doris_hll_table`(
+    `k1`   int                   COMMENT '',
+    `k2`   varchar(10)           COMMENT '',
+    `k3`   varchar(10)           COMMENT '',
+    `uuid` HLL  HLL_UNION  COMMENT 'hll'
+)
+AGGREGATE KEY(k1, k2, k3)
+DISTRIBUTED BY HASH(`k1`) BUCKETS 1
+PROPERTIES (
+    "replication_allocation" = "tag.location.default: 1"
+);
+```
+
+4. Import data from Hive to Doris
+
+```sql
+insert into doris_hll_table select k1, k2, k3, hll_from_base64(uuid) from hive.hive_test.hive_hll_table;
+
+-- View the imported data, using hll_to_base64 to decode it
+select *, hll_to_base64(uuid) from doris_hll_table;
++------+------+------+------+---------------------+
+| k1   | k2   | k3   | uuid | hll_to_base64(uuid) |
++------+------+------+------+---------------------+
+|    1 | a    | b    | NULL | AQFw+a9MhpKhoQ==    |
+|    1 | a    | c    | NULL | AQFw+a9MhpKhoQ==    |
+|    2 | b    | c    | NULL | AQGyB7kbWBxh+A==    |
+|    3 | c    | d    | NULL | AQFYbJB5VpNBhg==    |
++------+------+------+------+---------------------+
+
+-- You can also use Doris's native HLL functions for statistics; the results match the earlier statistics in Hive
+select k3, hll_cardinality(hll_union(uuid)) from doris_hll_table group by k3;
++------+----------------------------------+
+| k3   | hll_cardinality(hll_union(uuid)) |
++------+----------------------------------+
+| b    |                                1 |
+| d    |                                1 |
+| c    |                                2 |
++------+----------------------------------+
+
+-- Querying the external table (i.e., the data before import) also verifies the correctness of the data
+select k3, hll_cardinality(hll_union(hll_from_base64(uuid))) from hive.hive_test.hive_hll_table group by k3;
++------+---------------------------------------------------+
+| k3   | hll_cardinality(hll_union(hll_from_base64(uuid))) |
++------+---------------------------------------------------+
+| d    |                                                 1 |
+| b    |                                                 1 |
+| c    |                                                 2 |
++------+---------------------------------------------------+
+```
+
+### Method 2: Spark Load
+
+See details: [Spark Load](../data-operate/import/import-way/spark-load-manual.md) -> Basic operation -> Creating a load (Example 3: the upstream data source is a Hive binary-type table)
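+
+For orientation, here is a minimal sketch of what such a Spark Load statement can look like. This is an illustration under assumptions rather than the manual's exact example: the label, the Spark resource name spark0, and the SET conversion are placeholders, and the Hive table must already be mapped into Doris as an external table.
+
+```sql
+-- hedged sketch; names and the SET conversion are illustrative, see the linked manual
+LOAD LABEL demo.hive_hll_load
+(
+    DATA FROM TABLE hive_hll_table        -- a Hive external table registered in Doris (assumption)
+    INTO TABLE doris_hll_table
+    SET
+    (
+        uuid = hll_from_base64(uuid)      -- assumes the binary column holds base64 text, as with the TEXT-format table above
+    )
+)
+WITH RESOURCE 'spark0'                    -- a pre-created Spark resource (assumption)
+(
+    "spark.executor.memory" = "2g"
+)
+PROPERTIES
+(
+    "timeout" = "3600"
+);
+```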
diff --git a/docs/sql-manual/sql-functions/hll-functions/hll-from-base64.md 
b/docs/sql-manual/sql-functions/hll-functions/hll-from-base64.md
index c0efeb4644..8f6036fa11 100644
--- a/docs/sql-manual/sql-functions/hll-functions/hll-from-base64.md
+++ b/docs/sql-manual/sql-functions/hll-functions/hll-from-base64.md
@@ -33,6 +33,8 @@ Convert a base64 string(result of function `hll_to_base64`) 
into a hll. If input
 
 ### example
 
+#### query example 
+
 ```
 mysql> select hll_union_agg(hll_from_base64(hll_to_base64(pv))), 
hll_union_agg(pv) from test_hll;
 +---------------------------------------------------+-------------------+
@@ -66,6 +68,20 @@ mysql> select 
hll_cardinality(hll_from_base64(hll_to_base64(hll_hash(NULL))));
 +-----------------------------------------------------------------+
 1 row in set (0.02 sec)
 ```
+#### data import example
+
+Prerequisites:
+
+1. A Hive table named hive_test.hive_hll_table (stored as textfile) has been created with fields k1 int, k2 String, k3 String, uuid binary, and data has been inserted into it from a regular table using the to_hll UDF.
+2. A catalog named hive has been created in Doris to connect to Hive.
+3. A Doris internal table named doris_hll_table has been created with fields k1 int, k2 varchar(10), k3 varchar(10), uuid HLL HLL_UNION.
+
+Then, you can use the hll_from_base64 function to import data from Hive to Doris:
+
+```
+insert into doris_hll_table select k1, k2, k3, hll_from_base64(uuid) from hive.hive_test.hive_hll_table;
+```
+
+For more import details, please refer to: [Hive HLL UDF](../../../ecosystem/hive-hll-udf.md)
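+
+As a quick sanity check after the import, you can compare the cardinality computed on the Doris table with the one computed on the external table (mirroring the verification queries in the Hive HLL UDF doc):
+
+```
+select k3, hll_cardinality(hll_union(uuid)) from doris_hll_table group by k3;
+select k3, hll_cardinality(hll_union(hll_from_base64(uuid))) from hive.hive_test.hive_hll_table group by k3;
+```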
 
 ### keywords
 HLL_FROM_BASE64, HLL
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/ecosystem/hive-hll-udf.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/ecosystem/hive-hll-udf.md
new file mode 100644
index 0000000000..63d7177450
--- /dev/null
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/ecosystem/hive-hll-udf.md
@@ -0,0 +1,251 @@
+---
+{
+    "title": "Hive HLL UDF",
+    "language": "zh-CN"
+}
+---
+
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Hive HLL UDF
+
+The Hive HLL UDF provides a set of UDFs for HLL operations in Hive tables. The HLL in Hive is fully consistent with Doris HLL and can be imported into Doris through Spark HLL Load. For more information about HLL, please refer to [Approximate Deduplication Using HLL](../advanced/using-hll.md).
+
+Function introduction:
+
+1. UDAF
+
+    - to_hll: an aggregate function that returns a Doris HLL column, similar to the to_bitmap function
+
+    - hll_union: an aggregate function, behaving the same as the Doris BE function of the same name; it computes the union of groups and returns a Doris HLL column, similar to the bitmap_union function
+
+2. UDF
+
+    - hll_cardinality: returns the number of distinct elements added to the HLL, similar to the bitmap_count function
+
+Main purposes:
+
+1. Reduce data import time into Doris by removing the dictionary construction and HLL pre-aggregation steps
+2. Save Hive storage by compressing data with HLL, which reduces storage costs significantly compared to Bitmap statistics
+3. Provide flexible HLL operations (union, cardinality statistics) in Hive; the computed HLL can also be imported directly into Doris
+
+Note:
+HLL statistics are approximate, with an error of roughly 1% to 2%.
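+
+For a quick sense of how these functions compose (a minimal sketch assuming the hive_table defined in the Usage section below, with an integer uuid column):
+
+```sql
+-- count distinct uuids per key, entirely inside Hive
+select k1, hll_cardinality(to_hll(uuid)) from hive_table group by k1;
+```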
+
+## Usage
+
+### Create an HLL table and a normal table in Hive, and insert test data into the normal table
+
+```sql
+-- Create a test database, e.g., hive_test
+use hive_test;
+
+-- Example: create a Hive HLL table
+CREATE TABLE IF NOT EXISTS `hive_hll_table`(
+  `k1`   int       COMMENT '',
+  `k2`   String    COMMENT '',
+  `k3`   String    COMMENT '',
+  `uuid` binary    COMMENT 'hll'
+) comment 'comment';
+
+-- Example: create a normal Hive table and insert test data
+CREATE TABLE IF NOT EXISTS `hive_table`(
+    `k1`   int       COMMENT '',
+    `k2`   String    COMMENT '',
+    `k3`   String    COMMENT '',
+    `uuid` int       COMMENT ''
+) comment 'comment';
+
+insert into hive_table select 1, 'a', 'b', 12345;
+insert into hive_table select 1, 'a', 'c', 12345;
+insert into hive_table select 2, 'b', 'c', 23456;
+insert into hive_table select 3, 'c', 'd', 34567;
+```
+
+### Using the Hive HLL UDF
+
+The Hive HLL UDF needs to be used in Hive/Spark. First, compile the FE to obtain the hive-udf.jar file.
+Compilation preparation: if you have already compiled the ldb source code, you can compile the FE directly; if not, you need to install thrift manually. Refer to the compilation and installation steps in [Setting Up Dev Env for FE](/community/developer-guide/fe-idea-dev.md).
+
+```shell
+# Clone the Doris source code
+git clone https://github.com/apache/doris.git
+cd doris
+git submodule update --init --recursive
+
+# Install thrift first (skip if already installed)
+# Enter the FE directory
+cd fe
+
+# Run the Maven packaging command (all FE submodules will be packaged)
+mvn package -Dmaven.test.skip=true
+# Or package only the hive-udf module
+mvn package -pl hive-udf -am -Dmaven.test.skip=true
+
+# After packaging, the hive-udf.jar file is generated in the hive-udf/target directory
+# Upload the compiled hive-udf.jar to HDFS, e.g., to the HDFS root directory
+hdfs dfs -put hive-udf/target/hive-udf.jar /
+```
+
+Then, enter Hive and run the following SQL statements:
+
+```sql
+-- Load the hive hll udf jar package, modify the hostname and port according to your actual situation
+add jar hdfs://hostname:port/hive-udf.jar;
+
+-- Create UDAF functions
+create temporary function to_hll as 'org.apache.doris.udf.ToHllUDAF' USING JAR 'hdfs://hostname:port/hive-udf.jar';
+create temporary function hll_union as 'org.apache.doris.udf.HllUnionUDAF' USING JAR 'hdfs://hostname:port/hive-udf.jar';
+
+
+-- Create UDF functions
+create temporary function hll_cardinality as 'org.apache.doris.udf.HllCardinalityUDF' USING JAR 'hdfs://hostname:port/hive-udf.jar';
+
+
+-- Example: use the to_hll UDAF to aggregate and generate HLL, and write it to the Hive HLL table
+insert into hive_hll_table
+select 
+    k1,
+    k2,
+    k3,
+    to_hll(uuid) as uuid
+from 
+    hive_table
+group by 
+    k1,
+    k2,
+    k3;
+
+-- Example: hll_cardinality computes the number of elements in the HLL
+select k1, k2, k3, hll_cardinality(uuid) from hive_hll_table;
++-----+-----+-----+------+
+| k1  | k2  | k3  | _c3  |
++-----+-----+-----+------+
+| 1   | a   | b   | 1    |
+| 1   | a   | c   | 1    |
+| 2   | b   | c   | 1    |
+| 3   | c   | d   | 1    |
++-----+-----+-----+------+
+
+-- Example: hll_union computes the union of groups; this returns 3 rows
+select k1, hll_union(uuid) from hive_hll_table group by k1;
+
+-- Example: you can also merge first and then continue aggregating
+select k3, hll_cardinality(hll_union(uuid)) from hive_hll_table group by k3;
++-----+------+
+| k3  | _c1  |
++-----+------+
+| b   | 1    |
+| c   | 2    |
+| d   | 1    |
++-----+------+
+```
+
+## Importing Hive HLL into Doris
+
+<version dev>
+
+### Method 1: Catalog (Recommended)
+
+</version>
+
+Create the Hive table in TEXT format. For the binary type, Hive stores it as a base64-encoded string in this format, so you can use a Hive Catalog to insert the HLL data directly into Doris with the [hll_from_base64](../sql-manual/sql-functions/hll-functions/hll-from-base64.md) function.
+
+Here is a complete example:
+
+1. Create a Hive table
+
+```sql
+CREATE TABLE IF NOT EXISTS `hive_hll_table`(
+`k1`   int       COMMENT '',
+`k2`   String    COMMENT '',
+`k3`   String    COMMENT '',
+`uuid` binary    COMMENT 'hll'
+) stored as textfile;
+
+-- then reuse the earlier steps to insert data into hive_hll_table from the normal table using the to_hll function
+```
+
+2. [Create a catalog in Doris](../lakehouse/multi-catalog/hive)
+
+```sql
+CREATE CATALOG hive PROPERTIES (
+    'type'='hms',
+    'hive.metastore.uris' = 'thrift://127.0.0.1:9083'
+);
+```
+
+3. Create a Doris internal table
+
+```sql
+CREATE TABLE IF NOT EXISTS `doris_test`.`doris_hll_table`(
+    `k1`   int                   COMMENT '',
+    `k2`   varchar(10)           COMMENT '',
+    `k3`   varchar(10)           COMMENT '',
+    `uuid` HLL  HLL_UNION  COMMENT 'hll'
+)
+AGGREGATE KEY(k1, k2, k3)
+DISTRIBUTED BY HASH(`k1`) BUCKETS 1
+PROPERTIES (
+    "replication_allocation" = "tag.location.default: 1"
+);
+```
+
+4. Insert data from Hive into Doris
+
+```sql
+insert into doris_hll_table select k1, k2, k3, hll_from_base64(uuid) from hive.hive_test.hive_hll_table;
+
+-- View the imported data, using hll_to_base64 to decode it
+select *, hll_to_base64(uuid) from doris_hll_table;
++------+------+------+------+---------------------+
+| k1   | k2   | k3   | uuid | hll_to_base64(uuid) |
++------+------+------+------+---------------------+
+|    1 | a    | b    | NULL | AQFw+a9MhpKhoQ==    |
+|    1 | a    | c    | NULL | AQFw+a9MhpKhoQ==    |
+|    2 | b    | c    | NULL | AQGyB7kbWBxh+A==    |
+|    3 | c    | d    | NULL | AQFYbJB5VpNBhg==    |
++------+------+------+------+---------------------+
+
+-- You can also use Doris's native HLL functions for further statistics; the results match the earlier statistics in Hive
+select k3, hll_cardinality(hll_union(uuid)) from doris_hll_table group by k3;
++------+----------------------------------+
+| k3   | hll_cardinality(hll_union(uuid)) |
++------+----------------------------------+
+| b    |                                1 |
+| d    |                                1 |
+| c    |                                2 |
++------+----------------------------------+
+
+-- Querying the external table (i.e., the data before import) also verifies the correctness of the data
+select k3, hll_cardinality(hll_union(hll_from_base64(uuid))) from hive.hive_test.hive_hll_table group by k3;
++------+---------------------------------------------------+
+| k3   | hll_cardinality(hll_union(hll_from_base64(uuid))) |
++------+---------------------------------------------------+
+| d    |                                                 1 |
+| b    |                                                 1 |
+| c    |                                                 2 |
++------+---------------------------------------------------+
+```
+
+### Method 2: Spark Load
+
+See details: [Spark Load](../data-operate/import/import-way/spark-load-manual.md) -> Basic operation -> Creating a load (Example 3: the upstream data source is a Hive binary-type table)
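+
+For orientation, here is a minimal sketch of what such a Spark Load statement can look like. This is an illustration under assumptions rather than the manual's exact example: the label, the Spark resource name spark0, and the SET conversion are placeholders, and the Hive table must already be mapped into Doris as an external table.
+
+```sql
+-- hedged sketch; names and the SET conversion are illustrative, see the linked manual
+LOAD LABEL demo.hive_hll_load
+(
+    DATA FROM TABLE hive_hll_table        -- a Hive external table registered in Doris (assumption)
+    INTO TABLE doris_hll_table
+    SET
+    (
+        uuid = hll_from_base64(uuid)      -- assumes the binary column holds base64 text, as with the TEXT-format table above
+    )
+)
+WITH RESOURCE 'spark0'                    -- a pre-created Spark resource (assumption)
+(
+    "spark.executor.memory" = "2g"
+)
+PROPERTIES
+(
+    "timeout" = "3600"
+);
+```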
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/hll-functions/hll-from-base64.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/hll-functions/hll-from-base64.md
index be5b46ec81..3a3b02d699 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/hll-functions/hll-from-base64.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/hll-functions/hll-from-base64.md
@@ -33,6 +33,7 @@ under the License.
 
 ### example
 
+#### query example
 ```
 mysql> select hll_union_agg(hll_from_base64(hll_to_base64(pv))), 
hll_union_agg(pv) from test_hll;
 +---------------------------------------------------+-------------------+
@@ -67,5 +68,19 @@ mysql> select 
hll_cardinality(hll_from_base64(hll_to_base64(hll_hash(NULL))));
 1 row in set (0.02 sec)
 ```
 
+#### data import example
+
+Prerequisites:
+
+1. A Hive table named hive_test.hive_hll_table (stored as textfile) has been created in Hive with fields `k1` int, `k2` String, `k3` String, `uuid` binary, and data has been inserted into it from a normal table using the to_hll UDF.
+2. A catalog named hive has been created in Doris for the connection.
+3. A Doris internal table named doris_hll_table has been created with fields `k1` int, `k2` varchar(10), `k3` varchar(10), `uuid` HLL HLL_UNION.
+
+Then you can use the hll_from_base64 function to insert data from Hive into Doris:
+
+```
+insert into doris_hll_table select k1, k2, k3, hll_from_base64(uuid) from hive.hive_test.hive_hll_table;
+```
+For more import details, please refer to: [Hive HLL UDF](../../../ecosystem/hive-hll-udf.md)
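+
+As a quick sanity check after the import, you can compare the cardinality computed on the Doris table with the one computed on the external table (mirroring the verification queries in the Hive HLL UDF doc):
+
+```
+select k3, hll_cardinality(hll_union(uuid)) from doris_hll_table group by k3;
+select k3, hll_cardinality(hll_union(hll_from_base64(uuid))) from hive.hive_test.hive_hll_table group by k3;
+```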
+
 ### keywords
 HLL_FROM_BASE64,HLL
diff --git a/sidebars.json b/sidebars.json
index 5c13b69a92..b3b4d847b4 100644
--- a/sidebars.json
+++ b/sidebars.json
@@ -294,6 +294,7 @@
                 "ecosystem/audit-plugin",
                 "ecosystem/cloudcanal",
                 "ecosystem/hive-bitmap-udf",
+                "ecosystem/hive-hll-udf",
                 "ecosystem/dbt-doris-adapter",
                 {
                     "type": "category",


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
