nooneuse commented on code in PR #3711:
URL: https://github.com/apache/doris-website/pull/3711#discussion_r3331938803


##########
docs/sql-manual/sql-functions/aggregate-functions/datasketches_hll_union_agg.md:
##########
@@ -0,0 +1,101 @@
+{
+"title": "DATASKETCHES_HLL_UNION_AGG",
+"language": "en",
+"description": "The datasketches_hll_union_agg function is an aggregate function used to union multiple Apache DataSketches HLL sketches and return the estimated cardinality of the union as a DOUBLE value."
+}
+---
+
+## Description
+
+`datasketches_hll_union_agg` is an aggregate function used to **union** multiple Apache DataSketches **HLL** (`hll_sketch`) serialized values and return the **estimated cardinality** (approximate distinct count / NDV) after union.
+
+This function expects the input to be **serialized bytes of a DataSketches HLL sketch** (for example, generated by `hll_sketch.serialize_compact()` in the DataSketches library). It does not accept arbitrary strings.
+
+Aliases:
+
+- `ds_hll_estimate`
+- `datasketches_hll_estimate`
+
+## Syntax
+
+```sql
+datasketches_hll_union_agg(<sketch>)
+```
+
+## Parameters
+
+| Parameter | Description |
+| -- | -- |
+| `<sketch>` | The serialized bytes of an Apache DataSketches HLL sketch. Supported types: STRING / VARCHAR / BINARY / VARBINARY. NULL values are ignored. Empty strings are treated as invalid input and will throw an error. |
+
+## Return Value
+
+Returns the cardinality estimate as a DOUBLE (Float64).  
+If the group contains no non-NULL sketches (for example, it has no rows or every value is NULL), returns 0.  
+If the input bytes cannot be deserialized as a valid DataSketches HLL sketch (including an empty string), an error is thrown (typically with error code `CORRUPTION`).
+
+## Example
+
+```sql
+-- setup
+CREATE TABLE test_datasketches_hll_union_agg_tbl (
+    id INT,
+    sk STRING
+)
+DISTRIBUTED BY HASH(id) BUCKETS 1
+PROPERTIES ("replication_num" = "1");
+
+-- The sketch bytes are inserted via Base64 decoding.
+INSERT INTO test_datasketches_hll_union_agg_tbl VALUES
+    (1, from_base64('AgEHCAMIBwjL18IEK/L7BoYv+Q11gWYHgbxdBntl5gj8LUIK')),
+    (2, from_base64('AwEHCAUIAAkKAAAAIjvrBcS1nwfGGWoEyHokBO8t9wc1qTEENkcJB7hWqQxZf9QNnuSbGA==')),
+    (3, NULL);
+```
+
+```sql
+-- The function returns DOUBLE, so use ROUND/CAST if you want an integer display.
+SELECT CAST(ROUND(datasketches_hll_union_agg(sk)) AS BIGINT)
+FROM test_datasketches_hll_union_agg_tbl;
+```
+
+```text
++------------------------------------------------------+
+| CAST(ROUND(datasketches_hll_union_agg(sk)) AS BIGINT) |
++------------------------------------------------------+
+|                                                   17 |
++------------------------------------------------------+
+```
+
+```sql
+-- aliases
+SELECT
+    CAST(ROUND(datasketches_hll_union_agg(sk)) AS BIGINT),
+    CAST(ROUND(ds_hll_estimate(sk)) AS BIGINT),
+    CAST(ROUND(datasketches_hll_estimate(sk)) AS BIGINT)
+FROM test_datasketches_hll_union_agg_tbl;

Review Comment:
   Okay, I have added the query results.
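
   As a rough cross-check of the `17` shown in the example output, the two Base64 payloads from the `INSERT` statement can be decoded and their headers inspected. The sketch below is only illustrative: the byte offsets are an assumption based on the DataSketches HLL serialization format (they are not stated in this PR), and `inspect_sketch` is a hypothetical helper name, not part of Doris or DataSketches.

```python
import base64
import struct

# Payloads copied from the INSERT statement in the example.
SKETCH_1 = "AgEHCAMIBwjL18IEK/L7BoYv+Q11gWYHgbxdBntl5gj8LUIK"
SKETCH_2 = ("AwEHCAUIAAkKAAAAIjvrBcS1nwfGGWoEyHokBO8t9wc1qTEENkcJB7hWqQxZf9QN"
            "nuSbGA==")

# Assumed preamble layout (from the DataSketches HLL format, not from the PR):
# byte 2 = family ID, byte 3 = lgK, byte 7 = mode byte (low two bits = mode).
# LIST mode keeps its item count in byte 6; SET mode keeps it as a
# little-endian int32 at bytes 8..11.
MODES = {0: "LIST", 1: "SET", 2: "HLL"}


def inspect_sketch(b64_text):
    """Decode a Base64 payload and read a few header fields (hypothetical helper)."""
    raw = base64.b64decode(b64_text)
    mode = MODES.get(raw[7] & 0x03, "UNKNOWN")
    if mode == "LIST":
        count = raw[6]
    elif mode == "SET":
        count = struct.unpack_from("<i", raw, 8)[0]
    else:
        count = None  # dense HLL mode stores registers, not an explicit item count
    return {"family_id": raw[2], "lg_k": raw[3], "mode": mode,
            "count": count, "size": len(raw)}


if __name__ == "__main__":
    for name, payload in (("row 1", SKETCH_1), ("row 2", SKETCH_2)):
        print(name, inspect_sketch(payload))
```

   Under that assumed layout, row 1 decodes as a LIST-mode sketch holding 7 items and row 2 as a SET-mode sketch holding 10. The counts sum to 17, which is consistent with the documented result provided the two sketches share no items.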



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

