[ 
https://issues.apache.org/jira/browse/IMPALA-9633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab closed IMPALA-9633.
--------------------------------
    Fix Version/s: Impala 4.0
       Resolution: Fixed

> Implement ds_hll_union() builtin function
> -----------------------------------------
>
>                 Key: IMPALA-9633
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9633
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend, Frontend
>            Reporter: Gabor Kaszab
>            Assignee: Gabor Kaszab
>            Priority: Major
>             Fix For: Impala 4.0
>
>
> ds_hll_union() is an aggregating function that accepts sketches and produces 
> a single scratch that is the combination of the received scratches.
> Example from Hive:
> {code:java}
> create temporary table sketch_intermediate (category char(1), sketch binary);
> insert into sketch_intermediate select category, ds_hll_sketch(id) from 
> sketch_input group by category;
> select ds_hll_estimate(ds_hll_union(sketch)) from sketch_intermediate;
> {code}
> Some test data for the example:
> {code:java}
> create temporary table sketch_input (id int, category char(1));
> insert into table sketch_input values
>   (1, 'a'), (2, 'a'), (3, 'a'), (4, 'a'), (5, 'a'), (6, 'a'), (7, 'a'), (8, 
> 'a'), (9, 'a'), (10, 'a'),
>   (6, 'b'), (7, 'b'), (8, 'b'), (9, 'b'), (10, 'b'), (11, 'b'), (12, 'b'), 
> (13, 'b'), (14, 'b'), (15, 'b');
> {code}
> Approximate result:
> {code:java}
> 15.000000521540663
> {code}
> Hive change that introduced the same: 
> https://issues.apache.org/jira/browse/HIVE-22940



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to