[ https://issues.apache.org/jira/browse/HIVE-28528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
yongzhi.shao updated HIVE-28528:
--------------------------------
Description:
Since the roaringbitmap dependency has been introduced in the hive-ql module,
can we take this opportunity to introduce bitmap-related UDFs? They can be
used to quickly compute intersections, unions, and differences, de-duplicated
counts, and other similar computations.
If so, I can do this.
DEMO:
{code:sql}
CREATE TABLE IF NOT EXISTS `hive_bitmap_table`
(
  k int,
  uuid bigint,
  bitmap binary
)
STORED AS ORC;

-- demo
select count(distinct uuid) from hive_bitmap_table;
select bitmap_count(to_bitmap(uuid)) from hive_bitmap_table;

insert into table hive_bitmap_table select 2 as k, 2 as uuid, to_bitmap(2) as bitmap;
{code}
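The de-duplication idea behind the demo can be sketched in plain Java. This is only an illustration under stated assumptions: it uses java.util.BitSet from the JDK as a stand-in for RoaringBitmap, it handles only non-negative int values, it assumes the per-row bitmaps are merged into one bitmap before counting, and the names DistinctCountSketch, unionOf, and bitmapCount are hypothetical, not part of Hive or of any existing implementation.

```java
import java.util.BitSet;

// Hypothetical sketch; not Hive or RoaringBitmap code.
final class DistinctCountSketch {
    // Stand-in for to_bitmap followed by a union over all rows:
    // each value sets one bit, so repeated values collapse into one bit.
    static BitSet unionOf(int... values) {
        BitSet bitmap = new BitSet();
        for (int v : values) {
            bitmap.set(v); // setting an already-set bit changes nothing
        }
        return bitmap;
    }

    // Stand-in for bitmap_count: number of bits set in the bitmap.
    static int bitmapCount(BitSet bitmap) {
        return bitmap.cardinality();
    }

    public static void main(String[] args) {
        int[] uuids = {2, 7, 2, 9, 7};
        // Distinct values are 2, 7, and 9.
        System.out.println(bitmapCount(unionOf(uuids))); // prints 3
    }
}
```

Because the bitmap stores each value at most once, its cardinality equals the number of distinct input values, which is what count(distinct uuid) computes.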
|UDF|Description|Usage|Return type|
|to_bitmap|Convert a number (int or bigint) to a bitmap|to_bitmap(num)|bitmap (binary)|
|bitmap_union|Merge multiple bitmaps into one bitmap (union)|bitmap_union(bitmap)|bitmap|
|bitmap_count|Count the elements stored in the bitmap|bitmap_count(bitmap)|long|
|bitmap_and|Compute the intersection of two bitmaps|bitmap_and(bitmap1,bitmap2)|bitmap|
|bitmap_or|Compute the union of two bitmaps|bitmap_or(bitmap1,bitmap2)|bitmap|
|bitmap_xor|Compute the symmetric difference of two bitmaps|bitmap_xor(bitmap1,bitmap2)|bitmap|
|bitmap_from_array|Convert an array to a bitmap|bitmap_from_array(array)|bitmap|
|bitmap_to_array|Convert a bitmap to an array|bitmap_to_array(bitmap)|array<bigint>|
|bitmap_contains|Determine whether a bitmap contains all the elements of another bitmap|bitmap_contains(bitmap1,bitmap2)|boolean|
|bitmap_contains|Determine whether a bitmap contains an element|bitmap_contains(bitmap,num)|boolean|
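To make the intended semantics of the functions in the table concrete, here is a minimal Java sketch. It is an assumption-laden illustration, not the proposed implementation: java.util.BitSet from the JDK stands in for RoaringBitmap, only non-negative int values are supported, and the class and method names (BitmapOpsSketch, fromArray, and, or, xor, containsAll, contains, toArray) are hypothetical.

```java
import java.util.BitSet;

// Hypothetical illustration of the proposed semantics; not Hive or RoaringBitmap code.
final class BitmapOpsSketch {
    // bitmap_from_array: one bit per element (non-negative ints only in this sketch).
    static BitSet fromArray(int... values) {
        BitSet bitmap = new BitSet();
        for (int v : values) {
            bitmap.set(v);
        }
        return bitmap;
    }

    // bitmap_and: elements present in both bitmaps.
    static BitSet and(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.and(b);
        return result;
    }

    // bitmap_or: elements present in either bitmap.
    static BitSet or(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.or(b);
        return result;
    }

    // bitmap_xor: elements present in exactly one of the two bitmaps.
    static BitSet xor(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.xor(b);
        return result;
    }

    // bitmap_contains(bitmap1, bitmap2): true if every element of b is also in a.
    static boolean containsAll(BitSet a, BitSet b) {
        BitSet missing = (BitSet) b.clone();
        missing.andNot(a); // clears every bit of 'missing' that is set in a
        return missing.isEmpty();
    }

    // bitmap_contains(bitmap, num): true if the single value is in the bitmap.
    static boolean contains(BitSet a, int value) {
        return a.get(value);
    }

    // bitmap_to_array: elements in ascending order.
    static int[] toArray(BitSet a) {
        return a.stream().toArray();
    }

    public static void main(String[] args) {
        BitSet a = fromArray(1, 2, 3);
        BitSet b = fromArray(2, 3, 4);
        System.out.println(and(a, b));                       // {2, 3}
        System.out.println(or(a, b));                        // {1, 2, 3, 4}
        System.out.println(xor(a, b));                       // {1, 4}
        System.out.println(containsAll(a, fromArray(1, 3))); // true
        System.out.println(contains(a, 4));                  // false
    }
}
```

Each operation returns a new bitmap and leaves its inputs unchanged, which matches how a scalar UDF would be expected to behave.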
was:
Since the roaringbitmap dependency has been introduced in the hive-ql module,
can we take this opportunity to introduce bitmap-related UDFs? They can be
used to quickly compute intersections, unions, and differences, de-duplicated
counts, and other similar computations.
If so, I can do this.
|UDF|Description|Usage|Return type|
|to_bitmap|Convert a number (int or bigint) to a bitmap|to_bitmap(num)|bitmap|
|bitmap_union|Merge multiple bitmaps into one bitmap (union)|bitmap_union(bitmap)|bitmap|
|bitmap_count|Count the elements stored in the bitmap|bitmap_count(bitmap)|long|
|bitmap_and|Compute the intersection of two bitmaps|bitmap_and(bitmap1,bitmap2)|bitmap|
|bitmap_or|Compute the union of two bitmaps|bitmap_or(bitmap1,bitmap2)|bitmap|
|bitmap_xor|Compute the symmetric difference of two bitmaps|bitmap_xor(bitmap1,bitmap2)|bitmap|
|bitmap_from_array|Convert an array to a bitmap|bitmap_from_array(array)|bitmap|
|bitmap_to_array|Convert a bitmap to an array|bitmap_to_array(bitmap)|array<bigint>|
|bitmap_contains|Determine whether a bitmap contains all the elements of another bitmap|bitmap_contains(bitmap1,bitmap2)|boolean|
|bitmap_contains|Determine whether a bitmap contains an element|bitmap_contains(bitmap,num)|boolean|
> Support Bitmap function
> -----------------------
>
> Key: HIVE-28528
> URL: https://issues.apache.org/jira/browse/HIVE-28528
> Project: Hive
> Issue Type: Improvement
> Security Level: Public(Viewable by anyone)
> Components: UDF
> Affects Versions: 4.0.0
> Reporter: yongzhi.shao
> Priority: Major
--
This message was sent by Atlassian Jira
(v8.20.10#820010)