[ 
https://issues.apache.org/jira/browse/HIVE-23095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17076450#comment-17076450
 ] 

Zoltan Haindrich commented on HIVE-23095:
-----------------------------------------

[~prasanth_j] I've written a small perf test to see whether things are getting 
better or worse.
I suspected that the use of the TreeMap is not necessary, and I was able to get 
to roughly "similar" runtimes with these changes:
* removed the list
* replaced the TreeMap with fastutil's {{Int2ByteOpenHashMap}}
* there was also a bug in SparseRegister.add(): earlier, when it merged the 
"list", it forgot to add the actual element, so a new element might have been 
missed.
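
To illustrate the idea behind the list above, here is a rough sketch of a 
map-only sparse register. It is not the code from the patch: the class 
{{SparseRegisterSketch}} and its methods are made-up names, and 
{{java.util.HashMap}} stands in for fastutil's {{Int2ByteOpenHashMap}} so that 
the example has no dependencies. Since there is no intermediate list to merge, 
every call to add() reaches the map directly and no element can be skipped.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, NOT Hive's SparseRegister: a sparse register backed
// only by a hash map, with no temporary list that has to be merged later.
class SparseRegisterSketch {
    // register index -> largest value seen so far for that index
    // (assumption: HashMap used here in place of fastutil's Int2ByteOpenHashMap)
    private final Map<Integer, Byte> registers = new HashMap<>();

    /**
     * Stores the value for the given register index if it is larger than the
     * one already stored. Returns true if the map was changed.
     */
    boolean add(int index, byte value) {
        Byte current = registers.get(index);
        if (current == null || value > current) {
            registers.put(index, value);
            return true;
        }
        return false;
    }

    /** Returns the stored value for the index, or 0 if the index is unset. */
    byte get(int index) {
        Byte v = registers.get(index);
        return v == null ? 0 : v;
    }

    /** Number of registers that currently hold a value. */
    int size() {
        return registers.size();
    }
}
```

A hash map gives up the sorted iteration order of a TreeMap; the sketch 
assumes that order is not needed when adding elements.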

I've attached some bench results [^hll-bench.md], which I recorded while 
making these changes.

> NDV might be overestimated for a table with ~70 value
> -----------------------------------------------------
>
>                 Key: HIVE-23095
>                 URL: https://issues.apache.org/jira/browse/HIVE-23095
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-23095.01.patch, HIVE-23095.02.patch, 
> HIVE-23095.03.patch, HIVE-23095.04.patch, HIVE-23095.04.patch, 
> HIVE-23095.04.patch, HIVE-23095.05.patch, hll-bench.md
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> uncovered during looking into HIVE-23082
> https://issues.apache.org/jira/browse/HIVE-23082?focusedCommentId=17067773&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17067773



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
