[
https://issues.apache.org/jira/browse/HIVE-23095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17076450#comment-17076450
]
Zoltan Haindrich commented on HIVE-23095:
-----------------------------------------
[~prasanth_j] I've written a small perf test to see whether things are
getting better or worse.
I suspected that the use of TreeMap is not necessary, and I was able to get
to somewhat "similar" runtimes with the following changes:
* removed the list
* replaced the TreeMap with fastutil's {{Int2ByteOpenHashMap}}
* there was also a bug in SparseRegister.add(): earlier, when it merged the
"list", it forgot to add the actual element, so a new element might have been
missed.
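To illustrate the idea behind these changes, here is a minimal sketch of a sparse register backed by a single hash map, with no temporary list to merge. It is not the code from the patch: the class name {{SparseRegisterSketch}}, the method signatures, and the way the hash is split into index and rank are assumptions made for illustration, and {{java.util.HashMap}} stands in for fastutil's {{Int2ByteOpenHashMap}} so the example needs no extra dependency.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Hive's SparseRegister: one map, no staging list.
final class SparseRegisterSketch {
    private final int p; // number of low hash bits used as the register index (assumed layout)
    // Stand-in for a primitive int->byte map such as fastutil's Int2ByteOpenHashMap.
    private final Map<Integer, Byte> registers = new HashMap<>();

    SparseRegisterSketch(int p) {
        this.p = p;
    }

    /** Adds a 64-bit hash; returns true if a register was created or raised. */
    boolean add(long hash) {
        int index = (int) (hash & ((1L << p) - 1));
        long rest = hash >>> p;
        // 1-based position of the lowest set bit in the remaining bits,
        // capped so that an all-zero remainder still yields a valid rank.
        byte rank = (byte) (Math.min(Long.numberOfTrailingZeros(rest), 64 - p) + 1);
        return set(index, rank);
    }

    /** Stores the rank only if it is larger than what the register already holds. */
    boolean set(int index, byte rank) {
        Byte current = registers.get(index);
        if (current == null || current < rank) {
            registers.put(index, rank);
            return true;
        }
        return false;
    }

    byte get(int index) {
        return registers.getOrDefault(index, (byte) 0);
    }

    int size() {
        return registers.size();
    }
}
```

Because every call to add() goes straight to the map, there is no separate merge step in which the element being added could be dropped.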
I've attached some bench results [^hll-bench.md], which I recorded while
making these changes.
> NDV might be overestimated for a table with ~70 value
> -----------------------------------------------------
>
> Key: HIVE-23095
> URL: https://issues.apache.org/jira/browse/HIVE-23095
> Project: Hive
> Issue Type: Bug
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-23095.01.patch, HIVE-23095.02.patch,
> HIVE-23095.03.patch, HIVE-23095.04.patch, HIVE-23095.04.patch,
> HIVE-23095.04.patch, HIVE-23095.05.patch, hll-bench.md
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Uncovered while looking into HIVE-23082
> https://issues.apache.org/jira/browse/HIVE-23082?focusedCommentId=17067773&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17067773