[
https://issues.apache.org/jira/browse/HIVE-24510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HIVE-24510:
----------------------------------
Labels: pull-request-available (was: )
> Vectorize compute_bit_vector
> ----------------------------
>
> Key: HIVE-24510
> URL: https://issues.apache.org/jira/browse/HIVE-24510
> Project: Hive
> Issue Type: Improvement
> Reporter: Mustafa İman
> Assignee: Mustafa İman
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> After https://issues.apache.org/jira/browse/HIVE-23530, almost all compute
> stats functions are vectorizable. The only function that is not vectorizable
> is "compute_bit_vector", which is used for NDV (number of distinct values)
> statistics computation. This causes "create table as select" and "insert
> overwrite select" queries to run in non-vectorized mode.
> Even a very naive implementation of vectorized compute_bit_vector gives
> about a 50% performance improvement on simple "insert overwrite select"
> queries. That is because the entire mapper or reducer can then run in
> vectorized mode.
>
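To illustrate the difference between row mode and vectorized mode described in the issue, here is a minimal sketch in Java. It is not Hive's implementation: the class name BitVectorSketch, the sketch size NUM_BITS, and the methods addRow, addBatch, and estimate are all hypothetical, and a simple linear-counting bitmap stands in for whatever NDV estimator compute_bit_vector actually uses. The point is only that a vectorized aggregate consumes a whole column batch in one tight loop instead of being called once per row.

```java
import java.util.BitSet;

// Hypothetical illustration only; none of these names come from Hive.
final class BitVectorSketch {
    // Assumed sketch size (power of two so a mask can pick the bucket).
    static final int NUM_BITS = 1 << 16;

    private final BitSet bits = new BitSet(NUM_BITS);

    // Mixes the value (MurmurHash3 64-bit finalizer) and maps it to a bit index.
    private static int bucket(long v) {
        v ^= v >>> 33;
        v *= 0xff51afd7ed558ccdL;
        v ^= v >>> 33;
        v *= 0xc4ceb9fe1a85ec53L;
        v ^= v >>> 33;
        return (int) (v & (NUM_BITS - 1));
    }

    // Row mode: the operator calls this once for every row.
    void addRow(long value) {
        bits.set(bucket(value));
    }

    // Vectorized mode: the operator calls this once per column batch.
    // The (vector, isNull, size) layout is an assumption, loosely modeled
    // on a columnar batch of long values.
    void addBatch(long[] vector, boolean[] isNull, int size) {
        for (int i = 0; i < size; i++) {
            if (!isNull[i]) {
                bits.set(bucket(vector[i]));
            }
        }
    }

    // Linear-counting estimate: -m * ln(emptyBits / m).
    double estimate() {
        int empty = NUM_BITS - bits.cardinality();
        if (empty == 0) {
            return Double.POSITIVE_INFINITY; // sketch is saturated
        }
        return -NUM_BITS * Math.log((double) empty / NUM_BITS);
    }
}
```

Both entry points update the same bit vector, so feeding identical values through addRow or addBatch yields the same estimate. The batch path simply avoids per-row call overhead, which is the kind of saving that lets the whole mapper or reducer stay in vectorized mode.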