Mustafa İman created HIVE-24510:
-----------------------------------

             Summary: Vectorize compute_bit_vector
                 Key: HIVE-24510
                 URL: https://issues.apache.org/jira/browse/HIVE-24510
             Project: Hive
          Issue Type: Improvement
            Reporter: Mustafa İman
            Assignee: Mustafa İman


After https://issues.apache.org/jira/browse/HIVE-23530 , almost all compute 
stats functions are vectorizable. The only function that is not vectorizable is 
"compute_bit_vector", which is used for NDV statistics computation. As a result, 
"create table as select" and "insert overwrite select" queries run in 
non-vectorized mode. 

Even a very naive implementation of vectorized compute_bit_vector gives about 
a 50% performance improvement on simple "insert overwrite select" queries, 
because the entire mapper or reducer can then run in vectorized mode.
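
To illustrate the difference between row-mode and batch-mode aggregation, here 
is a minimal sketch in Java. It is not Hive's actual compute_bit_vector 
implementation or its vectorization API: the class BitVectorSketch, its 
methods, the bitmap size, and the hash mixing function are all hypothetical, 
and a plain bitmap stands in for whatever NDV estimator Hive really uses. The 
batch layout (a values array, an isNull array, and a noNulls flag) is only 
assumed to resemble a columnar batch.

```java
import java.util.BitSet;

// Toy example only: a fixed-size bitmap that records which hash buckets were seen.
// All names here are hypothetical and do not come from Hive.
final class BitVectorSketch {
    private static final int NUM_BITS = 1 << 12; // assumed size, power of two
    private final BitSet bits = new BitSet(NUM_BITS);

    // Mixes the bits of a 64-bit value and maps it to a bucket in [0, NUM_BITS).
    private static int bucket(long v) {
        long h = v;
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return (int) (h & (NUM_BITS - 1));
    }

    // Row mode: one call (and one null check on a boxed value) per row.
    void addRow(Long value) {
        if (value != null) {
            bits.set(bucket(value));
        }
    }

    // Batch mode: one call per batch; the loop runs over primitive arrays.
    void addBatch(long[] values, boolean[] isNull, boolean noNulls, int size) {
        if (noNulls) {
            // Fast path: no per-row null check at all.
            for (int i = 0; i < size; i++) {
                bits.set(bucket(values[i]));
            }
        } else {
            for (int i = 0; i < size; i++) {
                if (!isNull[i]) {
                    bits.set(bucket(values[i]));
                }
            }
        }
    }

    // Number of distinct buckets observed so far.
    int bitsSet() {
        return bits.cardinality();
    }

    // True if both sketches have recorded exactly the same buckets.
    boolean sameAs(BitVectorSketch other) {
        return bits.equals(other.bits);
    }

    public static void main(String[] args) {
        long[] values = {1L, 2L, 2L, 3L, 0L};
        boolean[] isNull = {false, false, false, false, true};

        BitVectorSketch rowMode = new BitVectorSketch();
        for (int i = 0; i < values.length; i++) {
            rowMode.addRow(isNull[i] ? null : values[i]);
        }

        BitVectorSketch batchMode = new BitVectorSketch();
        batchMode.addBatch(values, isNull, false, values.length);

        // Both modes should record the same buckets.
        System.out.println(rowMode.sameAs(batchMode));
    }
}
```

Both paths produce the same bitmap. The point of the batch path is that a 
single non-vectorizable function no longer forces the rest of the operator 
pipeline back to row-at-a-time processing.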

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
