[ 
https://issues.apache.org/jira/browse/HIVE-24510?focusedWorklogId=532991&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-532991
 ]

ASF GitHub Bot logged work on HIVE-24510:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Jan/21 12:15
            Start Date: 08/Jan/21 12:15
    Worklog Time Spent: 10m 
      Work Description: mustafaiman edited a comment on pull request #1824:
URL: https://github.com/apache/hive/pull/1824#issuecomment-756725412


   I made a quick fix to allow that in early versions of this patch. Then I 
decided not to pursue it because I did not see the need to allow a constant 
argument at runtime.
   
   > you can still do something like:
   > if compute_bit_vector: -> handle constant parameter
   
   We do exactly that, not in the vectorizer but earlier, in 
`ColumnStatsSemanticAnalyzer.java`. I am reluctant to implement extra 
functionality or add special cases unless it is necessary. Note that 
compute_bit_vector is a newly added UDF in 4.0, so there is no backward 
compatibility concern either.
   Do you see any benefit other than preserving the earlier q.out outputs?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 532991)
    Time Spent: 2h 40m  (was: 2.5h)

> Vectorize compute_bit_vector
> ----------------------------
>
>                 Key: HIVE-24510
>                 URL: https://issues.apache.org/jira/browse/HIVE-24510
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Mustafa İman
>            Assignee: Mustafa İman
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> After https://issues.apache.org/jira/browse/HIVE-23530 , almost all compute 
> stats functions are vectorizable. The only function that is not vectorizable 
> is "compute_bit_vector", used for NDV statistics computation. This causes 
> "create table as select" and "insert overwrite select" queries to run in 
> non-vectorized mode. 
> Even a very naive implementation of vectorized compute_bit_vector gives about 
> a 50% performance improvement on simple "insert overwrite select" queries. 
> That is because the entire mapper or reducer can run in vectorized mode.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
