[ 
https://issues.apache.org/jira/browse/HIVE-24205?focusedWorklogId=496055&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496055
 ]

ASF GitHub Bot logged work on HIVE-24205:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Oct/20 17:42
            Start Date: 06/Oct/20 17:42
    Worklog Time Spent: 10m 
      Work Description: mustafaiman closed pull request #1549:
URL: https://github.com/apache/hive/pull/1549

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 496055)
    Time Spent: 20m  (was: 10m)

> Optimise CuckooSetBytes
> -----------------------
>
>                 Key: HIVE-24205
>                 URL: https://issues.apache.org/jira/browse/HIVE-24205
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Assignee: Mustafa Iman
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: Screenshot 2020-09-28 at 4.29.24 PM.png, bench.png, 
> vectorized.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{FilterStringColumnInList, StringColumnInList}}, etc. use CuckooSetBytes for 
> lookup.
> !Screenshot 2020-09-28 at 4.29.24 PM.png|width=714,height=508!
> One option to optimize this would be to add boundary conditions on "length", 
> using the min/max length of the values stored in the hashes (ref: 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CuckooSetBytes.java#L85]).
> This would significantly reduce the number of hash computations that need 
> to happen, e.g. in 
> [TPCH-Q12|https://github.com/hortonworks/hive-testbench/blob/hdp3/sample-queries-tpch/tpch_query12.sql#L20].
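The length-bound idea in the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual CuckooSetBytes code or the patch attached to the issue: the class name LengthBoundedByteSet and its fields are hypothetical, and a plain java.util.HashSet stands in for the cuckoo hash tables. The point it shows is that tracking the shortest and longest inserted value lets a lookup reject out-of-range candidates before any hash is computed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; a HashSet stands in for the real cuckoo tables. */
final class LengthBoundedByteSet {
    private final Set<ByteBuffer> values = new HashSet<>();
    // Bounds start "inverted" so that an empty set rejects every lookup.
    private int minLen = Integer.MAX_VALUE;
    private int maxLen = 0;

    /** Adds a value and widens the tracked length bounds if needed. */
    void insert(byte[] value) {
        values.add(ByteBuffer.wrap(value.clone()));
        minLen = Math.min(minLen, value.length);
        maxLen = Math.max(maxLen, value.length);
    }

    /** Returns true if buf[start, start + len) equals a stored value. */
    boolean lookup(byte[] buf, int start, int len) {
        // Cheap rejection: no stored value has this length, so skip hashing.
        if (len < minLen || len > maxLen) {
            return false;
        }
        // Only candidates within the length bounds pay for hash + compare.
        return values.contains(ByteBuffer.wrap(buf, start, len));
    }

    public static void main(String[] args) {
        LengthBoundedByteSet set = new LengthBoundedByteSet();
        // Illustrative IN-list values, both four bytes long.
        set.insert("MAIL".getBytes(StandardCharsets.UTF_8));
        set.insert("SHIP".getBytes(StandardCharsets.UTF_8));

        byte[] row = "REG AIR".getBytes(StandardCharsets.UTF_8);
        // Seven bytes is outside [4, 4], so this is rejected without hashing.
        System.out.println(set.lookup(row, 0, row.length));
    }
}
```

When every value in the IN list has a similar length and most column values do not, the early return handles the bulk of the rows. How much this helps in practice depends on the data distribution; the sketch makes no claim about the measured gains.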



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
