[ https://issues.apache.org/jira/browse/HIVE-23976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191703#comment-17191703 ]
Stamatis Zampetakis commented on HIVE-23976:
--------------------------------------------
Hey [~abstractdog], thanks for taking over this :)
I had a quick look at the PR and noticed that the vectorized hash
implementation seems to be a binary operator (two inputs, one output), while the
non-vectorized alternative (GenericUDFMurmurHash) is an n-ary operator.
Can we make the vectorized implementation n-ary, or should we rather transform
an expression {{hash(a,b,c,d)}} into something like {{hash(hash(hash(a,b),c),d)}}?
In any case, if we don't address this now, we should create a follow-up JIRA.
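To make the second option concrete, here is a rough sketch of the rewrite as a left fold over the argument list. The class {{NestedHashSketch}} and the methods {{nest}}, {{hash2}} and {{foldHash}} are hypothetical names used only for illustration, and {{hash2}} is a placeholder combiner, not the hash used in the PR or in GenericUDFMurmurHash. Whether the nested form yields the same value as the n-ary UDF depends on how that UDF combines its arguments, which this sketch does not check.

```java
import java.util.List;

public class NestedHashSketch {

    // Rewrites fn(a,b,c,d) into fn(fn(fn(a,b),c),d) as a left fold over the arguments.
    static String nest(String fn, List<String> args) {
        if (args.size() < 2) {
            throw new IllegalArgumentException("need at least two arguments");
        }
        String expr = fn + "(" + args.get(0) + "," + args.get(1) + ")";
        for (int i = 2; i < args.size(); i++) {
            expr = fn + "(" + expr + "," + args.get(i) + ")";
        }
        return expr;
    }

    // Placeholder binary combiner (assumption): stands in for a two-input hash operator.
    static int hash2(int left, int right) {
        return 31 * left + right;
    }

    // Evaluates the nested form by applying the binary operator left to right.
    static int foldHash(int... cols) {
        if (cols.length < 2) {
            throw new IllegalArgumentException("need at least two columns");
        }
        int acc = hash2(cols[0], cols[1]);
        for (int i = 2; i < cols.length; i++) {
            acc = hash2(acc, cols[i]);
        }
        return acc;
    }

    public static void main(String[] args) {
        // Prints the nested expression for four columns.
        System.out.println(nest("hash", List.of("a", "b", "c", "d")));
        // The fold and the hand-written nesting agree by construction.
        System.out.println(foldHash(1, 2, 3, 4) == hash2(hash2(hash2(1, 2), 3), 4));
    }
}
```

If the rewrite route is taken, both sides of the semi-join reducer would need to evaluate the same nested expression, otherwise the hashes built on one side would not match the ones probed on the other.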
> Enable vectorization for multi-col semi join reducers
> -----------------------------------------------------
>
> Key: HIVE-23976
> URL: https://issues.apache.org/jira/browse/HIVE-23976
> Project: Hive
> Issue Type: Improvement
> Reporter: Stamatis Zampetakis
> Assignee: László Bodor
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> HIVE-21196 introduces multi-column semi-join reducers in the query engine.
> However, the implementation relies on GenericUDFMurmurHash, which is not
> vectorized, so the respective operators cannot be executed in vectorized
> mode.