[ https://issues.apache.org/jira/browse/HIVE-27269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725691#comment-17725691 ]
Seonggon Namgung commented on HIVE-27269:
-----------------------------------------
To reproduce this issue, set both of the following configuration properties:
1. hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled=true
2. hive.mapjoin.hashtable.load.threads=2 (or any higher integer)
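As a minimal sketch, assuming the query is run from a Beeline or Hive CLI session, these two properties can be set together with the two properties named in the issue description before running query 97:

```sql
-- From the issue description: take the vectorized map join path.
SET hive.auto.convert.join=true;
SET hive.vectorized.execution.enabled=true;
-- From this comment: use the fast hash table and load it with multiple threads.
SET hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled=true;
SET hive.mapjoin.hashtable.load.threads=2;
```

The value 2 is only the smallest thread count mentioned here; any higher integer should serve the same purpose.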
> VectorizedMapJoin returns wrong result for TPC-DS query 97
> ----------------------------------------------------------
>
> Key: HIVE-27269
> URL: https://issues.apache.org/jira/browse/HIVE-27269
> Project: Hive
> Issue Type: Sub-task
> Reporter: Seonggon Namgung
> Priority: Blocker
> Labels: hive-4.0.0-must
>
> TPC-DS query 97 returns wrong results when hive.auto.convert.join and
> hive.vectorized.execution.enabled are set to true.
>
> Result of query 97 on 1TB text dataset:
> CommonMergeJoinOperator (hive.auto.convert.join=false): 534151529, 284185746, 84163
> MapJoinOperator (hive.auto.convert.join=true, hive.vectorized.execution.enabled=false): 534151529, 284185746, 84163
> VectorMapJoinOperator (hive.auto.convert.join=true, hive.vectorized.execution.enabled=true): 534151529, 284185388, 84163
>
> I also observed that VectorizedMapJoin returned different results on the 100GB
> dataset when I ran query 97 twice, but I have not been able to reproduce this since.