[ https://issues.apache.org/jira/browse/HIVE-28735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17923937#comment-17923937 ]
Sungwoo Park commented on HIVE-28735:
-------------------------------------
No, we have not seen failures of the two queries. In our experiments with
TPC-DS (usually 10TB scale with 10+ nodes), we have always set
hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled to true. So, I
think we need more details in order to reproduce the problem.
> TPCDS queries q15, q19 are failing when
> hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled is set to
> true
> -----------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-28735
> URL: https://issues.apache.org/jira/browse/HIVE-28735
> Project: Hive
> Issue Type: Sub-task
> Components: Vectorization
> Affects Versions: 4.0.0, 4.0.1
> Reporter: Paramvir Singh
> Priority: Major
> Labels: hive-4.1.0-must
>
> TPCDS queries q15, q19 are failing when
> hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled is set to
> true.
> The setup should include a cluster of at least 2 nodes. The queries pass when
> the cluster has only 1 node.
> The wrong result is also random (each run returns different wrong
> values).
> Small repro query on TPCDS dataset
> {code:java}
> select ca_zip, count(*)
> from catalog_sales_small, customer_small, customer_address_small
> where cs_bill_customer_sk = c_customer_sk
> and c_current_addr_sk = ca_address_sk
> group by ca_zip
> order by ca_zip
> limit 100;
> {code}
> If we set the following properties, we get correct results
> {code:java}
> set hive.vectorized.execution.enabled=false; -- Correct results
> {code}
> OR
> {code:java}
> set hive.auto.convert.join=false; -- Correct results
> {code}
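>
> A minimal sketch of how the failing and passing configurations could be
> compared in a single session, assuming the three tables above are already
> loaded on a cluster of at least 2 nodes. The EXPLAIN VECTORIZATION step is an
> addition, not part of the original report; it is only meant to confirm that a
> vectorized map join is actually planned before the results are compared.
> {code:sql}
> -- Run 1: configuration reported to return wrong results
> set hive.vectorized.execution.enabled=true;
> set hive.auto.convert.join=true;
> set hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled=true;
>
> -- Inspect the plan to check that the joins are vectorized map joins
> explain vectorization detail
> select ca_zip, count(*)
> from catalog_sales_small, customer_small, customer_address_small
> where cs_bill_customer_sk = c_customer_sk
> and c_current_addr_sk = ca_address_sk
> group by ca_zip
> order by ca_zip
> limit 100;
>
> select ca_zip, count(*)
> from catalog_sales_small, customer_small, customer_address_small
> where cs_bill_customer_sk = c_customer_sk
> and c_current_addr_sk = ca_address_sk
> group by ca_zip
> order by ca_zip
> limit 100;
>
> -- Run 2: baseline reported to return correct results
> set hive.vectorized.execution.enabled=false;
>
> select ca_zip, count(*)
> from catalog_sales_small, customer_small, customer_address_small
> where cs_bill_customer_sk = c_customer_sk
> and c_current_addr_sk = ca_address_sk
> group by ca_zip
> order by ca_zip
> limit 100;
> {code}
> Repeating Run 1 several times and diffing its output against Run 2 would show
> whether the counts change from run to run, as described above.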
--
This message was sent by Atlassian Jira
(v8.20.10#820010)