gaozhan ding created HIVE-23667: ----------------------------------- Summary: Incorrect output with option hive.auto.convert.join=fasle Key: HIVE-23667 URL: https://issues.apache.org/jira/browse/HIVE-23667 Project: Hive Issue Type: Bug Reporter: gaozhan ding Fix For: 3.1.0
We use hive with version 3.1.0 with tez engine 0.9.1.3 I encountered an error when executing a hive SQL. This SQL is as follows {code:java} set mapreduce.job.queuename=root.xxx; set hive.exec.dynamic.partition.mode=nonstrict; set hive.exec.dynamic.partition=true; set hive.exec.max.dynamic.partitions.pernode=10000; set hive.exec.max.dynamic.partitions=10000; set hive.fileformat.check=false; set mapred.reduce.tasks=50; set hive.auto.convert.join=true; use xxx; select count(*) from 230_dim_site join dw_fact_inverter_detail on dw_fact_inverter_detail.site=230_dim_site.id;{code} with the output. {code:java} +----------+ | _c0 | +----------+ | 4954736 | +----------+ {code} But when the hive.auto.convert.join option is set to false,the utput is not as expected。 The SQL is as follows {code:java} set mapreduce.job.queuename=root.xxx; set hive.exec.dynamic.partition.mode=nonstrict; set hive.exec.dynamic.partition=true; set hive.exec.max.dynamic.partitions.pernode=10000; set hive.exec.max.dynamic.partitions=10000; set hive.fileformat.check=false; set mapred.reduce.tasks=50; set hive.auto.convert.join=false; //changed use xxx; select count(*) from 230_dim_site join dw_fact_inverter_detail on dw_fact_inverter_detail.site=230_dim_site.id;{code} with output: {code:java} +------+ | _c0 | +------+ | 0 | +------+ {code} Beside,both tables participating in the join are partition tables. Especially,if the option mapred.reduce.tasks=50 was not set,all above the sql output expected results. We just upgraded hive from 1.2 to 3.1.0, and we found that these problems only occurred in the old hive table. -- This message was sent by Atlassian Jira (v8.3.4#803005)