[jira] [Updated] (HIVE-23667) Incorrect output with option hive.auto.convert.join=fasle

gaozhan ding (Jira) Mon, 19 Oct 2020 05:09:30 -0700


     [ 
https://issues.apache.org/jira/browse/HIVE-23667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


gaozhan ding updated HIVE-23667:
--------------------------------
    Description: 
We use hive with version 3.1.0 with tez engine 0.9.1.3

I encountered an error when executing a hive SQL. This SQL is as follows
{code:java}
set mapreduce.job.queuename=root.xxx;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
set hive.exec.max.dynamic.partitions.pernode=10000;
set hive.exec.max.dynamic.partitions=10000;
set hive.fileformat.check=false;
set mapred.reduce.tasks=50;
set hive.auto.convert.join=true;
use xxx;

select count(*) from   A  join B on  B.b=A.a;{code}
with the output.
{code:java}
+----------+ | _c0 | +----------+ | 4954736 | +----------+
{code}
But when the hive.auto.convert.join option is set to false，the utput is not as 
expected。

The SQL is as follows
{code:java}
set mapreduce.job.queuename=root.xxx;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
set hive.exec.max.dynamic.partitions.pernode=10000;
set hive.exec.max.dynamic.partitions=10000;
set hive.fileformat.check=false;  
set mapred.reduce.tasks=50;
set hive.auto.convert.join=false; //changed
use xxx;

select count(*) from A join B on B.b=A.a;{code}
with output：
{code:java}
+------+ | _c0 | +------+ | 0 | +------+
{code}
Beside，both tables participating in the join are partition tables.

Especially，if the option mapred.reduce.tasks=50 was not set，all above the sql 
output expected results.

We just upgraded hive from 1.2 to 3.1.0, and we found that these problems only 
occurred in the old hive table.

  was:
We use hive with version 3.1.0 with tez engine 0.9.1.3

I encountered an error when executing a hive SQL. This SQL is as follows
{code:java}
set mapreduce.job.queuename=root.xxx;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
set hive.exec.max.dynamic.partitions.pernode=10000;
set hive.exec.max.dynamic.partitions=10000;
set hive.fileformat.check=false;
set mapred.reduce.tasks=50;
set hive.auto.convert.join=true;
use xxx;

select count(*) from   230_dim_site  join dw_fact_inverter_detail on  
dw_fact_inverter_detail.site=230_dim_site.id;{code}
with the output.
{code:java}
+----------+ | _c0 | +----------+ | 4954736 | +----------+
{code}
But when the hive.auto.convert.join option is set to false，the utput is not as 
expected。

The SQL is as follows
{code:java}
set mapreduce.job.queuename=root.xxx;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
set hive.exec.max.dynamic.partitions.pernode=10000;
set hive.exec.max.dynamic.partitions=10000;
set hive.fileformat.check=false;  
set mapred.reduce.tasks=50;
set hive.auto.convert.join=false; //changed
use xxx;

select count(*) from   230_dim_site  join dw_fact_inverter_detail on  
dw_fact_inverter_detail.site=230_dim_site.id;{code}
with output：
{code:java}
+------+ | _c0 | +------+ | 0 | +------+
{code}
Beside，both tables participating in the join are partition tables.

Especially，if the option mapred.reduce.tasks=50 was not set，all above the sql 
output expected results.

We just upgraded hive from 1.2 to 3.1.0, and we found that these problems only 
occurred in the old hive table.


> Incorrect output with option hive.auto.convert.join=fasle
> ---------------------------------------------------------
>
>                 Key: HIVE-23667
>                 URL: https://issues.apache.org/jira/browse/HIVE-23667
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.1.0
>            Reporter: gaozhan ding
>            Priority: Critical
>
> We use hive with version 3.1.0 with tez engine 0.9.1.3
> I encountered an error when executing a hive SQL. This SQL is as follows
> {code:java}
> set mapreduce.job.queuename=root.xxx;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.exec.dynamic.partition=true;
> set hive.exec.max.dynamic.partitions.pernode=10000;
> set hive.exec.max.dynamic.partitions=10000;
> set hive.fileformat.check=false;
> set mapred.reduce.tasks=50;
> set hive.auto.convert.join=true;
> use xxx;
> select count(*) from   A  join B on  B.b=A.a;{code}
> with the output.
> {code:java}
> +----------+ | _c0 | +----------+ | 4954736 | +----------+
> {code}
> But when the hive.auto.convert.join option is set to false，the utput is not 
> as expected。
> The SQL is as follows
> {code:java}
> set mapreduce.job.queuename=root.xxx;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.exec.dynamic.partition=true;
> set hive.exec.max.dynamic.partitions.pernode=10000;
> set hive.exec.max.dynamic.partitions=10000;
> set hive.fileformat.check=false;  
> set mapred.reduce.tasks=50;
> set hive.auto.convert.join=false; //changed
> use xxx;
> select count(*) from A join B on B.b=A.a;{code}
> with output：
> {code:java}
> +------+ | _c0 | +------+ | 0 | +------+
> {code}
> Beside，both tables participating in the join are partition tables.
> Especially，if the option mapred.reduce.tasks=50 was not set，all above the sql 
> output expected results.
> We just upgraded hive from 1.2 to 3.1.0, and we found that these problems 
> only occurred in the old hive table.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (HIVE-23667) Incorrect output with option hive.auto.convert.join=fasle

Reply via email to