[ 
https://issues.apache.org/jira/browse/HIVE-27997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17806470#comment-17806470
 ] 

Butao Zhang edited comment on HIVE-27997 at 1/14/24 2:06 PM:
-------------------------------------------------------------

[~mergen] Thanks for you reporting this issue!

Looks like you are using Hive3.1.3, and i tested your sql on Hive master branch 
(or Hive4), found that everything was correct.

I think it's due to the udf *NVL* implementation bug, but in 
https://issues.apache.org/jira/browse/HIVE-20961 (From Hive4.0-alpha-1) we have 
removed the *NVL* implementation, and replaced it by *coalesce.* So in Hive3 i 
may suggest you to use *coalesce* instead, but in Hive4 you can use the name 
*NVL* because it is actually an alias to {*}coalesce{*}.

 

BTW, if *coalesce* still can not fix you problem, then maybe it's not the udf 
implementation bug. You could try to search some related nvl or coalece join 
bug in Apache Hive Jira issues, and i think there are some fixes already exist 
on Hive4&master.

Thanks.


was (Author: zhangbutao):
[~mergen] Thanks for you reporting this issue!

Looks like you are using Hive3.1.3, and i tested your sql on Hive master branch 
(or Hive4), found that everything was correct.

I think it's due to the udf *NVL* implementation bug, but in 
https://issues.apache.org/jira/browse/HIVE-20961 (From Hive4.0-alpha-1) we have 
removed the *NVL* implementation, and replaced it by *coalesce.* So in Hive3 i 
may suggest you to use *coalesce* instead, but in Hive4 you can use the name 
*NVL* because it is actually an alias to {*}coalesce{*}.

Thanks.

> Incorrect result for Hive join query with NVL and Map Join
> ----------------------------------------------------------
>
>                 Key: HIVE-27997
>                 URL: https://issues.apache.org/jira/browse/HIVE-27997
>             Project: Hive
>          Issue Type: Bug
>          Components: Operators
>    Affects Versions: 3.1.3
>            Reporter: Mergen
>            Priority: Major
>
> Hive returns incorrect result if there is NVL() in an ON clause with Map Join 
> enabled.
>  
> STEPS TO REPRODUCE:
> {code:java}
> Step 1: Create a table test_nvl
> create table test_nvl(a string);
> Step 2: Insert null and non-null data into table test_nvl
> insert into test_nvl values ('x'), ('y'), (null);
> select * from test_nvl;
> +-------------+
> | test_nvl.a  |
> +-------------+
> | x           |
> | y           |
> | NULL        |
> +-------------+
> Step 3 : Execute the following query
> select x.a, y.a
> from test_nvl x
> left join test_nvl y
> on nvl(x.a, '') = nvl(y.a, '');{code}
>  
> EXPECTED RESULT:
> {code:java}
> +-------+-------+
> | x.a   | y.a   |
> +-------+-------+
> | x     | x     |
> | y     | y     |
> | NULL  | NULL  |
> +-------+-------+ {code}
>  
> ACTUAL RESULT:
> {code:java}
> +-------+------+
> | x.a   | y.a  |
> +-------+------+
> | x     | x    |
> | y     | x    |
> | NULL  | x    |
> +-------+------+{code}
> (Obviously 'y' != 'x' and NULL != 'x' so they should not be in the same line)
>  
> The query works fine with Map Join disabled:
> {code:java}
> -- Using Merge Join instead.
> set hive.auto.convert.join=false;
> select x.a, y.a
> from test_nvl x
> left join test_nvl y
> on nvl(x.a, '') = nvl(y.a, '');
> +-------+-------+
> | x.a   | y.a   |
> +-------+-------+
> | NULL  | NULL  |
> | x     | x     |
> | y     | y     |
> +-------+-------+ {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to