[ https://issues.apache.org/jira/browse/HIVE-29176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas Rebele updated HIVE-29176: --------------------------------- Description: Given the following script: {code:java} --! qt:dataset:src --! qt:dataset:part -- SORT_QUERY_RESULTS create table t1_n55 as select cast(key as int) key, value from src where key <= 10; select * from t1_n55 sort by key; create table t2_n33 as select cast(2*key as int) key, value from t1_n55 union select 10 key, null value; select * from t2_n33 sort by key; create table t3_n12 as select * from (select * from t1_n55 union all select * from t2_n33) b; select * from t3_n12 sort by key, value; SET hive.auto.convert.anti.join=true; select "\nantijoin=true\n"; explain cbo select a.key, a.value from t1_n55 a left join t2_n33 b on a.key=b.key where b.value is null; select a.key, a.value from t1_n55 a left join t2_n33 b on a.key=b.key where b.value is null; SET hive.auto.convert.anti.join=false; select "\nantijoin=false\n"; explain cbo select a.key, a.value from t1_n55 a left join t2_n33 b on a.key=b.key where b.value is null; select a.key, a.value from t1_n55 a left join t2_n33 b on a.key=b.key where b.value is null; {code} There's a difference when executing the query with/without anti-join: Result with antijoin: {code:java} 2 val_2 5 val_5 5 val_5 5 val_5 9 val_9 {code} Result without antijoin: {code:java} 10 val_10 2 val_2 5 val_5 5 val_5 5 val_5 9 val_9 {code} was: Using tables as defined in `ql/src/test/queries/clientpositive/antijoin.q`, except changing the definition of t2_n33 to: {code:java} create table t2_n33 as select cast(2*key as int) key, value from t1_n55 union select 10 key, null value; {code} There's a difference when executing the query {code:java} select a.key, a.value, ''+a.value+b.value from t1_n55 a left join t2_n33 b on a.key=b.key where b.value is null; {code} With HiveJoin, the result contains a row {{{}10 val_10 NULL{}}}, with HiveAntiJoin the row is missing. > AntiJoin may not be introduced when the IS NULL filter is applied to a > nullable column > -------------------------------------------------------------------------------------- > > Key: HIVE-29176 > URL: https://issues.apache.org/jira/browse/HIVE-29176 > Project: Hive > Issue Type: Bug > Reporter: Thomas Rebele > Assignee: Thomas Rebele > Priority: Major > > Given the following script: > {code:java} > --! qt:dataset:src > --! qt:dataset:part > -- SORT_QUERY_RESULTS > create table t1_n55 as select cast(key as int) key, value from src where key > <= 10; > select * from t1_n55 sort by key; > create table t2_n33 as select cast(2*key as int) key, value from t1_n55 union > select 10 key, null value; > select * from t2_n33 sort by key; > create table t3_n12 as select * from (select * from t1_n55 union all select * > from t2_n33) b; > select * from t3_n12 sort by key, value; > SET hive.auto.convert.anti.join=true; > select "\nantijoin=true\n"; > explain cbo select a.key, a.value from t1_n55 a left join t2_n33 b on > a.key=b.key where b.value is null; > select a.key, a.value from t1_n55 a left join t2_n33 b on a.key=b.key where > b.value is null; > SET hive.auto.convert.anti.join=false; > select "\nantijoin=false\n"; > explain cbo select a.key, a.value from t1_n55 a left join t2_n33 b on > a.key=b.key where b.value is null; > select a.key, a.value from t1_n55 a left join t2_n33 b on a.key=b.key where > b.value is null; > {code} > There's a difference when executing the query with/without anti-join: > Result with antijoin: > {code:java} > 2 val_2 > 5 val_5 > 5 val_5 > 5 val_5 > 9 val_9 > {code} > Result without antijoin: > {code:java} > 10 val_10 > 2 val_2 > 5 val_5 > 5 val_5 > 5 val_5 > 9 val_9 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)