Alexander Pivovarov created HIVE-10841: ------------------------------------------
Summary: [WHERE col is not null] does not work for large queries Key: HIVE-10841 URL: https://issues.apache.org/jira/browse/HIVE-10841 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Alexander Pivovarov The result from the following SELCT query is 3 rows but it should be 1 row. I checked it in MySQL - it returned 1 row. To reproduce the issue in Hive 1. prepare tables {code} drop table if exists L; drop table if exists LA; drop table if exists FR; drop table if exists A; drop table if exists PI; drop table if exists acct; create table L as select 4436 id; create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id; create table FR as select 4436 loan_id; create table A as select 4748 id; create table PI as select 4415 id; create table acct as select 4748 aid, 10 acc_n, 122 brn; insert into table acct values(4748, null, null); insert into table acct values(4748, null, null); {code} 2. run SELECT query {code} select acct.ACC_N, acct.brn FROM L JOIN LA ON L.id = LA.loan_id JOIN FR ON L.id = FR.loan_id JOIN A ON LA.aid = A.id JOIN PI ON PI.id = LA.pi_id JOIN acct ON A.id = acct.aid WHERE L.id = 4436 and acct.brn is not null; {code} the result is 3 rows {code} 10 122 NULL NULL NULL NULL {code} but it should be 1 row {code} 10 122 {code} 3. workaround is to put "acct.brn is not null" to join condition {code} select acct.ACC_N, acct.brn FROM L JOIN LA ON L.id = LA.loan_id JOIN FR ON L.id = FR.loan_id JOIN A ON LA.aid = A.id JOIN PI ON PI.id = LA.pi_id JOIN acct ON A.id = acct.aid and acct.brn is not null WHERE L.id = 4436; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)