Luis E Martinez-Poblete created IMPALA-7957:
-----------------------------------------------
Summary: UNION ALL query returns incorrect results
Key: IMPALA-7957
URL: https://issues.apache.org/jira/browse/IMPALA-7957
Project: IMPALA
Issue Type: Bug
Components: Frontend
Affects Versions: Impala 2.12.0
Reporter: Luis E Martinez-Poblete
Synopsis:
=========
UNION ALL query returns incorrect results
Problem:
========
Customer reported a UNION ALL query returning incorrect results. The UNION ALL
query has 2 legs, but Impala is only returning information from one leg.
Issue can be reproduced in the latest version of Impala. Below is the
reproduction case:
create table mytest_t (c1 timestamp, c2 timestamp, c3 int, c4 int);
insert into mytest_t values (now(), ADDDATE (now(),1), 1,1);
insert into mytest_t values (now(), ADDDATE (now(),1), 2,2);
insert into mytest_t values (now(), ADDDATE (now(),1), 3,3);
SELECT t.c1
FROM
(SELECT c1, c2
FROM mytest_t) t
LEFT JOIN
(SELECT c1, c2
FROM mytest_t
WHERE c2 = c1) t2 ON (t.c2 = t2.c2)
UNION ALL
VALUES (NULL)
The above query produces the following execution plan:
+------------------------------------------------------------------------------------+
| Explain String |
+------------------------------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=34.02MB Threads=5 |
| Per-Host Resource Estimates: Memory=2.06GB |
| WARNING: The following tables are missing relevant table and/or column
statistics. |
| default.mytest_t |
| |
| PLAN-ROOT SINK |
| | |
| 06:EXCHANGE [UNPARTITIONED] |
| | |
| 00:UNION |
| | constant-operands=1 |
| | |
| 04:SELECT |
| | predicates: default.mytest_t.c1 = default.mytest_t.c2 |
| | |
| 03:HASH JOIN [LEFT OUTER JOIN, BROADCAST] |
| | hash predicates: c2 = c2 |
| | |
| |--05:EXCHANGE [BROADCAST] |
| | | |
| | 02:SCAN HDFS [default.mytest_t] |
| | partitions=1/1 files=3 size=192B |
| | predicates: c2 = c1 |
| | |
| 01:SCAN HDFS [default.mytest_t] |
| partitions=1/1 files=3 size=192B |
+------------------------------------------------------------------------------------+
The issue is in operator 4:
| 04:SELECT |
| | predicates: default.mytest_t.c1 = default.mytest_t.c2 |
It's definitely a bug with predicate placement - that c1 = c2 predicate
shouldn't be evaluated outside the right branch of the LEFT JOIN.
Thanks,
Luis Martinez.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)