Riza Suminto created IMPALA-13991:
-------------------------------------
Summary: Nested loop join compose wrong predicate.
Key: IMPALA-13991
URL: https://issues.apache.org/jira/browse/IMPALA-13991
Project: IMPALA
Issue Type: Bug
Components: Frontend
Reporter: Riza Suminto
Attachments: profile_9c41d9d5e3a79d15_e927d8c400000000.txt
Using functional_parquet database, run the following:
{noformat}
[localhost:21050] functional_parquet> select * from alltypes where (id in
(1000, 1001)) OR ((id in (1, 2)) AND timestamp_col <= (select timestamp_col
from alltypestiny order by timestamp_col DESC limit 1));
Query: select * from alltypes where (id in (1000, 1001)) OR ((id in (1, 2)) AND
timestamp_col <= (select timestamp_col from alltypestiny order by timestamp_col
DESC limit 1))
Query submitted at: 2025-04-24 12:06:34 (Coordinator:
http://rsuminto-22746:25000)
Query state can be monitored at:
http://rsuminto-22746:25000/query_plan?query_id=9c41d9d5e3a79d15:e927d8c400000000
+----+----------+-------------+--------------+---------+------------+---------------+------------+-----------------+------------+-------------------------------+------+-------+
| id | bool_col | tinyint_col | smallint_col | int_col | bigint_col | float_col
| double_col | date_string_col | string_col | timestamp_col
| year | month |
+----+----------+-------------+--------------+---------+------------+---------------+------------+-----------------+------------+-------------------------------+------+-------+
| 1 | false | 1 | 1 | 1 | 10 |
1.10000002384 | 10.1 | 01/01/09 | 1 | 2009-01-01 00:01:00
| 2009 | 1 |
| 2 | true | 2 | 2 | 2 | 20 |
2.20000004768 | 20.2 | 01/01/09 | 2 | 2009-01-01
00:02:00.100000000 | 2009 | 1 |
+----+----------+-------------+--------------+---------+------------+---------------+------------+-----------------+------------+-------------------------------+------+-------+
Fetched 2 row(s) in 0.11s
[localhost:21050] functional_parquet> select * from alltypes where (id in
(1000, 1001)) OR ((id in (1, 2)) AND timestamp_col <= '2009-04-01 00:01:00');
Query: select * from alltypes where (id in (1000, 1001)) OR ((id in (1, 2)) AND
timestamp_col <= '2009-04-01 00:01:00')
Query submitted at: 2025-04-24 12:06:43 (Coordinator:
http://rsuminto-22746:25000)
Query state can be monitored at:
http://rsuminto-22746:25000/query_plan?query_id=fc40ff3109deea71:cb99a7e200000000
+------+----------+-------------+--------------+---------+------------+---------------+------------+-----------------+------------+-------------------------------+------+-------+
| id | bool_col | tinyint_col | smallint_col | int_col | bigint_col |
float_col | double_col | date_string_col | string_col | timestamp_col
| year | month |
+------+----------+-------------+--------------+---------+------------+---------------+------------+-----------------+------------+-------------------------------+------+-------+
| 1 | false | 1 | 1 | 1 | 10 |
1.10000002384 | 10.1 | 01/01/09 | 1 | 2009-01-01 00:01:00
| 2009 | 1 |
| 2 | true | 2 | 2 | 2 | 20 |
2.20000004768 | 20.2 | 01/01/09 | 2 | 2009-01-01
00:02:00.100000000 | 2009 | 1 |
| 1001 | false | 1 | 1 | 1 | 10 |
1.10000002384 | 10.1 | 04/11/09 | 1 | 2009-04-11
01:41:04.500000000 | 2009 | 4 |
| 1000 | true | 0 | 0 | 0 | 0 | 0.0
| 0.0 | 04/11/09 | 0 | 2009-04-11
01:40:04.500000000 | 2009 | 4 |
+------+----------+-------------+--------------+---------+------------+---------------+------------+-----------------+------------+-------------------------------+------+-------+
Fetched 4 row(s) in 0.11s
[localhost:21050] functional_parquet> select timestamp_col from alltypestiny
order by timestamp_col DESC limit 1;
Query: select timestamp_col from alltypestiny order by timestamp_col DESC limit
1
Query submitted at: 2025-04-24 12:11:43 (Coordinator:
http://rsuminto-22746:25000)
Query state can be monitored at:
http://rsuminto-22746:25000/query_plan?query_id=054746f7ea1bdc26:3711f6f800000000
+---------------------+
| timestamp_col |
+---------------------+
| 2009-04-01 00:01:00 |
+---------------------+
Fetched 1 row(s) in 0.11s
{noformat}
NESTED LOOP JOIN should not take out the less than predicate over timestamp_col
as its own equality predicate.
{noformat}
03:NESTED LOOP JOIN [INNER JOIN, BROADCAST]
| predicates: timestamp_col <= timestamp_col, (id IN (CAST(1000 AS INT),
CAST(1001 AS INT))) OR ((id IN (CAST(1 AS INT), CAST(2 AS INT))) AND
timestamp_col IS NOT NULL)
| mem-estimate=16B mem-reservation=0B thread-reservation=0
| tuple-ids=0,2 row-size=96B cardinality=11.95K
| in pipelines: 00(GETNEXT), 02(OPEN){noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)