[ 
https://issues.apache.org/jira/browse/IMPALA-13991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Rawat updated IMPALA-13991:
------------------------------------
    Priority: Critical  (was: Major)

> Nested loop join compose wrong predicate.
> -----------------------------------------
>
>                 Key: IMPALA-13991
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13991
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Riza Suminto
>            Assignee: Riza Suminto
>            Priority: Critical
>             Fix For: Impala 5.0.0
>
>         Attachments: profile_9c41d9d5e3a79d15_e927d8c400000000.txt
>
>
> Using functional_parquet database, run the following:
> {noformat}
> [localhost:21050] functional_parquet> select * from alltypes where (id in 
> (1000, 1001)) OR ((id in (1, 2)) AND timestamp_col <= (select timestamp_col 
> from alltypestiny order by timestamp_col DESC limit 1));
> Query: select * from alltypes where (id in (1000, 1001)) OR ((id in (1, 2)) 
> AND timestamp_col <= (select timestamp_col from alltypestiny order by 
> timestamp_col DESC limit 1))
> Query submitted at: 2025-04-24 12:06:34 (Coordinator: 
> http://rsuminto-22746:25000)
> Query state can be monitored at: 
> http://rsuminto-22746:25000/query_plan?query_id=9c41d9d5e3a79d15:e927d8c400000000
> +----+----------+-------------+--------------+---------+------------+---------------+------------+-----------------+------------+-------------------------------+------+-------+
> | id | bool_col | tinyint_col | smallint_col | int_col | bigint_col | 
> float_col     | double_col | date_string_col | string_col | timestamp_col     
>             | year | month |
> +----+----------+-------------+--------------+---------+------------+---------------+------------+-----------------+------------+-------------------------------+------+-------+
> | 1  | false    | 1           | 1            | 1       | 10         | 
> 1.10000002384 | 10.1       | 01/01/09        | 1          | 2009-01-01 
> 00:01:00           | 2009 | 1     |
> | 2  | true     | 2           | 2            | 2       | 20         | 
> 2.20000004768 | 20.2       | 01/01/09        | 2          | 2009-01-01 
> 00:02:00.100000000 | 2009 | 1     |
> +----+----------+-------------+--------------+---------+------------+---------------+------------+-----------------+------------+-------------------------------+------+-------+
> Fetched 2 row(s) in 0.11s
> [localhost:21050] functional_parquet> select * from alltypes where (id in 
> (1000, 1001)) OR ((id in (1, 2)) AND timestamp_col <= '2009-04-01 00:01:00');
> Query: select * from alltypes where (id in (1000, 1001)) OR ((id in (1, 2)) 
> AND timestamp_col <= '2009-04-01 00:01:00')
> Query submitted at: 2025-04-24 12:06:43 (Coordinator: 
> http://rsuminto-22746:25000)
> Query state can be monitored at: 
> http://rsuminto-22746:25000/query_plan?query_id=fc40ff3109deea71:cb99a7e200000000
> +------+----------+-------------+--------------+---------+------------+---------------+------------+-----------------+------------+-------------------------------+------+-------+
> | id   | bool_col | tinyint_col | smallint_col | int_col | bigint_col | 
> float_col     | double_col | date_string_col | string_col | timestamp_col     
>             | year | month |
> +------+----------+-------------+--------------+---------+------------+---------------+------------+-----------------+------------+-------------------------------+------+-------+
> | 1    | false    | 1           | 1            | 1       | 10         | 
> 1.10000002384 | 10.1       | 01/01/09        | 1          | 2009-01-01 
> 00:01:00           | 2009 | 1     |
> | 2    | true     | 2           | 2            | 2       | 20         | 
> 2.20000004768 | 20.2       | 01/01/09        | 2          | 2009-01-01 
> 00:02:00.100000000 | 2009 | 1     |
> | 1001 | false    | 1           | 1            | 1       | 10         | 
> 1.10000002384 | 10.1       | 04/11/09        | 1          | 2009-04-11 
> 01:41:04.500000000 | 2009 | 4     |
> | 1000 | true     | 0           | 0            | 0       | 0          | 0.0   
>         | 0.0        | 04/11/09        | 0          | 2009-04-11 
> 01:40:04.500000000 | 2009 | 4     |
> +------+----------+-------------+--------------+---------+------------+---------------+------------+-----------------+------------+-------------------------------+------+-------+
> Fetched 4 row(s) in 0.11s
> [localhost:21050] functional_parquet> select timestamp_col from alltypestiny 
> order by timestamp_col DESC limit 1;
> Query: select timestamp_col from alltypestiny order by timestamp_col DESC 
> limit 1
> Query submitted at: 2025-04-24 12:11:43 (Coordinator: 
> http://rsuminto-22746:25000)
> Query state can be monitored at: 
> http://rsuminto-22746:25000/query_plan?query_id=054746f7ea1bdc26:3711f6f800000000
> +---------------------+
> | timestamp_col       |
> +---------------------+
> | 2009-04-01 00:01:00 |
> +---------------------+
> Fetched 1 row(s) in 0.11s
> {noformat}
>  
> NESTED LOOP JOIN should not take out the less than predicate over 
> timestamp_col as its own equality predicate.
> {noformat}
> 03:NESTED LOOP JOIN [INNER JOIN, BROADCAST]
> |  predicates: timestamp_col <= timestamp_col, (id IN (CAST(1000 AS INT), 
> CAST(1001 AS INT))) OR ((id IN (CAST(1 AS INT), CAST(2 AS INT))) AND 
> timestamp_col IS NOT NULL)
> |  mem-estimate=16B mem-reservation=0B thread-reservation=0
> |  tuple-ids=0,2 row-size=96B cardinality=11.95K
> |  in pipelines: 00(GETNEXT), 02(OPEN){noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to