[
https://issues.apache.org/jira/browse/IMPALA-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278346#comment-17278346
]
Tim Armstrong commented on IMPALA-4373:
---------------------------------------
The fix might be similar to IMPALA-10382 that [~xqhe] fixed recently.
> Wrong results with correlated WHERE-clause subquery inside a NULL-checking
> conditional function.
> ------------------------------------------------------------------------------------------------
>
> Key: IMPALA-4373
> URL: https://issues.apache.org/jira/browse/IMPALA-4373
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 2.5.0, Impala 2.4.0, Impala 2.6.0, Impala 2.7.0,
> Impala 2.8.0, Impala 2.9.0
> Reporter: Alexander Behm
> Assignee: Tim Armstrong
> Priority: Critical
> Labels: correctness
>
> Impala may generate an incorrect plan for queries that have a correlated
> scalar subquery as a parameter to a NULL-checking conditional function like
> ISNULL().
> Example query and incorrect plan:
> {code}
> select t1.int_col
> from functional.alltypessmall as t1
> where t1.int_col >= isnull
> (
> (
> SELECT
> MAX(t2.bigint_col)
> FROM
> functional.alltypestiny AS t2
> WHERE
> t1.id = t2.id + 10000
> ),
> 0
> )
> Fetched 0 row(s) in 1.09s
> Single-node plan:
> +-----------------------------------------------------------------------+
> | Explain String |
> +-----------------------------------------------------------------------+
> | Estimated Per-Host Requirements: Memory=0B VCores=0 |
> | |
> | PLAN-ROOT SINK |
> | | |
> | 03:HASH JOIN [LEFT SEMI JOIN] |
> | | hash predicates: t1.id = t2.id + 10000 |
> | | other join predicates: t1.int_col >= isnull(max(t2.bigint_col), 0) |
> | | runtime filters: RF000 <- t2.id + 10000 |
> | | |
> | |--02:AGGREGATE [FINALIZE] |
> | | | output: max(t2.bigint_col) |
> | | | group by: t2.id |
> | | | |
> | | 01:SCAN HDFS [functional.alltypestiny t2] |
> | | partitions=4/4 files=4 size=460B |
> | | |
> | 00:SCAN HDFS [functional.alltypessmall t1] |
> | partitions=4/4 files=4 size=6.32KB |
> | runtime filters: RF000 -> t1.id |
> +-----------------------------------------------------------------------+
> {code}
> The query returns an empty result set but instead should return all rows from
> t1 because all invocations of the subquery return NULL, and all rows from t1
> satisfy "t1.int_col >= 0".
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]