[ 
https://issues.apache.org/jira/browse/IMPALA-9949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-9949.
-----------------------------------
    Fix Version/s: Impala 4.0
       Resolution: Fixed

> Subqueries in select can result in rows not being returned
> ----------------------------------------------------------
>
>                 Key: IMPALA-9949
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9949
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Blocker
>              Labels: correctness, regression
>             Fix For: Impala 4.0
>
>
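> IMPALA-8954 added support for uncorrelated subqueries in the select list, but 
> some of them do not return correct results. Both of the queries below should 
> return one row per row of functional.alltypestiny, with NULL in place of the 
> subquery result, because the subquery returns 0 rows.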
> {noformat}
> [localhost.EXAMPLE.COM:21000] default> select (select min(int_col) from 
> functional.alltypes having min(int_col) < 0) from functional.alltypestiny;
> Fetched 0 row(s) in 0.16s
> [localhost.EXAMPLE.COM:21000] default> select (select min(int_col) from 
> functional.alltypes limit 0) from functional.alltypestiny;
> Fetched 0 row(s) in 0.14s
> {noformat}
> The problem is that the subquery is planned as a CROSS JOIN, which returns 
> 0 rows if the subquery returns 0 rows.
> {noformat}
> [localhost.EXAMPLE.COM:21000] default> explain select (select min(int_col) 
> from functional.alltypes having min(int_col) < 0) from 
> functional.alltypestiny;
> Query: explain select (select min(int_col) from functional.alltypes having 
> min(int_col) < 0) from functional.alltypestiny
> +-------------------------------------------------------------+
> | Explain String                                              |
> +-------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=40.00KB Threads=5 |
> | Per-Host Resource Estimates: Memory=180MB                   |
> | Codegen disabled by planner                                 |
> |                                                             |
> | PLAN-ROOT SINK                                              |
> | |                                                           |
> | 03:NESTED LOOP JOIN [CROSS JOIN, BROADCAST]                 |
> | |  row-size=4B cardinality=8                                |
> | |                                                           |
> | |--06:EXCHANGE [UNPARTITIONED]                              |
> | |  |                                                        |
> | |  00:SCAN HDFS [functional.alltypestiny]                   |
> | |     HDFS partitions=4/4 files=4 size=460B                 |
> | |     row-size=0B cardinality=8                             |
> | |                                                           |
> | 05:AGGREGATE [FINALIZE]                                     |
> | |  output: min:merge(int_col)                               |
> | |  having: min(int_col) < 0                                 |
> | |  row-size=4B cardinality=1                                |
> | |                                                           |
> | 04:EXCHANGE [UNPARTITIONED]                                 |
> | |                                                           |
> | 02:AGGREGATE                                                |
> | |  output: min(int_col)                                     |
> | |  row-size=4B cardinality=1                                |
> | |                                                           |
> | 01:SCAN HDFS [functional.alltypes]                          |
> |    HDFS partitions=24/24 files=24 size=478.45KB             |
> |    row-size=4B cardinality=7.30K                            |
> +-------------------------------------------------------------+
> Fetched 29 row(s) in 0.04s
> {noformat}
> We need to detect cases where the subquery can return 0 rows and insert a 
> left outer join instead of the CROSS JOIN.
> I did this in a patch and it fixed the issue.
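
The difference between the two join types can be reproduced outside Impala. The sketch below is only an illustration, not the actual patch: it uses Python's built-in sqlite3 module as a stand-in engine, toy tables that borrow the names alltypes and alltypestiny (without the functional. prefix), and a filter over a derived table in place of the HAVING clause. It compares the scalar subquery with a hand-written CROSS JOIN and a hand-written LEFT OUTER JOIN over the same empty subquery.

```python
import sqlite3

# Toy stand-ins for functional.alltypes and functional.alltypestiny (assumed data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE alltypes (int_col INTEGER);
    CREATE TABLE alltypestiny (id INTEGER);
    INSERT INTO alltypes VALUES (0), (1), (2);
    INSERT INTO alltypestiny VALUES (0), (1), (2), (3), (4), (5), (6), (7);
""")

# Equivalent in effect to "select min(int_col) ... having min(int_col) < 0":
# min(int_col) is 0 here, so the filter removes the only row and the
# subquery returns 0 rows.
EMPTY_SUBQUERY = (
    "SELECT m FROM (SELECT min(int_col) AS m FROM alltypes) WHERE m < 0"
)

# Reference semantics: a scalar subquery with no rows evaluates to NULL.
scalar_rows = conn.execute(
    f"SELECT ({EMPTY_SUBQUERY}) FROM alltypestiny"
).fetchall()

# A cross join with an empty input produces no rows at all.
cross_rows = conn.execute(
    f"SELECT s.m FROM alltypestiny CROSS JOIN ({EMPTY_SUBQUERY}) AS s"
).fetchall()

# A left outer join keeps every alltypestiny row and pads with NULL.
outer_rows = conn.execute(
    f"SELECT s.m FROM alltypestiny LEFT OUTER JOIN ({EMPTY_SUBQUERY}) AS s ON 1 = 1"
).fetchall()

print(len(scalar_rows), len(cross_rows), len(outer_rows))
```

In this sketch only the LEFT OUTER JOIN form matches the row count and NULL values of the scalar subquery, which is the behaviour the description above asks for.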



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
