[ 
https://issues.apache.org/jira/browse/HIVE-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035656#comment-16035656
 ] 

Pengcheng Xiong commented on HIVE-15758:
----------------------------------------

Here is my 2 cents. I use a CTE to describe how i want to rewrite. The basic 
idea is to give it a row number. In the first join, we use left join to keep 
all the row_nums. In the second join, we are trying to match exactly the same 
"row" as we want.
{code}
with part_rewrite as
(select *, ROW_NUMBER() over () as row_num from part)
select p1.p_size, p1.p_type from part_rewrite p1,
(select count(pp.p_size) as cnt, row_num from part_rewrite left outer join part 
pp on part_rewrite.p_type <> pp.p_type group by row_num)p2
where p1.p_size<>p2.cnt and p1.row_num=p2.row_num;
{code}

Here are the rewrite test results:
{code}
horton=# select * from part;
 p_size | p_type
--------+--------
      1 |      1
      1 |
      1 |      2
      1 |      3
      2 |      3
      2 |     32
    233 |      2
        |      2
        |
        |      3
    233 |      2
(11 rows)

horton=# select * from part where p_size <> (select count(p_size) from part pp 
where part.p_type <> pp.p_type);
 p_size | p_type
--------+--------
      1 |      1
      1 |
      1 |      2
      1 |      3
      2 |      3
      2 |     32
    233 |      2
    233 |      2
(8 rows)

horton=# with part_rewrite as
horton-# (select *, ROW_NUMBER() over () as row_num from part)
horton-# select p1.p_size, p1.p_type from part_rewrite p1,
horton-# (select count(pp.p_size) as cnt, row_num from part_rewrite left outer 
join part pp on part_rewrite.p_type <> pp.p_type group by row_num)p2
horton-# where p1.p_size<>p2.cnt and p1.row_num=p2.row_num;
 p_size | p_type
--------+--------
      1 |      1
      1 |
      1 |      2
      1 |      3
      2 |      3
      2 |     32
    233 |      2
    233 |      2
(8 rows)
{code}


> Allow correlated scalar subqueries with aggregates which has non-equi join 
> predicates
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-15758
>                 URL: https://issues.apache.org/jira/browse/HIVE-15758
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Logical Optimizer
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>              Labels: sub-query
>
> Queries such as 
> {code} select * from part where p_size <> (select count(p_size) from part pp 
> where part.p_type <> pp.p_type); {code} are currently not allowed since HIVE 
> doesn't know how to rewrite such queries to preserve the correctness for 
> cases when there is zero row



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to