[ 
https://issues.apache.org/jira/browse/HIVE-25822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-25822:
------------------------------------
    Description: 
needed
* outer join
* on condition has at least one condition for one side of the join
* in a single reducer:
** a right hand side only row outputted right before
** >=2 rows on LHS and 1 on RHS matching in the join keys but the first LHS 
doesn't satisfies the filter condition
** second LHS row with good filter condition

{code}
with
t_y as (select col1 as id,col2 as s from (VALUES(0,'a'),(1,'y')) as c),
t_xy as (select col1 as id,col2 as s from (VALUES(1,'x'),(1,'y')) as c) 
select * from t_xy l full outer join t_y r on (l.id=r.id and l.s='y');
{code}

null,null,1,y is a false positive result
{code}
+-------+-------+-------+-------+
| l.id  |  l.s  | r.id  |  r.s  |
+-------+-------+-------+-------+
| NULL  | NULL  | 0     | a     |
| 1     | x     | NULL  | NULL  |
| NULL  | NULL  | 1     | y     |
| 1     | y     | 1     | y     |
+-------+-------+-------+-------+
{code}

  was:
needed
* outer join
* on condition has at least one condition for one side of the join
* in a single reducer:
** a right hand side only row outputted right before
** >=2 rows on LHS and 1 on RHS matching in the join keys but the first LHS 
doesn't satisfies the filter condition
** second LHS row with good filter condition

{code}
with
t_y as (select col1 as id,col2 as s from (VALUES(0,'a'),(1,'y')) as c),
t_xy as (select col1 as id,col2 as s from (VALUES(1,'x'),(1,'y')) as c) 
select * from t_xy l full outer join t_y r on (l.id=r.id and l.s='y');
{code}

null,null,1,y is an unexpected result
{code}
+-------+-------+-------+-------+
| l.id  |  l.s  | r.id  |  r.s  |
+-------+-------+-------+-------+
| NULL  | NULL  | 0     | a     |
| 1     | x     | NULL  | NULL  |
| NULL  | NULL  | 1     | y     |
| 1     | y     | 1     | y     |
+-------+-------+-------+-------+
{code}

        Summary: False positive result rows may be outputter in case outer join 
has conditions only affecting one side  (was: Unexpected result rows in case of 
outer join contains conditions only affecting one side)

bq. Why we are saying unexpected results instead of wrong? I don't thing there 
are many ways to interpret the query.

I wanted to highlight that this is not taking away expected results - and thus 
its a false positive error and not a false negative.



> False positive result rows may be outputter in case outer join has conditions 
> only affecting one side
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-25822
>                 URL: https://issues.apache.org/jira/browse/HIVE-25822
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> needed
> * outer join
> * on condition has at least one condition for one side of the join
> * in a single reducer:
> ** a right hand side only row outputted right before
> ** >=2 rows on LHS and 1 on RHS matching in the join keys but the first LHS 
> doesn't satisfies the filter condition
> ** second LHS row with good filter condition
> {code}
> with
> t_y as (select col1 as id,col2 as s from (VALUES(0,'a'),(1,'y')) as c),
> t_xy as (select col1 as id,col2 as s from (VALUES(1,'x'),(1,'y')) as c) 
> select * from t_xy l full outer join t_y r on (l.id=r.id and l.s='y');
> {code}
> null,null,1,y is a false positive result
> {code}
> +-------+-------+-------+-------+
> | l.id  |  l.s  | r.id  |  r.s  |
> +-------+-------+-------+-------+
> | NULL  | NULL  | 0     | a     |
> | 1     | x     | NULL  | NULL  |
> | NULL  | NULL  | 1     | y     |
> | 1     | y     | 1     | y     |
> +-------+-------+-------+-------+
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to