[ 
https://issues.apache.org/jira/browse/IMPALA-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869640#comment-16869640
 ] 

Tim Armstrong edited comment on IMPALA-8276 at 6/21/19 4:11 PM:
----------------------------------------------------------------

Yeah, Quanlong was able to fix two issues in this area (he's a brave person). 

Looking at IMPALA-8386, I'm pretty sure this is a dupe of that - the query 
shapes are the same, except with an inline view vs a regular view. And both 
result in a bogus x = x predicate in the "other predicates" of a join.

{noformat}
| 05:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED]                |
| |  hash predicates: c.a_id = a_id                           |
| |  other predicates: sum(amount) = sum(amount)      <---------- Wrong inferred predicate, which incorrectly rejects nulls
| |  runtime filters: RF000 <- a_id                           |
{noformat}
versus
{noformat}
505:HASH JOIN [LEFT OUTER JOIN, BROADCAST]
...
|  other predicates: t.date1 = t.date1,
                     t.date2 = t.date2,
                     t.date3 = t.date3
{noformat}

I don't feel 100% confident in closing it until we've confirmed the fix, though. 
It's just too subtle, and there may be some slight difference.
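
To make the null-rejection concrete, here is a minimal sketch of why an x = x predicate drops rows after an outer join. It uses SQLite through Python purely as a stand-in for standard SQL three-valued logic, not Impala itself; the tables a and b and their contents are hypothetical, and putting the predicate in a WHERE clause only approximates what the planner does when it places it in the join's "other predicates".

```python
import sqlite3

# In-memory SQLite database; a stand-in for illustration, not Impala.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical tables standing in for the two sides of the outer join.
cur.execute("CREATE TABLE a (p INTEGER)")
cur.execute("CREATE TABLE b (q INTEGER, z TEXT)")
cur.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,)])
cur.executemany("INSERT INTO b VALUES (?, ?)", [(1, "2019-06-21")])

# NULL = NULL evaluates to NULL rather than TRUE (Python sees it as None).
null_eq = cur.execute("SELECT NULL = NULL").fetchone()[0]

# A plain left outer join keeps every row of a, matched or not.
expected = cur.execute(
    "SELECT count(*) FROM a LEFT JOIN b ON a.p = b.q"
).fetchone()[0]

# Applying the bogus predicate to the join output discards the
# null-extended rows, because b.z = b.z is NULL for them.
with_bogus = cur.execute(
    "SELECT count(*) FROM a LEFT JOIN b ON a.p = b.q WHERE b.z = b.z"
).fetchone()[0]

print(null_eq, expected, with_bogus)
conn.close()
```

The unmatched rows of a survive the outer join only as null-extended rows, so any self-equality predicate on a column from the nullable side filters them out, which would be consistent with the lower count(*) in the report.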


was (Author: tarmstrong):
Yeah, Quanlong was able to fix two issues in this area (he's a brave person). 

Looking at IMPALA-8386, I'm pretty sure this is a dupe of that - the query 
shapes are the same, except with an inline view vs a regular view. And both 
result in a bogus x = x predicate in the "other predicates" of a join.

{noformat}
| 05:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED]                |
| |  hash predicates: c.a_id = a_id                           |
| |  other predicates: sum(amount) = sum(amount)      <---------- Wrong inferred predicate, which incorrectly rejects nulls
| |  runtime filters: RF000 <- a_id                           |
{noformat}
versus
{noformat}
505:HASH JOIN [LEFT OUTER JOIN, BROADCAST]
|  hash predicates: demand_source_line_id = oola.line_id, io_name = ood.organization_name
|  other predicates: ooha.ordered_date = ooha.ordered_date,
                     oola.promise_date = oola.promise_date,
                     oola.request_date = oola.request_date
{noformat}

I don't feel 100% confident in closing it until we've confirmed the fix, though. 
It's just too subtle, and there may be some slight difference.

> Self equal to self predicate "x = x" generated by Impala caused incorrect 
> query result
> --------------------------------------------------------------------------------------
>
>                 Key: IMPALA-8276
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8276
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 3.0
>            Reporter: Yongjun Zhang
>            Priority: Blocker
>              Labels: correctness
>
> Reported with cdh5.12.1: a bogus "self equal to self" predicate of the form 
> "x = x" is generated by Impala and causes incorrect query results, because such 
> a predicate evaluates to NULL, and therefore rejects the row, when x is null.
> It was observed that a {{count(*)}} query returned fewer rows than a CTAS 
> query, though the query body is the same for both, because the former 
> generated the bogus predicate and the latter did not.
> For example,
> {code:sql}
> select count(*) from 
> (select a.*, b.x, b.y, b.z_dt from view1 a left join view2 b on a.p = b.q) a
> {code}
> returned fewer rows than
> {code:sql}
> create table abc as 
> select a.*, b.x, b.y, b.z_dt from view1 a left join view2 b on a.p = b.q
> {code}
>  because the predicate {{a.z = a.z_dt}} was created (for reasons not yet 
> understood; note that b.z_dt is an alias of b.z). It shows up as 
> "table1.z = table1.z" in the query plan in the Impala query profile because a 
> and b are aliases of view1 and view2, both of which are deeply nested views 
> that involve table1. 
> Though in cdh5.12.1 the select and the count queries returned different results 
> in the initial case, an attempted reproduction shows that both queries get 
> bogus predicates. cdh5.15.2 has the same problem. I was not able to try the 
> most recent master branch of Impala due to metadata incompatibility.
>  


