[
https://issues.apache.org/jira/browse/IMPALA-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yongjun Zhang updated IMPALA-8276:
----------------------------------
Description:
Reported with cdh5.12.1: Impala generates a bogus "self equals self" predicate
of the form "x = x", which causes incorrect query results because such a
predicate evaluates to NULL, and so is not satisfied, for rows where x is NULL.
It was observed that a {{count(*)}} query reported fewer rows than a CTAS
query produced, even though the underlying query is the same, because the
former generated the bogus predicate and the latter did not.
For example,
{code:java}
select count(*) from (select a.*, b.x, b.y, b.z_dt from view1 a left join
view2 b on a.p = b.q) a{code}
returned fewer rows than
{code:java}
create table abc as
select count(*) from (select a.*, b.x, b.y, b.z_dt from view1 a left join
view2 b on a.p = b.q) a{code}
because the predicate {{a.z = a.z_dt}} was created (the reason is not yet
understood; note that b.z_dt is an alias of b.z). It shows up as
"table1.z = table1.z" in the query plan in the Impala query profile, because a
and b are aliases of view1 and view2, both of which are deeply nested views
that involve table table1.
Although in cdh5.12.1 the select and the count queries returned different
results in the initially reported case, an attempted reproduction shows that
both queries get the bogus predicate. cdh5.15.2 has the same problem. I was not
able to try the most recent master branch of Impala because of a metadata
incompatibility.
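The effect of a "z = z" predicate on left-join output can be sketched with a minimal, self-contained example. It uses SQLite through Python's standard sqlite3 module, not Impala, and the tables t1 and t2 are hypothetical stand-ins for view1 and view2. It only illustrates the standard SQL NULL semantics that make such a predicate drop rows; it does not reproduce the planner behavior reported here.

```python
import sqlite3

# In-memory database; t1 and t2 are hypothetical stand-ins for view1 and view2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (p INTEGER);
    CREATE TABLE t2 (q INTEGER, z TEXT);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (1, 'a'), (2, NULL);
""")

BASE = "FROM t1 a LEFT JOIN t2 b ON a.p = b.q"

def count_rows(where=""):
    """Count the left-join rows, optionally filtered by a WHERE clause."""
    return conn.execute(f"SELECT count(*) {BASE} {where}").fetchone()[0]

# NULL = NULL evaluates to NULL (returned as None), not TRUE.
null_eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

total = count_rows()                      # every row of t1 survives the left join
filtered = count_rows("WHERE b.z = b.z")  # rows where b.z is NULL are dropped

print(null_eq_null, total, filtered)
```

In this sketch the row with a stored NULL in z and the row with no join match both have b.z NULL, so the self-equality filter removes them and the filtered count is lower than the unfiltered one.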
> Self equal to self predicate "x = x" generated by Impala caused incorrect
> query result
> --------------------------------------------------------------------------------------
>
> Key: IMPALA-8276
> URL: https://issues.apache.org/jira/browse/IMPALA-8276
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 3.0
> Reporter: Yongjun Zhang
> Priority: Major
>