Paul Rogers created IMPALA-8285:
-----------------------------------
Summary: In DESCRIBE output, display base table and column names
Key: IMPALA-8285
URL: https://issues.apache.org/jira/browse/IMPALA-8285
Project: IMPALA
Issue Type: Improvement
Components: Frontend
Affects Versions: Impala 3.1.0
Reporter: Paul Rogers
When a query uses aliases or views, the predicates in the DESCRIBE output often
refer to those aliases rather than to the underlying tables, which makes the
output ambiguous. Instead, the output should refer to base table and column
names.
Example PlannerTest case: a pathological query in which the alias "a" is used
multiple times.
{noformat}
select
akey, bkey, ckey
from
(select a.c_custkey akey from tpch.customer a where a.c_nationkey = 10) a,
(select a.c_custkey bkey from tpch.customer a where a.c_name < "fred") b,
(select a.c_custkey ckey from tpch.customer a) c
where akey = b.bkey
and bkey = c.ckey
# ==> a.c_custkey = c.c_custkey
---- PLAN
PLAN-ROOT SINK
|
04:HASH JOIN [INNER JOIN]
| hash predicates: a.c_custkey = a.c_custkey
|
|--03:HASH JOIN [INNER JOIN]
| | hash predicates: a.c_custkey = a.c_custkey
| |
| |--00:SCAN HDFS [tpch.customer a]
| | predicates: a.c_nationkey = 10
| |
| 02:SCAN HDFS [tpch.customer a]
| runtime filters: RF002 -> a.c_custkey
|
01:SCAN HDFS [tpch.customer a]
predicates: a.c_name < 'fred'
{noformat}
Notice that the root hash join predicate appears to be a tautology:
{{a.c_custkey = a.c_custkey}}. This is actually {{b.c_custkey = a.c_custkey}}
(using top-level aliases). Even this is unclear since "a" and "b" are also
aliases. If we use table names we get {{customer.c_custkey =
customer.c_custkey}} which is also ambiguous.
It might be good to simply list tables using an unambiguous alias, or perhaps
use a comment to clarify: {{customer /\* b \*/.c_custkey = customer /\* a
\*/.c_custkey}}.
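
A minimal sketch of the comment-based rendering suggested above, written in Python only for illustration. It is not Impala's frontend code: the ColumnRef class and the with_alias_comment and render_equality helpers are hypothetical names, and the sketch assumes the planner can supply the base table name, the top-level alias, and the base column name for each side of a predicate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRef:
    """Hypothetical stand-in for a resolved column reference."""
    base_table: str  # base table name, e.g. "customer"
    alias: str       # top-level alias from the query, e.g. "b"
    column: str      # base column name, e.g. "c_custkey"


def with_alias_comment(ref: ColumnRef) -> str:
    # Show the base table, and keep the alias as a comment so that two
    # references to the same table remain distinguishable.
    return f"{ref.base_table} /* {ref.alias} */.{ref.column}"


def render_equality(left: ColumnRef, right: ColumnRef) -> str:
    # Render an equality predicate such as a hash join condition.
    return f"{with_alias_comment(left)} = {with_alias_comment(right)}"


# The root hash join from the plan above, using the top-level aliases.
root_join = render_equality(
    ColumnRef("customer", "b", "c_custkey"),
    ColumnRef("customer", "a", "c_custkey"),
)
print(root_join)
```

With this rendering the root join no longer reads as a tautology, because the two sides differ in the alias comment even though the base table and column are the same.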