Yida Wu created IMPALA-10861:
--------------------------------
Summary: Optimize the plan for two identical predicates
Key: IMPALA-10861
URL: https://issues.apache.org/jira/browse/IMPALA-10861
Project: IMPALA
Issue Type: Improvement
Components: Frontend
Reporter: Yida Wu
For a query whose join condition contains the same predicate twice, the planner
currently does not deduplicate: both copies show up in the hash predicates, the
exchange partitioning expressions, and the runtime filters. It would be better
to deduplicate identical predicates in this case.
{code:java}
Query: explain SELECT c_custkey from tpch.customer c left outer join
tpch.lineitem l
ON c.c_custkey = l.l_orderkey and c.c_custkey = l.l_orderkey
+----------------------------------------------------------------------------+
| Explain String |
+----------------------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=32.50MB Threads=6 |
| Per-Host Resource Estimates: Memory=798MB |
| |
| PLAN-ROOT SINK |
| | |
| 05:EXCHANGE [UNPARTITIONED] |
| | |
| 02:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED] |
| | hash predicates: l.l_orderkey = c.c_custkey, l.l_orderkey = c.c_custkey |
| | runtime filters: RF000 <- c.c_custkey, RF001 <- c.c_custkey |
| | row-size=16B cardinality=150.00K |
| | |
| |--04:EXCHANGE [HASH(c.c_custkey,c.c_custkey)] |
| | | |
| | 00:SCAN HDFS [tpch.customer c] |
| | HDFS partitions=1/1 files=1 size=23.08MB |
| | row-size=8B cardinality=150.00K |
| | |
| 03:EXCHANGE [HASH(l.l_orderkey,l.l_orderkey)] |
| | |
| 01:SCAN HDFS [tpch.lineitem l] |
| HDFS partitions=1/1 files=1 size=718.94MB |
| runtime filters: RF000 -> l.l_orderkey, RF001 -> l.l_orderkey |
| row-size=8B cardinality=6.00M |
+----------------------------------------------------------------------------+{code}
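A minimal sketch of the kind of deduplication being asked for, assuming the join conjuncts are simple equality predicates. The names ConjunctDedup, EqPredicate, and dedupe are hypothetical and are not Impala's frontend classes; the sketch only illustrates dropping repeated conjuncts (including operand-swapped copies) before hash predicates and runtime filters are derived.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ConjunctDedup {
    /** Hypothetical stand-in for an equality join conjunct "lhs = rhs". */
    static final class EqPredicate {
        final String lhs;
        final String rhs;

        EqPredicate(String lhs, String rhs) {
            this.lhs = lhs;
            this.rhs = rhs;
        }

        /** Orders the operands so that "a = b" and "b = a" share one key. */
        String canonicalKey() {
            return lhs.compareTo(rhs) <= 0 ? lhs + " = " + rhs : rhs + " = " + lhs;
        }

        @Override
        public String toString() {
            return lhs + " = " + rhs;
        }
    }

    /** Keeps the first occurrence of each distinct conjunct, preserving order. */
    static List<EqPredicate> dedupe(List<EqPredicate> conjuncts) {
        Set<String> seen = new HashSet<>();
        List<EqPredicate> result = new ArrayList<>();
        for (EqPredicate p : conjuncts) {
            if (seen.add(p.canonicalKey())) {
                result.add(p);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The ON clause from the query above, with its repeated conjunct.
        List<EqPredicate> conjuncts = new ArrayList<>();
        conjuncts.add(new EqPredicate("c.c_custkey", "l.l_orderkey"));
        conjuncts.add(new EqPredicate("c.c_custkey", "l.l_orderkey"));
        System.out.println(dedupe(conjuncts));
    }
}
```

With the duplicate removed, a plan built from the remaining conjunct would presumably need only one hash predicate, one partitioning expression per exchange, and one runtime filter.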