Quanlong Huang created IMPALA-8423:
--------------------------------------

             Summary: Add rule to remove useless SELECT node
                 Key: IMPALA-8423
                 URL: https://issues.apache.org/jira/browse/IMPALA-8423
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
            Reporter: Quanlong Huang

We can add rules that optimize the plan after the cheapest plan has been chosen based on cost. One useful rule would be "remove useless SELECT nodes". Impala generates a useless SELECT node for the following query:
{code:sql}
SELECT t.id, t.int_col
FROM functional.alltypestiny t
  LEFT JOIN (SELECT id, int_col FROM functional.alltypestiny) t2
  ON (t.id = t2.id)
WHERE t.int_col = t.id
UNION ALL
VALUES (NULL, NULL){code}
Its single-node plan is
{code:java}
PLAN-ROOT SINK
|
00:UNION
|  constant-operands=1
|  row-size=8B cardinality=1
|
04:SELECT
|  predicates: t.id = t.int_col
|  row-size=12B cardinality=0
|
03:HASH JOIN [RIGHT OUTER JOIN]
|  hash predicates: id = t.id
|  runtime filters: RF000 <- t.id
|  row-size=12B cardinality=1
|
|--01:SCAN HDFS [functional.alltypestiny t]
|     HDFS partitions=4/4 files=4 size=460B
|     predicates: t.int_col = t.id
|     row-size=8B cardinality=1
|
02:SCAN HDFS [functional.alltypestiny]
   HDFS partitions=4/4 files=4 size=460B
   runtime filters: RF000 -> id
   row-size=4B cardinality=8{code}
The SELECT node (id=04) is useless because its only predicate, "t.id = t.int_col", is already enforced in the SCAN node (id=01), which is the right-hand side of the RIGHT OUTER JOIN. A right outer join preserves the rows of its right-hand side without NULL-extending them, so every row reaching the SELECT node already satisfies the predicate and the node won't filter out any more rows.
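
The reasoning about the SELECT node (id=04) can be sketched as a small post-planning pass. This is only an illustration on a toy plan tree, not Impala's frontend code: the `PlanNode` class, its `nullable_children` field, and the helpers `normalize`, `enforced` and `remove_useless_selects` are all hypothetical names introduced here. The sketch assumes deterministic predicates and handles only simple equality predicates written as `a = b`.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Set


@dataclass
class PlanNode:
    """Toy plan node for illustration; NOT Impala's PlanNode class."""
    kind: str  # "SCAN", "SELECT", "JOIN" or "UNION"
    predicates: List[str] = field(default_factory=list)
    children: List["PlanNode"] = field(default_factory=list)
    # JOIN only: indexes of children whose rows may be NULL-extended.
    nullable_children: FrozenSet[int] = frozenset()


def normalize(pred: str) -> str:
    """Canonicalize 'a = b' so that operand order does not matter."""
    parts = [p.strip() for p in pred.split(" = ")]
    return " = ".join(sorted(parts)) if len(parts) == 2 else pred.strip()


def enforced(node: PlanNode) -> Set[str]:
    """Predicates guaranteed to be TRUE for every row the node emits."""
    own = {normalize(p) for p in node.predicates}
    if node.kind == "SCAN":
        return own
    if node.kind == "SELECT":
        return own | enforced(node.children[0])
    if node.kind == "JOIN":
        # Only preserved (non-nullable) inputs keep their guarantees.
        # The join's own predicates are ignored: in an outer join they
        # hold only for matched rows.
        result: Set[str] = set()
        for i, child in enumerate(node.children):
            if i not in node.nullable_children:
                result |= enforced(child)
        return result
    return set()  # UNION and anything else: stay conservative


def remove_useless_selects(node: PlanNode) -> PlanNode:
    """Drop SELECT nodes whose predicates are all enforced below them."""
    node.children = [remove_useless_selects(c) for c in node.children]
    if node.kind == "SELECT":
        child = node.children[0]
        if {normalize(p) for p in node.predicates} <= enforced(child):
            return child
    return node


# The plan from this issue: 02 is the left (NULL-extended) input of the
# RIGHT OUTER JOIN, 01 is the right (preserved) input.
scan_01 = PlanNode("SCAN", predicates=["t.int_col = t.id"])
scan_02 = PlanNode("SCAN")
join_03 = PlanNode("JOIN", children=[scan_02, scan_01],
                   nullable_children=frozenset({0}))
select_04 = PlanNode("SELECT", predicates=["t.id = t.int_col"],
                     children=[join_03])
union_00 = PlanNode("UNION", children=[select_04])

optimized = remove_useless_selects(union_00)
print([c.kind for c in optimized.children])
```

In this sketch the SELECT is dropped only when the predicate comes from a preserved input of the join. If the same predicate had been enforced on the NULL-extended side instead, the SELECT would have to stay, because NULL-extended rows would not satisfy it.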