[jira] [Commented] (IMPALA-8423) Add rule to remove useless SELECT node

Tim Armstrong (Jira) Mon, 03 Feb 2020 09:24:32 -0800


    [ https://issues.apache.org/jira/browse/IMPALA-8423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17029106#comment-17029106 ]


Tim Armstrong commented on IMPALA-8423:
---------------------------------------

The planner mostly inserts SELECT nodes only when the predicates could not 
be placed lower in the plan (it keeps track of which predicates have already 
been assigned to a node). The SELECT node here might just be a consequence of 
a different bug. I wonder whether there are any (important) cases where the 
planner inserts an unnecessary SELECT node and it is not a bug in the 
predicate placement algorithm.
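
To make the bookkeeping concrete, here is a rough toy sketch in Python. It is not Impala's frontend code: ToyPlanner, PlanNode and every method name are made up for illustration, and a predicate is reduced to a string plus the set of table aliases it references. The only point is that a SELECT node appears just when some conjunct is still unassigned after placement.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    """Toy plan node (hypothetical; not Impala's PlanNode class)."""
    name: str
    aliases: frozenset                      # table aliases visible at this node
    children: list = field(default_factory=list)
    predicates: list = field(default_factory=list)


class ToyPlanner:
    """Hypothetical planner that remembers which conjuncts are already placed."""

    def __init__(self, conjuncts):
        # conjuncts: predicate text -> set of table aliases it references
        self.conjuncts = conjuncts
        self.assigned = set()

    def assign_bound_conjuncts(self, node):
        """Attach every still-unassigned conjunct that this node can evaluate."""
        for pred, aliases in self.conjuncts.items():
            if pred not in self.assigned and aliases <= node.aliases:
                node.predicates.append(pred)
                self.assigned.add(pred)

    def add_select_if_needed(self, root):
        """Wrap root in a SELECT node only if some conjunct is still unassigned."""
        leftover = [p for p in self.conjuncts if p not in self.assigned]
        if not leftover:
            return root
        self.assigned.update(leftover)
        return PlanNode("SELECT", root.aliases, [root], leftover)


# The conjunct is placed at the scan, so no SELECT node is needed on top.
planner = ToyPlanner({"t.int_col = t.id": {"t"}})
scan = PlanNode("SCAN t", frozenset({"t"}))
planner.assign_bound_conjuncts(scan)
root = planner.add_select_if_needed(scan)

# If placement is skipped, the same conjunct ends up in a SELECT node instead.
lazy_planner = ToyPlanner({"t.int_col = t.id": {"t"}})
lazy_root = lazy_planner.add_select_if_needed(PlanNode("SCAN t", frozenset({"t"})))
```

In this model a redundant SELECT can only show up if a conjunct is placed without being marked as assigned, which is the kind of placement bug suspected above.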

> Add rule to remove useless SELECT node
> --------------------------------------
>
>                 Key: IMPALA-8423
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8423
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>            Reporter: Quanlong Huang
>            Assignee: Tamas Mate
>            Priority: Major
>              Labels: performance
>
> We can add some rules to optimize the plan after we choose the cheapest plan 
> based on cost. For example, one useful rule would be "removing useless SELECT 
> nodes".
> Impala generates a useless SELECT node for the following query:
> {code:sql}
> SELECT t.id, t.int_col
> FROM functional.alltypestiny t
> LEFT JOIN
>   (SELECT id, int_col
>   FROM functional.alltypestiny) t2
> ON (t.id = t2.id)
> WHERE t.int_col = t.id
> UNION ALL
> VALUES (NULL, NULL){code}
> Its single-node plan is:
> {code:java}
> PLAN-ROOT SINK
> |
> 00:UNION
> |  constant-operands=1
> |  row-size=8B cardinality=1
> |
> 04:SELECT
> |  predicates: t.id = t.int_col
> |  row-size=12B cardinality=0
> |
> 03:HASH JOIN [RIGHT OUTER JOIN]
> |  hash predicates: id = t.id
> |  runtime filters: RF000 <- t.id
> |  row-size=12B cardinality=1
> |
> |--01:SCAN HDFS [functional.alltypestiny t]
> |     HDFS partitions=4/4 files=4 size=460B
> |     predicates: t.int_col = t.id
> |     row-size=8B cardinality=1
> |
> 02:SCAN HDFS [functional.alltypestiny]
>    HDFS partitions=4/4 files=4 size=460B
>    runtime filters: RF000 -> id
>    row-size=4B cardinality=8{code}
> The SELECT node (id=04) is useless since its only predicate "t.id = 
> t.int_col" has already been enforced in the SCAN node (id=01), which is the 
> right-hand side of the RIGHT OUTER JOIN. That side is the preserved side of 
> the join, so its rows are never NULL-extended and the SELECT node won't 
> filter out any more rows.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
