[
https://issues.apache.org/jira/browse/IMPALA-9183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Rorke updated IMPALA-9183:
--------------------------------
Description:
TPC-DS query 13 has a set of predicates on the ca_state column of the
customer_address table that are currently evaluated after the join of
customer_address and store_sales. These ca_state predicates could be pushed
down to the customer_address scan node, which would reduce the size of the
join input by a factor of 3.4.
As an experiment, I added a redundant predicate to the query (see attached
query13_mod.sql), which causes the planner to evaluate the predicate at the
scan node.
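For readers without the attachments, here is a minimal sketch of the kind of
rewrite described above. It is not the contents of query13_mod.sql: the table
contents, state lists and profit ranges are illustrative assumptions, and it
runs on SQLite rather than Impala. It only shows that a redundant ca_state IN
(...) predicate, built as the union of the per-branch state lists, references
customer_address alone (so a planner can apply it at the scan) and does not
change the query result.

```python
import sqlite3

# Tiny in-memory stand-ins for the two TPC-DS tables (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_address (ca_address_sk INTEGER, ca_state TEXT);
CREATE TABLE store_sales (ss_addr_sk INTEGER, ss_net_profit INTEGER);
INSERT INTO customer_address VALUES (1,'TX'),(2,'OR'),(3,'VA'),(4,'CA'),(5,'NY');
INSERT INTO store_sales VALUES (1,150),(1,50),(2,200),(3,100),(4,150),(5,250);
""")

# A disjunction shaped like the one in query 13: every branch pairs a ca_state
# list with an ss_net_profit range, so no single branch can be pushed to the
# customer_address scan by itself. Values are assumptions, not the real ones.
DISJUNCTION = """
    (   (ca_state IN ('TX','OH') AND ss_net_profit BETWEEN 100 AND 200)
     OR (ca_state IN ('OR','NM') AND ss_net_profit BETWEEN 150 AND 300)
     OR (ca_state IN ('VA','MS') AND ss_net_profit BETWEEN 50 AND 250))
"""

# Redundant predicate: the union of all state lists above. It is implied by
# the disjunction and mentions only customer_address columns.
REDUNDANT = "ca_state IN ('TX','OH','OR','NM','VA','MS')"

BASE = """
SELECT ss_addr_sk, ss_net_profit
FROM store_sales JOIN customer_address ON ss_addr_sk = ca_address_sk
WHERE {where}
ORDER BY ss_addr_sk, ss_net_profit
"""

original_rows = conn.execute(BASE.format(where=DISJUNCTION)).fetchall()
modified_rows = conn.execute(
    BASE.format(where=DISJUNCTION + " AND " + REDUNDANT)
).fetchall()

# Rows of customer_address that survive the redundant predicate alone, i.e.
# what would reach the join if the filter were applied at the scan.
scan_rows = conn.execute(
    "SELECT COUNT(*) FROM customer_address WHERE " + REDUNDANT
).fetchone()[0]

print(original_rows == modified_rows, scan_rows)
```

In this toy data set the redundant filter keeps 3 of the 5 customer_address
rows; the 3.4x reduction quoted above comes from the 10 TB run, not from this
sketch.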
Performance of the original and modified queries at 10 TB scale factor:
Original: 164 seconds
Modified: 44 seconds
Query profiles for both versions attached.
was:
TPC-DS query 13 has a set of predicates on the customer_address table, ca_state
column that are currently evaluated after the join of customer_address and
store_sales. The ca_state predicates could be pushed down to the
customer_address scan node. This would reduce the size of the join input by a
factor of 3.4.
As an experiment I added an additional redundant predicate to the query (see
attached query13_mod.sql) which causes the planner to evaluate the predicate at
the scan node.
Performance of the original and modified queries at 10 TB scale factor:
Original: 164 seconds
Modieifed: 44 seconds
Query profiles for both versions attached.
> TPC-DS query 13 - customer_address predicates not propagated to scan
> --------------------------------------------------------------------
>
> Key: IMPALA-9183
> URL: https://issues.apache.org/jira/browse/IMPALA-9183
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 3.4.0
> Reporter: David Rorke
> Priority: Major
> Attachments: profile_q13.txt, profile_q13_mod.txt, q13_mod_plan.png,
> q13_plan.png, query13.sql, query13_mod.sql
>
>