David Rorke created IMPALA-9183:
-----------------------------------
Summary: TPC-DS query 13 - customer_address predicates not
propagated to scan
Key: IMPALA-9183
URL: https://issues.apache.org/jira/browse/IMPALA-9183
Project: IMPALA
Issue Type: Bug
Components: Frontend
Affects Versions: Impala 3.4.0
Reporter: David Rorke
Attachments: profile_q13.txt, profile_q13_mod.txt, q13_mod_plan.png,
q13_plan.png, query13.sql, query13_mod.sql
TPC-DS query 13 has a set of predicates on the ca_state column of the
customer_address table that are currently evaluated only after the join of
customer_address and store_sales. These ca_state predicates could be pushed
down to the customer_address scan node, which would reduce the size of the
join input by a factor of 3.4.
As an experiment, I added a redundant predicate to the query (see attached
query13_mod.sql), which causes the planner to evaluate the predicate at the
scan node.
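
To illustrate the kind of redundant predicate involved, here is a minimal
sketch of how a scan-level filter can be derived from a disjunction whose
branches each restrict ca_state. It is not Impala's planner code and does not
reproduce the attached query13_mod.sql; the Disjunct class, the helper names,
and the state codes and profit ranges below are hypothetical placeholders.
The idea is that (A1 AND B1) OR (A2 AND B2) implies (A1 OR A2), so the union
of the per-branch ca_state lists is safe to apply at the customer_address
scan while the original disjunction stays in place after the join.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disjunct:
    """One OR branch: an IN-list on ca_state plus a range on ss_net_profit."""
    ca_states: frozenset   # hypothetical placeholder state codes
    profit_range: tuple    # (low, high); refers to store_sales, not pushable


def implied_scan_predicate(disjuncts):
    """Return the sorted union of ca_state values across all OR branches.

    Returns None if any branch leaves ca_state unrestricted, because then
    the disjunction implies nothing about ca_state on its own.
    """
    states = set()
    for d in disjuncts:
        if not d.ca_states:
            return None
        states |= d.ca_states
    return sorted(states)


def to_sql(states):
    """Render the implied predicate as a SQL IN-list."""
    quoted = ", ".join(f"'{s}'" for s in states)
    return f"ca_state IN ({quoted})"


# Placeholder branches; the real values come from the TPC-DS query template.
branches = [
    Disjunct(frozenset({"AA", "BB"}), (0, 100)),
    Disjunct(frozenset({"CC"}), (100, 200)),
]
redundant = to_sql(implied_scan_predicate(branches))
print(redundant)
```

The derived predicate is redundant by construction: any row that satisfies
the original disjunction also satisfies it, so adding it cannot change the
query result, only where the filtering happens.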
Performance of the original and modified queries at 10 TB scale factor:
Original: 164 seconds
Modified: 44 seconds
Query profiles for both versions attached.