Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-2805: Order conjuncts based on selectivity and cost
......................................................................


Patch Set 8:

(5 comments)

I updated the targeted-perf test to a table from tpch. New performance results:

 
+-----------------+-------------------+----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
 | Workload        | Query             | File Format          | Avg(s) | Base 
Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters |
 
+-----------------+-------------------+----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
 | TARGETED-PERF() | conjunct_ordering | text / none / none   | 8.51   | 19.13  
     | I -55.50%  |   0.00%   |   0.00%        | 1           | 2     |
 | TARGETED-PERF() | conjunct_ordering | seq / block / block  | 8.85   | 21.27  
     | I -58.38%  |   0.00%   |   0.00%        | 1           | 2     |
 | TARGETED-PERF() | conjunct_ordering | avro / block / block | 7.91   | 24.27  
     | I -67.42%  |   0.00%   |   0.00%        | 1           | 2     |
 | TARGETED-PERF() | conjunct_ordering | rc / none / none     | 7.91   | 24.51  
     | I -67.74%  |   0.00%   |   0.00%        | 1           | 2     |
 | TARGETED-PERF() | conjunct_ordering | seq / block / block  | 7.20   | 24.22  
     | I -70.25%  |   0.00%   |   0.00%        | 1           | 2     |
 | TARGETED-PERF() | conjunct_ordering | avro / none / none   | 7.30   | 24.96  
     | I -70.73%  |   0.00%   |   0.00%        | 1           | 2     |
 
+-----------------+-------------------+----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+

http://gerrit.cloudera.org:8080/#/c/2598/8/fe/src/main/java/com/cloudera/impala/analysis/Expr.java
File fe/src/main/java/com/cloudera/impala/analysis/Expr.java:

Line 190:   protected float evalCost_;
> set it to -1 here instead of default c'tor
Done


http://gerrit.cloudera.org:8080/#/c/2598/8/testdata/workloads/functional-planner/queries/PlannerTest/joins.test
File testdata/workloads/functional-planner/queries/PlannerTest/joins.test:

Line 174: |  other predicates: a.tinyint_col = 15, b.string_col = '15', a.day 
>= 6, b.month > 2, a.tinyint_col + b.tinyint_col < 15, a.float_col - 
c.double_col < 0, (b.double_col * c.tinyint_col > 1000 OR c.tinyint_col < 1000)
> why did these change?
We have selectivity estimates for both 'a.tinyint_col = 15' and 'b.string_col = 
'15'' and not for any of the other conjuncts.


http://gerrit.cloudera.org:8080/#/c/2598/8/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test:

Line 1681: |  hash predicates: d_week_seq = d_week_seq - 52, s_store_id = 
s_store_id
> why did this get flipped? i'd expect the store id comparison to be cheaper.
We don't have selectivities for either. s_store_id is a string, so it gets the 
VAR_LENGTH_BINARY_PREDICATE_COST.


http://gerrit.cloudera.org:8080/#/c/2598/8/testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test:

Line 1837: |     predicates: p_size IN (49, 14, 23, 45, 19, 3, 36, 9), p_brand 
!= 'Brand#45', NOT p_type LIKE 'MEDIUM POLISHED%'
> is this what's causing the regression?
Possibly. See my recent top-level comment.

Short version: we get this ordering because we have a selectivity estimate for 
the IN conjunct and not for the other two.


http://gerrit.cloudera.org:8080/#/c/2598/8/testdata/workloads/targeted-perf/queries/conjunct_ordering.test
File testdata/workloads/targeted-perf/queries/conjunct_ordering.test:

Line 2: ---- QUERY: conjunct_ordering
> let's find a home for that in an existing test, this is too small to warran
This is consistent with the way we are currently handling targeted-perf - the 
majority of the targeted-perf test files are  < 15 lines.


-- 
To view, visit http://gerrit.cloudera.org:8080/2598
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02279a26fbc6308ac5eb819d78345fc010469034
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to