Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3286: prefetching for PartitionedAggregationNode
......................................................................


Patch Set 5:

Rebased onto the latest phj patch.

I benchmarked a slightly earlier version of the patch. The overall trend is
that high-NDV aggs are much faster (about 17-25% on the targeted tests) and
low-NDV aggs are slightly slower (about 14-16%). On the end-to-end TPC-H
tests this seems to give a small net win (average -1.54%):

        Run Description: "Base: 7ad3faa4e3fa5b55b84ae3b2888caa3e4bdf8238 vs Ref: ded7fb79caf0466a22ab64cad18998f73cb38f3d"

        Cluster Name: UNKNOWN
        Lab Run Info: UNKNOWN
        Impala Version:          impalad version 2.6.0-cdh5-INTERNAL RELEASE ()
        Baseline Impala Version: impalad version 2.6.0-cdh5-INTERNAL RELEASE ()

        
        +--------------------+-----------------------+---------+------------+------------+----------------+
        | Workload           | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
        +--------------------+-----------------------+---------+------------+------------+----------------+
        | TARGETED-PERF(_20) | parquet / none / none | 12.59   | -16.86%    | 7.48       | -7.39%         |
        +--------------------+-----------------------+---------+------------+------------+----------------+

        
        +--------------------+---------------------------------------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
        | Workload           | Query                                 | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters |
        +--------------------+---------------------------------------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
        | TARGETED-PERF(_20) | primitive_groupby_decimal_lowndv.test | parquet / none / none | 2.27   | 1.96        |   +15.61%  |   1.36%   |   1.95%        | 1           | 20    |
        | TARGETED-PERF(_20) | primitive_groupby_bigint_lowndv       | parquet / none / none | 2.27   | 1.99        |   +13.81%  |   2.17%   |   2.87%        | 1           | 20    |
        | TARGETED-PERF(_20) | primitive_groupby_bigint_pk           | parquet / none / none | 35.71  | 42.77       |   -16.51%  |   6.66%   |   0.96%        | 1           | 20    |
        | TARGETED-PERF(_20) | primitive_groupby_bigint_highndv      | parquet / none / none | 10.17  | 12.27       |   -17.11%  |   1.70%   |   0.65%        | 1           | 20    |
        | TARGETED-PERF(_20) | primitive_groupby_decimal_highndv     | parquet / none / none | 12.55  | 16.74       | I -25.04%  |   2.48%   |   1.57%        | 1           | 20    |
        +--------------------+---------------------------------------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+

        (I) Improvement: TARGETED-PERF(_20) primitive_groupby_decimal_highndv [parquet / none / none] (16.74s -> 12.55s [-25.04%])
        +--------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+---------+-----------+
        | Operator     | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%) | Max      | Base Max | Delta(Max) | #Hosts | #Rows   | Est #Rows |
        +--------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+---------+-----------+
        | 03:AGGREGATE | 2.62%      | 330.04ms | 321.08ms | +2.79%     |   6.01%   | 379.54ms | 343.90ms | +10.36%    | 1      | 0       | 174.13K   |
        | 01:AGGREGATE | 94.94%     | 11.96s   | 16.19s   | -26.14%    |   2.61%   | 12.66s   | 17.13s   | -26.10%    | 1      | 1.78M   | 1.74M     |
        | 00:SCAN HDFS | 2.32%      | 291.90ms | 270.24ms | +8.01%     |   1.31%   | 299.88ms | 313.65ms | -4.39%     | 1      | 119.99M | 119.99M   |
        +--------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+---------+-----------+

        Report Generated on 2016-05-16
        Run Description: "Base: 7ad3faa4e3fa5b55b84ae3b2888caa3e4bdf8238 vs Ref: ded7fb79caf0466a22ab64cad18998f73cb38f3d"

        Cluster Name: UNKNOWN
        Lab Run Info: UNKNOWN
        Impala Version:          impalad version 2.6.0-cdh5-INTERNAL RELEASE ()
        Baseline Impala Version: impalad version 2.6.0-cdh5-INTERNAL RELEASE ()

        
        +-----------+-----------------------+---------+------------+------------+----------------+
        | Workload  | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
        +-----------+-----------------------+---------+------------+------------+----------------+
        | TPCH(_20) | parquet / none / none | 9.45    | -1.54%     | 6.38       | -1.31%         |
        +-----------+-----------------------+---------+------------+------------+----------------+

        
        +-----------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+
        | Workload  | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |
        +-----------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+
        | TPCH(_20) | TPCH-Q1  | parquet / none / none | 12.36  | 11.66       |   +6.02%   |   1.25%    |   1.56%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q17 | parquet / none / none | 14.11  | 13.62       |   +3.60%   |   2.28%    |   2.04%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q5  | parquet / none / none | 6.37   | 6.16        |   +3.35%   |   2.71%    |   1.75%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q3  | parquet / none / none | 5.04   | 4.91        |   +2.70%   |   2.26%    |   2.13%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q15 | parquet / none / none | 5.01   | 4.95        |   +1.36%   |   2.31%    |   2.39%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q12 | parquet / none / none | 4.25   | 4.20        |   +1.15%   |   2.12%    |   2.21%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q14 | parquet / none / none | 3.46   | 3.42        |   +0.94%   |   2.43%    |   2.15%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q4  | parquet / none / none | 4.02   | 3.98        |   +0.90%   | * 23.81% * | * 24.67% *     | 1           | 20    |
        | TPCH(_20) | TPCH-Q19 | parquet / none / none | 47.18  | 46.78       |   +0.86%   |   2.02%    |   1.74%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q20 | parquet / none / none | 3.84   | 3.81        |   +0.77%   |   2.23%    |   2.07%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q7  | parquet / none / none | 16.90  | 16.81       |   +0.51%   |   0.94%    |   0.85%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q11 | parquet / none / none | 1.55   | 1.54        |   +0.50%   |   2.79%    |   2.10%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q21 | parquet / none / none | 22.66  | 22.61       |   +0.24%   |   0.67%    |   0.69%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q9  | parquet / none / none | 13.37  | 13.34       |   +0.24%   |   0.52%    |   0.59%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q10 | parquet / none / none | 6.51   | 6.53        |   -0.42%   |   1.71%    |   1.03%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q16 | parquet / none / none | 2.22   | 2.23        |   -0.45%   |   2.62%    |   2.08%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q8  | parquet / none / none | 6.68   | 6.71        |   -0.50%   |   3.96%    |   5.01%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q22 | parquet / none / none | 2.84   | 2.86        |   -0.81%   |   2.44%    |   2.61%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q6  | parquet / none / none | 2.28   | 2.31        |   -1.30%   |   1.41%    |   1.28%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q2  | parquet / none / none | 2.46   | 2.58        |   -4.73%   |   1.72%    |   2.31%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q18 | parquet / none / none | 15.66  | 17.31       |   -9.53%   |   4.42%    |   3.51%        | 1           | 20    |
        | TPCH(_20) | TPCH-Q13 | parquet / none / none | 9.19   | 12.86       | I -28.57%  |   2.64%    |   4.52%        | 1           | 20    |
        +-----------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+

        (I) Improvement: TPCH(_20) TPCH-Q13 [parquet / none / none] (12.86s -> 9.19s [-28.57%])
        +--------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+--------+-----------+
        | Operator     | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%) | Max      | Base Max | Delta(Max) | #Hosts | #Rows  | Est #Rows |
        +--------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+--------+-----------+
        | 04:AGGREGATE | 2.17%      | 208.45ms | 252.62ms | -17.48%    |   9.35%   | 254.88ms | 299.81ms | -14.99%    | 1      | 45     | 2.98M     |
        | 03:AGGREGATE | 44.10%     | 4.24s    | 7.79s    | -45.57%    |   3.59%   | 4.62s    | 8.42s    | -45.08%    | 1      | 3.00M  | 2.98M     |
        | 02:HASH JOIN | 40.34%     | 3.88s    | 4.05s    | -4.08%     |   3.20%   | 4.21s    | 4.48s    | -5.99%     | 1      | 30.68M | 3.00M     |
        | 06:EXCHANGE  | 3.67%      | 353.15ms | 293.33ms | +20.39%    |   5.24%   | 399.72ms | 302.01ms | +32.35%    | 1      | 29.68M | 3.00M     |
        | 01:SCAN HDFS | 8.08%      | 776.89ms | 782.61ms | -0.73%     |   4.39%   | 857.62ms | 900.48ms | -4.76%     | 1      | 29.68M | 3.00M     |
        +--------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+--------+-----------+

        (V) Significant Variability: TPCH(_20) TPCH-Q4 [parquet / none / none] (24.67% -> 23.81%)
        +--------------+------------+-----------+----------------+------------------+--------+-------+-----------+
        | Operator     | % of Query | StdDev(%) | Base StdDev(%) | Delta(StdDev(%)) | #Hosts | #Rows | Est #Rows |
        +--------------+------------+-----------+----------------+------------------+--------+-------+-----------+
        | 08:AGGREGATE | 3.51%      | 20.08%    | 18.81%         | +6.74%           | 1      | 5     | 5         |
        | 03:AGGREGATE | 9.21%      | 12.80%    | 13.21%         | -3.09%           | 1      | 5     | 5         |
        | 02:HASH JOIN | 19.68%     | 35.76%    | 32.07%         | +11.51%          | 1      | 1.05M | 3.00M     |
        | 00:SCAN HDFS | 17.27%     | 14.81%    | 15.14%         | -2.21%           | 1      | 1.15M | 3.00M     |
        | 05:EXCHANGE  | 4.31%      | 60.05%    | 58.61%         | +2.45%           | 1      | 6.02M | 12.00M    |
        +--------------+------------+-----------+----------------+------------------+--------+-------+-----------+
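
For anyone reading the Delta(Avg) columns: they appear to be the relative
change of Avg(s) against Base Avg(s), so a negative value means the patched
build is faster. The sketch below recomputes that for the TARGETED-PERF
queries. The formula and the names relative_delta, ROWS and recompute are my
assumptions, not taken from the report tool. The inputs are the rounded
values printed in the table, so the results match the reported column only to
within a few tenths of a percentage point.

```python
def relative_delta(new_avg, base_avg):
    """Percent change of new_avg relative to base_avg (negative = faster)."""
    return (new_avg - base_avg) / base_avg * 100.0


# (query, Avg(s), Base Avg(s), reported Delta(Avg) in %), copied from the
# TARGETED-PERF per-query table in this comment. Values are already rounded.
ROWS = [
    ("primitive_groupby_decimal_lowndv.test", 2.27, 1.96, 15.61),
    ("primitive_groupby_bigint_lowndv", 2.27, 1.99, 13.81),
    ("primitive_groupby_bigint_pk", 35.71, 42.77, -16.51),
    ("primitive_groupby_bigint_highndv", 10.17, 12.27, -17.11),
    ("primitive_groupby_decimal_highndv", 12.55, 16.74, -25.04),
]


def recompute(rows):
    """Map each query name to the delta recomputed from the rounded averages."""
    return {query: relative_delta(avg, base) for query, avg, base, _ in rows}


# Print recomputed and reported deltas side by side for comparison.
for query, delta in recompute(ROWS).items():
    reported = next(r[3] for r in ROWS if r[0] == query)
    print(f"{query}: recomputed {delta:+.2f}%, reported {reported:+.2f}%")
```

The same function applies to the TPCH rows and to the per-operator
Delta(Avg) and Delta(Max) columns, under the same assumption about the
formula.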

-- 
To view, visit http://gerrit.cloudera.org:8080/3070
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7726454efb416d61080c4e11db0ee7ada18c149b
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: No
