Michael Ho has uploaded a new change for review.
http://gerrit.cloudera.org:8080/3220
Change subject: IMPALA-3066: Lazy materialization of LLVM module bitcode.
......................................................................
IMPALA-3066: Lazy materialization of LLVM module bitcode.
Previously, each fragment using dynamic code generation parsed
the bitcode module and populated the LLVM data structures
for all the functions in the bitcode module. This is wasteful
because a query may use only a few of the functions
parsed. We relied on dead code elimination to delete most of the
unused functions so that we would not waste time compiling them.
This change implements lazy materialization of the bitcode.
Instead of eagerly populating the LLVM data structures for all
functions parsed, we just create the Function objects for each
function in the module. The functions' bodies will be materialized
on demand from the bitcode module when they are actually referenced
in the query. This ensures that the prepare time during codegen
is proportional to the number of IR functions referenced by the
query instead of being proportional to the total number of IR
functions in the module.
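The idea above can be modeled with a small, self-contained sketch. It does not use LLVM or Impala code: `LazyModule`, `Body`, `Declare` and `Materialize` are hypothetical names chosen for illustration, and the "parser" callbacks stand in for reading a function body out of the bitcode. It only shows the shape of the technique, in which declarations are created up front and bodies are pulled in on first reference, together with whatever they call.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-in for a parsed function body (a real module would hold IR here).
struct Body {
  std::vector<std::string> callees;  // names of functions this body calls
};

// Hypothetical model of a lazily materialized module: every function gets a
// cheap declaration up front, and its body is parsed only on first use.
class LazyModule {
 public:
  using Parser = std::function<Body()>;

  // Registers a declaration plus a deferred parser; nothing is parsed yet.
  void Declare(const std::string& name, Parser parser) {
    functions_[name].parser = std::move(parser);
  }

  // Materializes `name` and, transitively, every function it calls.
  // Returns false if a referenced function was never declared.
  bool Materialize(const std::string& name) {
    auto it = functions_.find(name);
    if (it == functions_.end()) return false;
    Entry& entry = it->second;
    if (entry.materialized) return true;  // done already, or in progress
    entry.materialized = true;            // set first so recursion terminates
    entry.body = entry.parser();          // the deferred "parse" happens here
    ++num_materialized_;
    for (const std::string& callee : entry.body.callees) {
      if (!Materialize(callee)) return false;
    }
    return true;
  }

  bool IsMaterialized(const std::string& name) const {
    auto it = functions_.find(name);
    return it != functions_.end() && it->second.materialized;
  }

  std::size_t num_declared() const { return functions_.size(); }
  std::size_t num_materialized() const { return num_materialized_; }

 private:
  struct Entry {
    Parser parser;
    Body body;
    bool materialized = false;
  };

  std::unordered_map<std::string, Entry> functions_;
  std::size_t num_materialized_ = 0;
};
```

In this model, declaring five functions and materializing one root that calls two others leaves three bodies parsed and two untouched, so the work done tracks the functions referenced rather than the size of the module.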
For TPCH-Q2, a fragment which codegens only 9 functions used to spend
146ms in codegen. It now spends 35ms, a 76% reduction. The codegen
profiles from before and after the change follow, in that order.
CodeGen:(Total: 146.041ms, non-child: 146.041ms, % non-child: 100.00%)
- CodegenTime: 0.000ns
- CompileTime: 2.003ms
- LoadTime: 0.000ns
- ModuleBitcodeSize: 2.12 MB (2225304)
- NumFunctions: 9 (9)
- NumInstructions: 129 (129)
- OptimizationTime: 29.019ms
- PrepareTime: 114.651ms
CodeGen:(Total: 35.288ms, non-child: 35.288ms, % non-child: 100.00%)
- CodegenTime: 0.000ns
- CompileTime: 1.880ms
- LoadTime: 0.000ns
- ModuleBitcodeSize: 2.12 MB (2221276)
- NumFunctions: 9 (9)
- NumInstructions: 129 (129)
- OptimizationTime: 5.101ms
- PrepareTime: 28.044ms
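As a quick check of the arithmetic, the reductions can be recomputed from the two profiles above. The helper name `PercentReduction` is made up for this sketch; the inputs are the timer values reported in the profiles.

```cpp
// Percent reduction going from `before` to `after` (same unit for both).
constexpr double PercentReduction(double before, double after) {
  return (before - after) / before * 100.0;
}
```

With the values above, `PercentReduction(146.041, 35.288)` is about 75.8 for the total, `PercentReduction(114.651, 28.044)` is about 75.5 for PrepareTime, and `PercentReduction(29.019, 5.101)` is about 82.4 for OptimizationTime. Most of the absolute saving therefore comes from PrepareTime.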
A single-node run of TPCH(15) also shows improvements in some
short-running queries:
+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(15) | parquet / none / none | 6.78    | -0.81%     | 4.52       | -3.66%         |
+----------+-----------------------+---------+------------+------------+----------------+
+----------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+
| Workload | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |
+----------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+
| TPCH(15) | TPCH-Q18 | parquet / none / none | 10.49  | 9.89        | +6.13%     | 1.09%      | 1.71%          | 1           | 10    |
| TPCH(15) | TPCH-Q7  | parquet / none / none | 11.57  | 11.23       | +3.02%     | 2.32%      | 1.23%          | 1           | 10    |
| TPCH(15) | TPCH-Q12 | parquet / none / none | 3.25   | 3.20        | +1.62%     | 2.84%      | 2.36%          | 1           | 10    |
| TPCH(15) | TPCH-Q9  | parquet / none / none | 9.81   | 9.70        | +1.09%     | 1.17%      | 1.01%          | 1           | 10    |
| TPCH(15) | TPCH-Q17 | parquet / none / none | 11.17  | 11.15       | +0.20%     | 9.31%      | 9.45%          | 1           | 10    |
| TPCH(15) | TPCH-Q21 | parquet / none / none | 17.02  | 17.04       | -0.08%     | 1.77%      | 2.13%          | 1           | 10    |
| TPCH(15) | TPCH-Q15 | parquet / none / none | 3.73   | 3.73        | -0.16%     | 3.43%      | 2.54%          | 1           | 10    |
| TPCH(15) | TPCH-Q19 | parquet / none / none | 34.33  | 34.39       | -0.18%     | 2.23%      | 2.60%          | 1           | 10    |
| TPCH(15) | TPCH-Q13 | parquet / none / none | 7.26   | 7.31        | -0.67%     | 1.53%      | 2.65%          | 1           | 10    |
| TPCH(15) | TPCH-Q1  | parquet / none / none | 8.62   | 8.71        | -1.05%     | 1.12%      | 0.47%          | 1           | 10    |
| TPCH(15) | TPCH-Q4  | parquet / none / none | 2.52   | 2.55        | -1.30%     | 2.96%      | 1.63%          | 1           | 10    |
| TPCH(15) | TPCH-Q14 | parquet / none / none | 2.75   | 2.80        | -1.87%     | 3.23%      | 3.04%          | 1           | 10    |
| TPCH(15) | TPCH-Q8  | parquet / none / none | 4.68   | 4.84        | -3.38%     | 2.38%      | 2.52%          | 1           | 10    |
| TPCH(15) | TPCH-Q5  | parquet / none / none | 3.40   | 3.52        | -3.55%     | 1.36%      | 1.13%          | 1           | 10    |
| TPCH(15) | TPCH-Q3  | parquet / none / none | 3.29   | 3.45        | -4.52%     | 1.37%      | 1.84%          | 1           | 10    |
| TPCH(15) | TPCH-Q16 | parquet / none / none | 1.61   | 1.72        | -5.88%     | 2.18%      | 3.73%          | 1           | 10    |
| TPCH(15) | TPCH-Q10 | parquet / none / none | 4.47   | 4.76        | -6.14%     | 3.96%      | 1.98%          | 1           | 10    |
| TPCH(15) | TPCH-Q22 | parquet / none / none | 1.95   | 2.10        | -6.93%     | 2.41%      | 1.57%          | 1           | 10    |
| TPCH(15) | TPCH-Q20 | parquet / none / none | 2.82   | 3.08        | -8.39%     | 2.71%      | 2.21%          | 1           | 10    |
| TPCH(15) | TPCH-Q11 | parquet / none / none | 1.15   | 1.26        | -8.81%     | 2.58%      | 5.01%          | 1           | 10    |
| TPCH(15) | TPCH-Q6  | parquet / none / none | 1.74   | 1.93        | -9.67%     | 1.66%      | 2.53%          | 1           | 10    |
| TPCH(15) | TPCH-Q2  | parquet / none / none | 1.53   | 2.03        | I -24.56%  | * 14.70% * | 2.57%          | 1           | 10    |
+----------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+
Change-Id: I6ed7862fc5e86005ecea83fa2ceb489e737d66b2
---
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/like-predicate-ir.cc
M be/src/exprs/like-predicate.h
M be/src/exprs/scalar-fn-call.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.inline.h
M be/src/util/symbols-util-test.cc
12 files changed, 194 insertions(+), 90 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/20/3220/1
--
To view, visit http://gerrit.cloudera.org:8080/3220
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6ed7862fc5e86005ecea83fa2ceb489e737d66b2
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Ho <[email protected]>