[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-07 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


Patch Set 10: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-07 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


IMPALA-4302,IMPALA-2379: constant expr arg fixes

This patch fixes two issues around handling of constant expr args.
The patches are combined because they touch some of the same code
and depend on some of the same memory management cleanup.

First, it fixes IMPALA-2379, where constant expr args were not visible
to UDAFs. The issue is that the input exprs need to be opened before
calling the UDAF Init() function.

Second, it avoids overhead from repeated evaluation of constant
arguments for ScalarFnCall expressions on both the codegen'd and
interpreted paths. A common example is an IN predicate with a
long list of constant values.

The interpreted path was inefficient because it always evaluated all
children expressions. Instead in this patch constant args are
evaluated once and cached. The memory management of the AnyVal*
objects was somewhat nebulous - adjusted it so that they're allocated
from ExprContext::mem_pool_, which has the correct lifetime.

The codegen'd path was inefficient only with varargs - with fixed
arguments the LLVM optimiser is able to infer after inlining that
the expressions are constant and remove all evaluation. However,
for varargs it stores the vararg values into a heap-allocated buffer.
The LLVM optimiser is unable to remove these stores because they
have a side-effect that is visible to code outside the function.

The codegen'd path is improved by evaluating varargs into an automatic
buffer that can be optimised out. We also make a small related change
to bake the string constants into the codegen'd code.

Testing:
Ran exhaustive build.

Added regression test for IMPALA-2379 and MemPool test for aligned
allocation. Added a test for in predicates with constant strings.

Perf:
Added a targeted query that demonstrates the improvement. Also manually
validated the non-codegend perf. Also ran TPC-H and targeted perf
queries locally - didn't see any significant changes.

++---+---++-++---++-+---+
| Workload   | Query | File Format   | 
Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | 
Iters |
++---+---++-++---++-+---+
| TARGETED-PERF(_20) | primitive_filter_in_predicate | parquet / none / none | 
1.19   | 9.82| I -87.85%  |   3.82%   |   0.71%| 1   | 
10|
++---+---++-++---++-+---+

(I) Improvement: TARGETED-PERF(_20) primitive_filter_in_predicate [parquet / 
none / none] (9.82s -> 1.19s [-87.85%])
+--++--+--++---+--+--++++---+
| Operator | % of Query | Avg  | Base Avg | Delta(Avg) | StdDev(%) | 
Max  | Base Max | Delta(Max) | #Hosts | #Rows  | Est #Rows |
+--++--+--++---+--+--++++---+
| 01:AGGREGATE | 14.39% | 155.88ms | 214.61ms | -27.37%|   2.68%   | 
163.38ms | 227.53ms | -28.19%| 1  | 1  | 1 |
| 00:SCAN HDFS | 85.60% | 927.46ms | 9.43s| -90.16%|   4.49%   | 
1.01s| 9.50s| -89.42%| 1  | 13.77K | 14.05K|
+--++--+--++---+--+--++++---+

Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Reviewed-on: http://gerrit.cloudera.org:8080/4838
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
M be/src/benchmarks/in-predicate-benchmark.cc
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exprs/agg-fn-evaluator.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/anyval-util.cc
M be/src/exprs/anyval-util.h
M be/src/exprs/case-expr.cc
M be/src/exprs/expr-context.h
M be/src/exprs/expr.cc
M be/src/exprs/expr.h
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/in-predicate.h
M be/src/exprs/literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/exprs/utility-functions-ir.cc
M be/src/runtime/free-pool-test.cc
M be/src/runtime/free-pool.h
M be/src/runtime/mem-pool-test.cc
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/testutil/test-udas.cc
M be/src/udf/udf-internal.h
M be/src/udf/udf-test-harness.cc
M be/src/udf/udf.cc
M be/sr

[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


Patch Set 10:

Hit IMPALA-3040

-- 
To view, visit http://gerrit.cloudera.org:8080/4838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-07 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


Patch Set 10: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/435/

-- 
To view, visit http://gerrit.cloudera.org:8080/4838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


Patch Set 10: Code-Review+2

(1 comment)

Carry +2

http://gerrit.cloudera.org:8080/#/c/4838/8/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

Line 1030:   // significant bits overlap in memory.
> I guess these should really have the LITTLE_ENDIAN checks too, right?
Reworded the comment to be less little-endian specific.

I think if we were to port this to big-endian, we'd add padding to DecimalVal 
so that the least significant bits lined up (which means the field offsets 
would be different). I don't think this code would have to change.


-- 
To view, visit http://gerrit.cloudera.org:8080/4838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-07 Thread Tim Armstrong (Code Review)
Hello Internal Jenkins, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4838

to look at the new patch set (#9).

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..

IMPALA-4302,IMPALA-2379: constant expr arg fixes

This patch fixes two issues around handling of constant expr args.
The patches are combined because they touch some of the same code
and depend on some of the same memory management cleanup.

First, it fixes IMPALA-2379, where constant expr args were not visible
to UDAFs. The issue is that the input exprs need to be opened before
calling the UDAF Init() function.

Second, it avoids overhead from repeated evaluation of constant
arguments for ScalarFnCall expressions on both the codegen'd and
interpreted paths. A common example is an IN predicate with a
long list of constant values.

The interpreted path was inefficient because it always evaluated all
children expressions. Instead in this patch constant args are
evaluated once and cached. The memory management of the AnyVal*
objects was somewhat nebulous - adjusted it so that they're allocated
from ExprContext::mem_pool_, which has the correct lifetime.

The codegen'd path was inefficient only with varargs - with fixed
arguments the LLVM optimiser is able to infer after inlining that
the expressions are constant and remove all evaluation. However,
for varargs it stores the vararg values into a heap-allocated buffer.
The LLVM optimiser is unable to remove these stores because they
have a side-effect that is visible to code outside the function.

The codegen'd path is improved by evaluating varargs into an automatic
buffer that can be optimised out. We also make a small related change
to bake the string constants into the codegen'd code.

Testing:
Ran exhaustive build.

Added regression test for IMPALA-2379 and MemPool test for aligned
allocation. Added a test for in predicates with constant strings.

Perf:
Added a targeted query that demonstrates the improvement. Also manually
validated the non-codegend perf. Also ran TPC-H and targeted perf
queries locally - didn't see any significant changes.

++---+---++-++---++-+---+
| Workload   | Query | File Format   | 
Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | 
Iters |
++---+---++-++---++-+---+
| TARGETED-PERF(_20) | primitive_filter_in_predicate | parquet / none / none | 
1.19   | 9.82| I -87.85%  |   3.82%   |   0.71%| 1   | 
10|
++---+---++-++---++-+---+

(I) Improvement: TARGETED-PERF(_20) primitive_filter_in_predicate [parquet / 
none / none] (9.82s -> 1.19s [-87.85%])
+--++--+--++---+--+--++++---+
| Operator | % of Query | Avg  | Base Avg | Delta(Avg) | StdDev(%) | 
Max  | Base Max | Delta(Max) | #Hosts | #Rows  | Est #Rows |
+--++--+--++---+--+--++++---+
| 01:AGGREGATE | 14.39% | 155.88ms | 214.61ms | -27.37%|   2.68%   | 
163.38ms | 227.53ms | -28.19%| 1  | 1  | 1 |
| 00:SCAN HDFS | 85.60% | 927.46ms | 9.43s| -90.16%|   4.49%   | 
1.01s| 9.50s| -89.42%| 1  | 13.77K | 14.05K|
+--++--+--++---+--+--++++---+

Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
---
M be/src/benchmarks/in-predicate-benchmark.cc
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exprs/agg-fn-evaluator.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/anyval-util.cc
M be/src/exprs/anyval-util.h
M be/src/exprs/case-expr.cc
M be/src/exprs/expr-context.h
M be/src/exprs/expr.cc
M be/src/exprs/expr.h
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/in-predicate.h
M be/src/exprs/literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/exprs/utility-functions-ir.cc
M be/src/runtime/free-pool-test.cc
M be/src/runtime/free-pool.h
M be/src/runtime/mem-pool-test.cc
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/testutil/test-udas.cc
M be/src/udf/udf-internal.h
M be/src/udf/udf-test-harness.cc
M be/src/udf/udf.cc
M be/src/udf

[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


Patch Set 8: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4838/8/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

Line 1030:   // of the val16 should be initialized to all 0's.
I guess these should really have the LITTLE_ENDIAN checks too, right?


-- 
To view, visit http://gerrit.cloudera.org:8080/4838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


Patch Set 8:

Ran into a bug in the patch where histogram was interpreting a val4 or val8 as 
a val16, and seeing uninitialised bits. This seems to be an expected way to use 
DecimalVal in the UDF interface.

The fix is to memset() the whole value to 0 when allocating it. Documented this 
behaviour a bit better.

-- 
To view, visit http://gerrit.cloudera.org:8080/4838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-07 Thread Tim Armstrong (Code Review)
Hello Internal Jenkins, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4838

to look at the new patch set (#8).

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..

IMPALA-4302,IMPALA-2379: constant expr arg fixes

This patch fixes two issues around handling of constant expr args.
The patches are combined because they touch some of the same code
and depend on some of the same memory management cleanup.

First, it fixes IMPALA-2379, where constant expr args were not visible
to UDAFs. The issue is that the input exprs need to be opened before
calling the UDAF Init() function.

Second, it avoids overhead from repeated evaluation of constant
arguments for ScalarFnCall expressions on both the codegen'd and
interpreted paths. A common example is an IN predicate with a
long list of constant values.

The interpreted path was inefficient because it always evaluated all
children expressions. Instead in this patch constant args are
evaluated once and cached. The memory management of the AnyVal*
objects was somewhat nebulous - adjusted it so that they're allocated
from ExprContext::mem_pool_, which has the correct lifetime.

The codegen'd path was inefficient only with varargs - with fixed
arguments the LLVM optimiser is able to infer after inlining that
the expressions are constant and remove all evaluation. However,
for varargs it stores the vararg values into a heap-allocated buffer.
The LLVM optimiser is unable to remove these stores because they
have a side-effect that is visible to code outside the function.

The codegen'd path is improved by evaluating varargs into an automatic
buffer that can be optimised out. We also make a small related change
to bake the string constants into the codegen'd code.

Testing:
Ran exhaustive build.

Added regression test for IMPALA-2379 and MemPool test for aligned
allocation. Added a test for in predicates with constant strings.

Perf:
Added a targeted query that demonstrates the improvement. Also manually
validated the non-codegend perf. Also ran TPC-H and targeted perf
queries locally - didn't see any significant changes.

++---+---++-++---++-+---+
| Workload   | Query | File Format   | 
Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | 
Iters |
++---+---++-++---++-+---+
| TARGETED-PERF(_20) | primitive_filter_in_predicate | parquet / none / none | 
1.19   | 9.82| I -87.85%  |   3.82%   |   0.71%| 1   | 
10|
++---+---++-++---++-+---+

(I) Improvement: TARGETED-PERF(_20) primitive_filter_in_predicate [parquet / 
none / none] (9.82s -> 1.19s [-87.85%])
+--++--+--++---+--+--++++---+
| Operator | % of Query | Avg  | Base Avg | Delta(Avg) | StdDev(%) | 
Max  | Base Max | Delta(Max) | #Hosts | #Rows  | Est #Rows |
+--++--+--++---+--+--++++---+
| 01:AGGREGATE | 14.39% | 155.88ms | 214.61ms | -27.37%|   2.68%   | 
163.38ms | 227.53ms | -28.19%| 1  | 1  | 1 |
| 00:SCAN HDFS | 85.60% | 927.46ms | 9.43s| -90.16%|   4.49%   | 
1.01s| 9.50s| -89.42%| 1  | 13.77K | 14.05K|
+--++--+--++---+--+--++++---+

Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
---
M be/src/benchmarks/in-predicate-benchmark.cc
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exprs/agg-fn-evaluator.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/anyval-util.cc
M be/src/exprs/anyval-util.h
M be/src/exprs/case-expr.cc
M be/src/exprs/expr-context.h
M be/src/exprs/expr.cc
M be/src/exprs/expr.h
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/in-predicate.h
M be/src/exprs/literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/exprs/utility-functions-ir.cc
M be/src/runtime/free-pool-test.cc
M be/src/runtime/free-pool.h
M be/src/runtime/mem-pool-test.cc
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/testutil/test-udas.cc
M be/src/udf/udf-internal.h
M be/src/udf/udf-test-harness.cc
M be/src/udf/udf.cc
M be/src/udf

[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-07 Thread Tim Armstrong (Code Review)
Hello Internal Jenkins, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4838

to look at the new patch set (#7).

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..

IMPALA-4302,IMPALA-2379: constant expr arg fixes

This patch fixes two issues around handling of constant expr args.
The patches are combined because they touch some of the same code
and depend on some of the same memory management cleanup.

First, it fixes IMPALA-2379, where constant expr args were not visible
to UDAFs. The issue is that the input exprs need to be opened before
calling the UDAF Init() function.

Second, it avoids overhead from repeated evaluation of constant
arguments for ScalarFnCall expressions on both the codegen'd and
interpreted paths. A common example is an IN predicate with a
long list of constant values.

The interpreted path was inefficient because it always evaluated all
children expressions. Instead in this patch constant args are
evaluated once and cached. The memory management of the AnyVal*
objects was somewhat nebulous - adjusted it so that they're allocated
from ExprContext::mem_pool_, which has the correct lifetime.

The codegen'd path was inefficient only with varargs - with fixed
arguments the LLVM optimiser is able to infer after inlining that
the expressions are constant and remove all evaluation. However,
for varargs it stores the vararg values into a heap-allocated buffer.
The LLVM optimiser is unable to remove these stores because they
have a side-effect that is visible to code outside the function.

The codegen'd path is improved by evaluating varargs into an automatic
buffer that can be optimised out. We also make a small related change
to bake the string constants into the codegen'd code.

Testing:
Ran exhaustive build.

Added regression test for IMPALA-2379 and MemPool test for aligned
allocation. Added a test for in predicates with constant strings.

Perf:
Added a targeted query that demonstrates the improvement. Also manually
validated the non-codegend perf. Also ran TPC-H and targeted perf
queries locally - didn't see any significant changes.

++---+---++-++---++-+---+
| Workload   | Query | File Format   | 
Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | 
Iters |
++---+---++-++---++-+---+
| TARGETED-PERF(_20) | primitive_filter_in_predicate | parquet / none / none | 
1.19   | 9.82| I -87.85%  |   3.82%   |   0.71%| 1   | 
10|
++---+---++-++---++-+---+

(I) Improvement: TARGETED-PERF(_20) primitive_filter_in_predicate [parquet / 
none / none] (9.82s -> 1.19s [-87.85%])
+--++--+--++---+--+--++++---+
| Operator | % of Query | Avg  | Base Avg | Delta(Avg) | StdDev(%) | 
Max  | Base Max | Delta(Max) | #Hosts | #Rows  | Est #Rows |
+--++--+--++---+--+--++++---+
| 01:AGGREGATE | 14.39% | 155.88ms | 214.61ms | -27.37%|   2.68%   | 
163.38ms | 227.53ms | -28.19%| 1  | 1  | 1 |
| 00:SCAN HDFS | 85.60% | 927.46ms | 9.43s| -90.16%|   4.49%   | 
1.01s| 9.50s| -89.42%| 1  | 13.77K | 14.05K|
+--++--+--++---+--+--++++---+

Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
---
M be/src/benchmarks/in-predicate-benchmark.cc
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exprs/agg-fn-evaluator.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/anyval-util.cc
M be/src/exprs/anyval-util.h
M be/src/exprs/case-expr.cc
M be/src/exprs/expr-context.h
M be/src/exprs/expr.cc
M be/src/exprs/expr.h
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/in-predicate.h
M be/src/exprs/literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/exprs/utility-functions-ir.cc
M be/src/runtime/free-pool-test.cc
M be/src/runtime/free-pool.h
M be/src/runtime/mem-pool-test.cc
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/testutil/test-udas.cc
M be/src/udf/udf-internal.h
M be/src/udf/udf-test-harness.cc
M be/src/udf/udf.cc
M be/src/udf

[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-05 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


Patch Set 6: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/430/

-- 
To view, visit http://gerrit.cloudera.org:8080/4838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


Patch Set 6: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/4838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-04 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


Patch Set 4:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/4838/4//COMMIT_MSG
Commit Message:

Line 18: arguments for ScalarFnCall expressions on both the codegen'd and
> missing words
Done


PS4, Line 36: temporary
> automatic?  at least I would have understood this a little quicker with tha
Done


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/agg-fn-evaluator.cc
File be/src/exprs/agg-fn-evaluator.cc:

PS4, Line 158: ",
> is it worth differentiating these messages to make debugging easier?
Can't hurt.


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/anyval-util.h
File be/src/exprs/anyval-util.h:

PS4, Line 297: The returned AnyVal is not initialized.
> this sentence is repeated
Done


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/expr-context.h
File be/src/exprs/expr-context.h:

PS4, Line 66: fragment
> do you mean fragment instance?
It's kind-of unclear what the direction here is. There's a FRAGMENT_LOCAL 
concept in the udf interface that this corresponds to, so I think this is 
consistent at the moment (although maybe it should be FRAGMENT_INSTANCE_LOCAL 
everywhere).


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/scalar-fn-call.cc
File be/src/exprs/scalar-fn-call.cc:

PS4, Line 188: fragment
> fragment or fragment instance?
Reworded to avoid making the distinction and be clearer. It's not clear that 
the FRAGMENT_LOCAL terminology is accurate, but I think that's out of scope for 
this fix.


PS4, Line 189: cases
> not sure what this sentence is saying. what cases?
Reworded to be clearer. It's copied when the ExprContext is cloned.


Line 205: // children.
> do we document the calling convention somewhere? I think a brief explanatio
Updated the comment. The UDF calling convention is in udf/udf.h and the 
contents of the buffers are in udf/udf-internal.h


PS4, Line 340: optimise out the buffer
> is this also enabling constant propagation for constant args, or was that a
The constant was propagated but the side-effect of storing to the varargs 
buffer couldn't be optimised out.


PS4, Line 370: Either no varargs or arguments before varargs begin
> this comment seems unnecessary with the new conditional
Done


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/runtime/free-pool.h
File be/src/runtime/free-pool.h:

Line 73: size = std::max(8, size);
> seems fine, but why is it required?
Not required but it's pointless making smaller allocations, since it's aligned 
to 8 bytes by the MemPool anyway. Added a comment


-- 
To view, visit http://gerrit.cloudera.org:8080/4838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#5).

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..

IMPALA-4302,IMPALA-2379: constant expr arg fixes

This patch fixes two issues around handling of constant expr args.
The patches are combined because they touch some of the same code
and depend on some of the same memory management cleanup.

First, it fixes IMPALA-2379, where constant expr args were not visible
to UDAFs. The issue is that the input exprs need to be opened before
calling the UDAF Init() function.

Second, it avoids overhead from repeated evaluation of constant
arguments for ScalarFnCall expressions on both the codegen'd and
interpreted paths. A common example is an IN predicate with a
long list of constant values.

The interpreted path was inefficient because it always evaluated all
children expressions. Instead in this patch constant args are
evaluated once and cached. The memory management of the AnyVal*
objects was somewhat nebulous - adjusted it so that they're allocated
from ExprContext::mem_pool_, which has the correct lifetime.

The codegen'd path was inefficient only with varargs - with fixed
arguments the LLVM optimiser is able to infer after inlining that
the expressions are constant and remove all evaluation. However,
for varargs it stores the vararg values into a heap-allocated buffer.
The LLVM optimiser is unable to remove these stores because they
have a side-effect that is visible to code outside the function.

The codegen'd path is improved by evaluating varargs into an automatic
buffer that can be optimised out. We also make a small related change
to bake the string constants into the codegen'd code.

Testing:
Ran exhaustive build.

Added regression test for IMPALA-2379 and MemPool test for aligned
allocation. Added a test for in predicates with constant strings.

Perf:
Added a targeted query that demonstrates the improvement. Also manually
validated the non-codegend perf. Also ran TPC-H and targeted perf
queries locally - didn't see any significant changes.

++---+---++-++---++-+---+
| Workload   | Query | File Format   | 
Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | 
Iters |
++---+---++-++---++-+---+
| TARGETED-PERF(_20) | primitive_filter_in_predicate | parquet / none / none | 
1.19   | 9.82| I -87.85%  |   3.82%   |   0.71%| 1   | 
10|
++---+---++-++---++-+---+

(I) Improvement: TARGETED-PERF(_20) primitive_filter_in_predicate [parquet / 
none / none] (9.82s -> 1.19s [-87.85%])
+--++--+--++---+--+--++++---+
| Operator | % of Query | Avg  | Base Avg | Delta(Avg) | StdDev(%) | 
Max  | Base Max | Delta(Max) | #Hosts | #Rows  | Est #Rows |
+--++--+--++---+--+--++++---+
| 01:AGGREGATE | 14.39% | 155.88ms | 214.61ms | -27.37%|   2.68%   | 
163.38ms | 227.53ms | -28.19%| 1  | 1  | 1 |
| 00:SCAN HDFS | 85.60% | 927.46ms | 9.43s| -90.16%|   4.49%   | 
1.01s| 9.50s| -89.42%| 1  | 13.77K | 14.05K|
+--++--+--++---+--+--++++---+

Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
---
M be/src/benchmarks/in-predicate-benchmark.cc
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exprs/agg-fn-evaluator.cc
M be/src/exprs/anyval-util.cc
M be/src/exprs/anyval-util.h
M be/src/exprs/case-expr.cc
M be/src/exprs/expr-context.h
M be/src/exprs/expr.cc
M be/src/exprs/expr.h
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/in-predicate.h
M be/src/exprs/literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/runtime/free-pool-test.cc
M be/src/runtime/free-pool.h
M be/src/runtime/mem-pool-test.cc
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/testutil/test-udas.cc
M be/src/udf/udf-internal.h
M be/src/udf/udf-test-harness.cc
M be/src/udf/udf.cc
M be/src/util/bit-util.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/uda.test
A testdata/workloads/targeted-perf/queries/primiti

[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-04 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


Patch Set 4:

(11 comments)

Looks good. I think just comments about comments.

http://gerrit.cloudera.org:8080/#/c/4838/4//COMMIT_MSG
Commit Message:

Line 18: arguments for ScalarFnCall expressions on both the codegen'd and
missing words


PS4, Line 36: temporary
automatic?  at least I would have understood this a little quicker with that 
terminology.


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/agg-fn-evaluator.cc
File be/src/exprs/agg-fn-evaluator.cc:

PS4, Line 158: ",
is it worth differentiating these messages to make debugging easier?


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/anyval-util.h
File be/src/exprs/anyval-util.h:

PS4, Line 297: The returned AnyVal is not initialized.
this sentence is repeated


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/expr-context.h
File be/src/exprs/expr-context.h:

PS4, Line 66: fragment
do you mean fragment instance?


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/scalar-fn-call.cc
File be/src/exprs/scalar-fn-call.cc:

PS4, Line 188: fragment
fragment or fragment instance?


PS4, Line 189: cases
not sure what this sentence is saying. what cases?


Line 205: // children.
do we document the calling convention somewhere? I think a brief explanation 
would help casual readers understand why the interpreted path and codegen path 
set things up the way they do.


PS4, Line 340: optimise out the buffer
is this also enabling constant propagation for constant args, or was that 
already happening?


PS4, Line 370: Either no varargs or arguments before varargs begin
this comment seems unnecessary with the new conditional


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/runtime/free-pool.h
File be/src/runtime/free-pool.h:

Line 73: size = std::max(8, size);
seems fine, but why is it required?


-- 
To view, visit http://gerrit.cloudera.org:8080/4838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-10-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4838/3/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

Line 396:   /// Initialise a constant global string and returns a pointer to it.
Need to doc that it's an i8*


http://gerrit.cloudera.org:8080/#/c/4838/3/be/src/exprs/anyval-util.h
File be/src/exprs/anyval-util.h:

Line 21: #include "gutil/strings/substitute.h"
Move to .cc


http://gerrit.cloudera.org:8080/#/c/4838/3/be/src/exprs/scalar-fn-call.cc
File be/src/exprs/scalar-fn-call.cc:

Line 337:   // Get IR i8* pointer to a varargs buffer. We allocate the buffer 
with Alloca so that
needs updating, not an i8


-- 
To view, visit http://gerrit.cloudera.org:8080/4838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-10-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#4).

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..

IMPALA-4302,IMPALA-2379: constant expr arg fixes

This patch fixes two issues around handling of constant expr args.
The patches are combined because they touch some of the same code
and depend on some of the same memory management cleanup.

First, it fixes IMPALA-2379, where constant expr args were not visible
to UDAFs. The issue is that the input exprs need to be opened before
calling the UDAF Init() function.

Second, it avoids overhead from repeated evaluation of constant
arguments for ScalarFnCall expressions on both the codegen'd and
A common example is an IN predicate with a long list of constant
values.

The interpreted path was inefficient because it always evaluated all
children expressions. This is improved by separating out constant
and non-constant args and only evaluating the non-constant args.
The memory management of the AnyVal* objects was somewhat nebulous -
adjusted it so that they're allocated from ExprContext::mem_pool_,
which has the correct lifetime.

The codegen'd path was inefficient only with varargs - with fixed
arguments the LLVM optimiser is able to infer after inlining that
the expressions are constant and remove all evaluation. However,
for varargs it stores the vararg values into a heap-allocated buffer.
The LLVM optimiser is unable to remove these stores because they
have a side-effect that is visible to code outside the function.

The codegen'd path is improved by evaluating varargs into a temporary
buffer that can be optimised out. We also make a small related change
to bake the string constants into the codegen'd code.

Testing:
Ran exhaustive build.

Added regression test for IMPALA-2379 and MemPool test for aligned
allocation. Added a test for in predicates with constant strings.

Perf:
Added a targeted query that demonstrates the improvement. Also manually
validated the non-codegend perf. Also ran TPC-H and targeted perf
queries locally - didn't see any significant changes.

++---+---++-++---++-+---+
| Workload   | Query | File Format   | 
Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | 
Iters |
++---+---++-++---++-+---+
| TARGETED-PERF(_20) | primitive_filter_in_predicate | parquet / none / none | 
1.19   | 9.82| I -87.85%  |   3.82%   |   0.71%| 1   | 
10|
++---+---++-++---++-+---+

(I) Improvement: TARGETED-PERF(_20) primitive_filter_in_predicate [parquet / 
none / none] (9.82s -> 1.19s [-87.85%])
+--++--+--++---+--+--++++---+
| Operator | % of Query | Avg  | Base Avg | Delta(Avg) | StdDev(%) | 
Max  | Base Max | Delta(Max) | #Hosts | #Rows  | Est #Rows |
+--++--+--++---+--+--++++---+
| 01:AGGREGATE | 14.39% | 155.88ms | 214.61ms | -27.37%|   2.68%   | 
163.38ms | 227.53ms | -28.19%| 1  | 1  | 1 |
| 00:SCAN HDFS | 85.60% | 927.46ms | 9.43s| -90.16%|   4.49%   | 
1.01s| 9.50s| -89.42%| 1  | 13.77K | 14.05K|
+--++--+--++---+--+--++++---+

Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
---
M be/src/benchmarks/in-predicate-benchmark.cc
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exprs/agg-fn-evaluator.cc
M be/src/exprs/anyval-util.cc
M be/src/exprs/anyval-util.h
M be/src/exprs/case-expr.cc
M be/src/exprs/expr-context.h
M be/src/exprs/expr.cc
M be/src/exprs/expr.h
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/in-predicate.h
M be/src/exprs/literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/runtime/free-pool-test.cc
M be/src/runtime/free-pool.h
M be/src/runtime/mem-pool-test.cc
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/testutil/test-udas.cc
M be/src/udf/udf-internal.h
M be/src/udf/udf-test-harness.cc
M be/src/udf/udf.cc
M be/src/util/bit-util.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/uda.test
A testdata/workloads/targete

[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-10-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#3).

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..

IMPALA-4302,IMPALA-2379: constant expr arg fixes

This patch fixes two issues around handling of constant expr args.
The patches are combined because they touch some of the same code
and depend on some of the same memory management cleanup.

First, it fixes IMPALA-2379, where constant expr args were not visible
to UDAFs. The issue is that the input exprs need to be opened before
calling the UDAF Init() function.

Second, it avoids overhead from repeated evaluation of constant
arguments for ScalarFnCall expressions on both the codegen'd and
A common example is an IN predicate with a long list of constant
values.

The interpreted path was inefficient because it always evaluated all
children expressions. This is improved by separating out constant
and non-constant args and only evaluating the non-constant args.
The memory management of the AnyVal* objects was somewhat nebulous -
adjusted it so that they're allocated from ExprContext::mem_pool_,
which has the correct lifetime.

The codegen'd path was inefficient only with varargs - with fixed
arguments the LLVM optimiser is able to infer after inlining that
the expressions are constant and remove all evaluation. However,
for varargs it stores the vararg values into a heap-allocated buffer.
The LLVM optimiser is unable to remove these stores because they
have a side-effect that is visible to code outside the function.

The codegen'd path is improved by evaluating varargs into a temporary
buffer that can be optimised out. We also make a small related change
to bake the string constants into the codegen'd code.

Testing:
Ran exhaustive build.

Added regression test for IMPALA-2379 and MemPool test for aligned
allocation. Added a test for in predicates with constant strings.

Perf:
Added a targeted query that demonstrates the improvement. Also manually
validated the non-codegend perf. Also ran TPC-H and targeted perf
queries locally - didn't see any significant changes.

++---+---++-++---++-+---+
| Workload   | Query | File Format   | 
Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | 
Iters |
++---+---++-++---++-+---+
| TARGETED-PERF(_20) | primitive_filter_in_predicate | parquet / none / none | 
1.19   | 9.82| I -87.85%  |   3.82%   |   0.71%| 1   | 
10|
++---+---++-++---++-+---+

(I) Improvement: TARGETED-PERF(_20) primitive_filter_in_predicate [parquet / 
none / none] (9.82s -> 1.19s [-87.85%])
+--++--+--++---+--+--++++---+
| Operator | % of Query | Avg  | Base Avg | Delta(Avg) | StdDev(%) | 
Max  | Base Max | Delta(Max) | #Hosts | #Rows  | Est #Rows |
+--++--+--++---+--+--++++---+
| 01:AGGREGATE | 14.39% | 155.88ms | 214.61ms | -27.37%|   2.68%   | 
163.38ms | 227.53ms | -28.19%| 1  | 1  | 1 |
| 00:SCAN HDFS | 85.60% | 927.46ms | 9.43s| -90.16%|   4.49%   | 
1.01s| 9.50s| -89.42%| 1  | 13.77K | 14.05K|
+--++--+--++---+--+--++++---+

Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
---
M be/src/benchmarks/in-predicate-benchmark.cc
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exprs/agg-fn-evaluator.cc
M be/src/exprs/anyval-util.cc
M be/src/exprs/anyval-util.h
M be/src/exprs/case-expr.cc
M be/src/exprs/expr-context.h
M be/src/exprs/expr.cc
M be/src/exprs/expr.h
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/in-predicate.h
M be/src/exprs/literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/runtime/free-pool-test.cc
M be/src/runtime/free-pool.h
M be/src/runtime/mem-pool-test.cc
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/testutil/test-udas.cc
M be/src/udf/udf-internal.h
M be/src/udf/udf-test-harness.cc
M be/src/udf/udf.cc
M be/src/util/bit-util.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/uda.test
A testdata/workloads/targete

[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-10-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..

IMPALA-4302,IMPALA-2379: constant expr arg fixes

This patch fixes two issues around handling of constant expr args.
The patches are combined because they touch some of the same code
and depend on some of the same memory management cleanup.

First, it fixes IMPALA-2379, where constant expr args were not visible
to UDAFs. The issue is that the input exprs need to be opened before
calling the UDAF Init() function.

Second, it avoids overhead from repeated evaluation of constant
arguments for ScalarFnCall expressions on both the codegen'd and
A common example is an IN predicate with a long list of constant
values.

The interpreted path was inefficient because it always evaluated all
children expressions. This is improved by separating out constant
and non-constant args and only evaluating the non-constant args.
The memory management of the AnyVal* objects was somewhat nebulous -
adjusted it so that they're allocated from ExprContext::mem_pool_,
which has the correct lifetime.

The codegen'd path was inefficient only with varargs - with fixed
arguments the LLVM optimiser is able to infer after inlining that
the expressions are constant and remove all evaluation. However,
for varargs it stores the vararg values into a heap-allocated buffer.
The LLVM optimiser is unable to remove these stores because they
have a side-effect that is visible to code outside the function.

The codegen'd path is improved by evaluating varargs into a temporary
buffer that can be optimised out. We also make a small related change
to bake the string constants into the codegen'd code.

Testing:
Ran exhaustive build.

Added regression test for IMPALA-2379 and MemPool test for aligned
allocation. Added a test for in predicates with constant strings.

Perf:
Added a targeted query that demonstrates the improvement. Also manually
validated the non-codegend perf. Also ran TPC-H and targeted perf
queries locally - didn't see any significant changes.

++---+---++-++---++-+---+
| Workload   | Query | File Format   | 
Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | 
Iters |
++---+---++-++---++-+---+
| TARGETED-PERF(_20) | primitive_filter_in_predicate | parquet / none / none | 
1.19   | 9.82| I -87.85%  |   3.82%   |   0.71%| 1   | 
10|
++---+---++-++---++-+---+

(I) Improvement: TARGETED-PERF(_20) primitive_filter_in_predicate [parquet / 
none / none] (9.82s -> 1.19s [-87.85%])
+--++--+--++---+--+--++++---+
| Operator | % of Query | Avg  | Base Avg | Delta(Avg) | StdDev(%) | 
Max  | Base Max | Delta(Max) | #Hosts | #Rows  | Est #Rows |
+--++--+--++---+--+--++++---+
| 01:AGGREGATE | 14.39% | 155.88ms | 214.61ms | -27.37%|   2.68%   | 
163.38ms | 227.53ms | -28.19%| 1  | 1  | 1 |
| 00:SCAN HDFS | 85.60% | 927.46ms | 9.43s| -90.16%|   4.49%   | 
1.01s| 9.50s| -89.42%| 1  | 13.77K | 14.05K|
+--++--+--++---+--+--++++---+

Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
---
M be/src/benchmarks/in-predicate-benchmark.cc
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exprs/agg-fn-evaluator.cc
M be/src/exprs/anyval-util.cc
M be/src/exprs/anyval-util.h
M be/src/exprs/case-expr.cc
M be/src/exprs/expr-context.h
M be/src/exprs/expr.cc
M be/src/exprs/expr.h
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/in-predicate.h
M be/src/exprs/literal.cc
M be/src/exprs/scalar-fn-call.h
M be/src/runtime/free-pool-test.cc
M be/src/runtime/free-pool.h
M be/src/runtime/mem-pool-test.cc
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/testutil/test-udas.cc
M be/src/udf/udf-internal.h
M be/src/udf/udf-test-harness.cc
M be/src/udf/udf.cc
M be/src/util/bit-util.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/uda.test
A testdata/workloads/targeted-perf/queries/primitive_filter_i