Tim Armstrong has uploaded a new patch set (#7).

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
......................................................................

IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

This change enables codegen for all builtin aggregate functions,
e.g. timestamp functions and group_concat.

There are several parts to the change:
* Adding support for generic UDAs. Previous the codegen code did not
  handle multiple input arguments or NULL return values.
* Defaulting to using the UDA interface when there is not a special
  codegen path (we have implementations of all builtin aggregate
  functions for the interpreted path).
* Remove all the logic to disable codegen for the special cases that now
  are supported.

Also fix the generation of code to get/set NULL bits since I needed
to add functionality there anyway.

Testing:
Add tests that check that codegen was enabled for builtin aggregate
functions. Also fix some gaps in the preexisting tests.

Also add tests for UDAs that check input/output nulls are handled
correctly, in anticipation of enabling codegen for arbitrary UDAs.

Perf:
Ran local TPC-H and targeted perf. Spent a lot of time on TPC-H Q1,
since my original approach regressed it ~5%. In the end the problem was
to do with the ordering of loads/stores to the slot and null bit in the
generated code: the previous version of the code exploited some
properties of the particular aggregate function. I ended up replicating
this behaviour to avoid regressing perf.

Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/aggregation-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-table.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/old-hash-table.cc
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/text-converter.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/case-expr.cc
M be/src/exprs/compound-predicates.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/expr.cc
M be/src/exprs/literal.cc
M be/src/exprs/null-literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/slot-ref.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/tuple.cc
M be/src/runtime/types.h
M be/src/testutil/test-udas.cc
M be/src/testutil/test-udfs.cc
M be/src/util/tuple-row-compare.cc
M testdata/workloads/functional-query/queries/QueryTest/uda.test
M tests/common/test_result_verifier.py
M tests/query_test/test_aggregation.py
M tests/query_test/test_udfs.py
39 files changed, 845 insertions(+), 561 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/4655/7
-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>

Reply via email to