Tim Armstrong has posted comments on this change. Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions ......................................................................
Patch Set 2: (14 comments) http://gerrit.cloudera.org:8080/#/c/4655/2/be/src/exec/partitioned-aggregation-node.cc File be/src/exec/partitioned-aggregation-node.cc: Line 1064: // To minimize branching on the UpdateTuple path, initialize the result value > The details of how this saves branches was somewhat nebulous to me. Seems w Added an example that hopefully makes it clearer. Also fix it to refer to the Add() operation, not Update() Line 1597: Value* expr_ctx_ptr = builder.CreateGEP( > inbounds? Yep, good idea. I just fixed it everywhere. Line 1658: // properties. First, for these builtins, null input values can be skipped. Second, > NULL Done Line 1747: // Create pointers to input args to pass to uda_fn. We must use the unlowered type. Let me know if I can help explain any of this. The stuff with lowered and unlowered values is annoyingly convoluted but stems from some limitations of CodegenAnyVal. I had a draft patch to clean it up but it's a pretty big change and doesn't have much immediate impact. http://gerrit.cloudera.org:8080/#/c/4655/2/be/src/exec/partitioned-aggregation-node.h File be/src/exec/partitioned-aggregation-node.h: Line 654: void CodegenBranchIfNull(LlvmCodeGen* codegen, LlvmBuilder* builder, > Does it make sense to put this into LlvmCodegen? Seems generally useful. Done. Moved it to CodegenAnyVal. http://gerrit.cloudera.org:8080/#/c/4655/2/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: Line 630: static inline StringVal ALWAYS_INLINE DefaultStringConcatDelim() { > typo: not Done http://gerrit.cloudera.org:8080/#/c/4655/2/be/src/runtime/descriptors.cc File be/src/runtime/descriptors.cc: Line 580: builder->CreateGEP(tuple_bytes, null_byte_offset, "null_byte_ptr"); > inbounds? (and elsewhere) Done Line 619: result = builder->CreateOr(null_byte, mask, "null_bit_cleared"); > null_bit_set Done Line 624: } else { > Why would we not require is_null to be a ConstantInt? Do we exercise the co Yes, I added it for use in CodegenUpdateSlot(): Value* result_is_null = result_val.GetIsNull("result_is_null"); slot_desc->CodegenSetNullIndicator( codegen, &builder, agg_tuple_arg, result_is_null); http://gerrit.cloudera.org:8080/#/c/4655/2/be/src/runtime/descriptors.h File be/src/runtime/descriptors.h: Line 139: /// represented as a 1-bit integer indicating whether this slot is null in 'tuple'. > 1-byte integer? Nope, it's a 1-bit integer (LLVM has an internal logical type that can get mapped in various ways onto CPU flag registers or wider integers). Called it llvm i1 so it's more obvious that it's that. Line 144: /// NULL bit in the given 'tuple' to the 'value' in null. 'tuple' is a pointer > typo: value 'is_null' Done Line 145: /// to the tuple, and 'null' is an boolean value represented as a 1-bit integer. > some typos in this sentence, should be 'is_null' Done http://gerrit.cloudera.org:8080/#/c/4655/2/tests/query_test/test_aggregation.py File tests/query_test/test_aggregation.py: Line 31: agg_functions = ['sum', 'count', 'min', 'max', 'avg'] > add ndv if easy enough? alternatively file a JIRA for adding more coverage Done Line 33: data_types = ['int', 'bool', 'double', 'bigint', 'tinyint', > maybe you can squeeze string into this patch if it's easy enough Done. I think we have decent decimal coverage from decimal.test at least. -- To view, visit http://gerrit.cloudera.org:8080/4655 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-HasComments: Yes