Alex Behm has posted comments on this change.

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

Patch Set 2:


Need to spend more time with the meat of this patch.
File be/src/exec/

Line 1064:     // To minimize branching on the UpdateTuple path, initialize the 
result value
The details of how this saves branches was somewhat nebulous to me. Seems worth 
mentioning that we do this trick here because later in codegen we want to 
ignore the dst slot's null indicator to get rid of a branch, although the dst 
slot is nullable.

When reading this I was confused about where these UpdateTuple() and Update() 
functions come from. Do you mean the UDA Update() function?

Line 1597:     Value* expr_ctx_ptr = builder.CreateGEP(

Line 1658:     // properties. First, for these builtins, null input values can 
be skipped. Second,
File be/src/exec/partitioned-aggregation-node.h:

Line 654:   void CodegenBranchIfNull(LlvmCodeGen* codegen, LlvmBuilder* builder,
Does it make sense to put this into LlvmCodegen? Seems generally useful.
File be/src/exprs/

Line 630: static inline StringVal ALWAYS_INLINE DefaultStringConcatDelim() {
typo: not
File be/src/runtime/

Line 580:       builder->CreateGEP(tuple_bytes, null_byte_offset, 
inbounds? (and elsewhere)

Line 619:       result = builder->CreateOr(null_byte, mask, "null_bit_cleared");

Line 624:   } else {
Why would we not require is_null to be a ConstantInt? Do we exercise the code 
in the else block anywhere?
File be/src/runtime/descriptors.h:

Line 139:   /// represented as a 1-bit integer indicating whether this slot is 
null in 'tuple'.
1-byte integer?

Line 144:   /// NULL bit in the given 'tuple' to the 'value' in null. 'tuple' 
is a pointer
typo: value 'is_null'

Line 145:   /// to the tuple, and 'null' is an boolean value represented as a 
1-bit integer.
some typos in this sentence, should be 'is_null'

1-byte integer?
File tests/query_test/

Line 31: agg_functions = ['sum', 'count', 'min', 'max', 'avg']
add ndv if easy enough? alternatively file a JIRA for adding more coverage (and 
the extended coverage in my comments below)

Line 33: data_types = ['int', 'bool', 'double', 'bigint', 'tinyint',
maybe you can squeeze string into this patch if it's easy enough

we should leave decimal for later

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-HasComments: Yes

Reply via email to