Michael Ho has posted comments on this change. Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions ......................................................................
Patch Set 4: (13 comments) http://gerrit.cloudera.org:8080/#/c/4655/4/be/src/codegen/codegen-anyval.h File be/src/codegen/codegen-anyval.h: PS4, Line 242: if (val.is_null) goto null_block; Can you please elaborate this comment with code snippet like the following ? if (val.is_null) goto null_block; non_null_block: //default entry point after calling this function null_block: .... http://gerrit.cloudera.org:8080/#/c/4655/4/be/src/exec/partitioned-aggregation-node.cc File be/src/exec/partitioned-aggregation-node.cc: PS4, Line 1065: can ignore the NULL bit of its destination value is : // ignored ? PS4, Line 1067: set did you mean with NULL bit cleared ? Line 1599: RETURN_IF_ERROR(agg_expr->GetCodegendComputeFn(state_, &agg_expr_fn)); DCHECK(agg_expr_fn != NULL); PS4, Line 1620: src.CodegenBranchIfNull(&builder, ret_block); It seems that we emit this check for all paths except for the UDA path. Why is it safe to make the assumption that 'src' is not NULL for the UDA path ? Also, it appears that we may consider doing some refactoring by computing 'bool use_uda_interface' up front and only emit this cmp-and-branch if that's false. Please see comments below about 'use_uda_interface'. PS4, Line 1628: dst_type.type != TYPE_TIMESTAMP : && !dst_type.IsStringType() This seems to be reused in multiple places. May be it's worth factoring this expression out. PS4, Line 1637: dst_type.type != TYPE_TIMESTAMP : && !dst_type.IsStringType() It may be easier to read if we do the check once up front: agg_op = evaluator->agg_op(); bool use_uda_interface = agg_op != COUNT && (dst_type.type == TYPE_TIMESTAMP || dst_type.IsStringType || (dst_type.type == TYPE_DECIMAL && agg_op() == AggFnEvaluator::SUM); if (use_uda_interface) { .... } else { switch (agg_op) { } } PS4, Line 1676: src.CodegenBranchIfNull(&builder, ret_block); Actually, referring to the comment above again, it seems that we should just do the check before the if-stmt sequence and emit the cmp-and-branch instructions up front for these cases. IMHO, I find the new code harder to follow because the basic blocks in the emitted code are not as explicitly laid out as the old code. PS4, Line 1685: resulting in with results stored in PS4, Line 1725: codegen->GetFunction(symbol); Should we consider adding "bool clone" as the second argument similar to the other variant of GetFunction() ? PS4, Line 1745: Value* input_lowered_ptr = : codegen->CreateEntryBlockAlloca(builder->GetInsertBlock()->getParent(), : LlvmCodeGen::NamedVariable("input_lowered_ptr", : input_vals[i].value()->getType())); : builder->CreateStore(input_vals[i].value(), input_lowered_ptr); : Type* input_unlowered_ptr_type = CodegenAnyVal::GetUnloweredPtrType( : codegen, evaluator->input_expr_ctxs()[i]->root()->type()); : Value* input_unlowered_ptr = builder->CreateBitCast( : input_lowered_ptr, input_unlowered_ptr_type, "input_unlowered_ptr"); Do you think it's worth factoring this logic out to a function given it's repeated again from line 1757-1764 ? http://gerrit.cloudera.org:8080/#/c/4655/4/be/src/exprs/compound-predicates.cc File be/src/exprs/compound-predicates.cc: PS4, Line 68: Please consider removing these extra blank spaces too. http://gerrit.cloudera.org:8080/#/c/4655/4/be/src/runtime/descriptors.cc File be/src/runtime/descriptors.cc: PS4, Line 603: Value* tuple_bytes = builder->CreateBitCast(tuple, codegen->ptr_type()); : Constant* null_byte_offset = : ConstantInt::get(codegen->int_type(), null_indicator_offset_.byte_offset); : Value* null_byte_ptr = : builder->CreateInBoundsGEP(tuple_bytes, null_byte_offset, "null_byte_ptr"); What do you think about factoring this code snippet as a utility function for generating code to return null_byte_ptr ? This seems to be useful for CodegenIsNull() too. -- To view, visit http://gerrit.cloudera.org:8080/4655 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
