[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Dan Hecht has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 13: (5 comments) http://gerrit.cloudera.org:8080/#/c/5161/13/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: Line 215: AnyValUtil::ColumnTypeToTypeDesc(input_expr_ctxs_[i]->root()->type())); this seems wrong for merge/finalize. in those cases, this will be the intermediate type, I believe, but we already have that available from GetIntermediateType(). It seems to me GetArgTypes() should always be the aggregate function's logical inputs. PS13, Line 535: ScalarFnCall seems like a strange place given this isn't a scalar function being loaded... PS13, Line 538: !(*uda_fn)->isDeclaration() what does this condition mean? that it's defined? if it's not defined, what does that mean -- that it's defined only in native code? PS13, Line 541: Constant inlining must therefore expose it in the same way. what is this saying? i can see why arg_types for InlineConstants() should only have the true arguments, but not sure what we're saying here. PS13, Line 547: arg_types as we discussed in person, it doesn't seem right to use these types for FunctionContext::GetArgType(), because this will be the intermediate type not the input types. i.e. GetArgTypes() currently returns different types depending on if executing in the update or the merge/finalize functions. This does match what we do when constructing the FunctionContext (but I'm proposing we change the Prepare() code). -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/5161/13/be/src/exprs/scalar-fn-call.h File be/src/exprs/scalar-fn-call.h: Line 53: /// Loads a native or IR UDF or UDA function 'fn' with symbol 'symbol' from the > the term 'scalar function' means 'not an aggregate function'. i don't under I'm fine with moving it to a different class, I mainly wanted to leave as-is initially so that reviewer could review the diff of the function implementation. -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/5161/13/be/src/exprs/scalar-fn-call.h File be/src/exprs/scalar-fn-call.h: Line 53: /// Loads a native or IR UDF or UDA function 'fn' with symbol 'symbol' from the the term 'scalar function' means 'not an aggregate function'. i don't understand how adding this functionality fits into the class abstractions. let's talk tomorrow. -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 13: Code-Review+1 carry +1 -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 12: (2 comments) http://gerrit.cloudera.org:8080/#/c/5161/12/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: PS12, Line 540: return/intermediate > intermediate value's Done http://gerrit.cloudera.org:8080/#/c/5161/12/be/src/exprs/scalar-fn-call.cc File be/src/exprs/scalar-fn-call.cc: PS12, Line 340: // Prepare() : // and Close() functions from 'codegen'. > one line. Done -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Hello Michael Ho, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5161 to look at the new patch set (#13). Change subject: IMPALA-1430: enable codegen for native UDAs .. IMPALA-1430: enable codegen for native UDAs This uses the existing infrastructure for codegening builtin UDAs and for codegening calls to UDFs. GetUdf() is refactored to support both UDFs and UDAs. IR UDAs are still not allowed by the frontend. It's unclear if we want to enable them going forward because of the difficulties in testing and supporting IR UDFs/UDAs. Testing: test_udfs.py tests UDAs with codegen enabled and disabled Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/partitioned-aggregation-node.cc M be/src/exprs/agg-fn-evaluator.cc M be/src/exprs/agg-fn-evaluator.h M be/src/exprs/anyval-util.cc M be/src/exprs/anyval-util.h M be/src/exprs/expr.cc M be/src/exprs/expr.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/exprs/timestamp-functions.cc M be/src/udf/udf-internal.h M be/src/udf/udf.h 14 files changed, 190 insertions(+), 138 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/5161/13 -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Michael Ho has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 12: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/5161/12/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: PS12, Line 540: return/intermediate intermediate value's http://gerrit.cloudera.org:8080/#/c/5161/12/be/src/exprs/scalar-fn-call.cc File be/src/exprs/scalar-fn-call.cc: PS12, Line 340: // Prepare() : // and Close() functions from 'codegen'. one line. -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/5161/11/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: PS11, Line 546: AnyValUtil::ColumnTypeToTypeDesc(output_slot_desc_->type()) > As discussed offline, I believe there is a confusion on what ExprConstant: I added a couple of comments where the enums are defined. -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has uploaded a new patch set (#12). Change subject: IMPALA-1430: enable codegen for native UDAs .. IMPALA-1430: enable codegen for native UDAs This uses the existing infrastructure for codegening builtin UDAs and for codegening calls to UDFs. GetUdf() is refactored to support both UDFs and UDAs. IR UDAs are still not allowed by the frontend. It's unclear if we want to enable them going forward because of the difficulties in testing and supporting IR UDFs/UDAs. Testing: test_udfs.py tests UDAs with codegen enabled and disabled Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/partitioned-aggregation-node.cc M be/src/exprs/agg-fn-evaluator.cc M be/src/exprs/agg-fn-evaluator.h M be/src/exprs/anyval-util.cc M be/src/exprs/anyval-util.h M be/src/exprs/expr.cc M be/src/exprs/expr.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/exprs/timestamp-functions.cc M be/src/udf/udf-internal.h M be/src/udf/udf.h 14 files changed, 191 insertions(+), 138 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/5161/12 -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Michael Ho has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/5161/11/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: PS11, Line 546: AnyValUtil::ColumnTypeToTypeDesc(output_slot_desc_->type()) > It's right in that it does what is documented in udf.h and is consistent wi As discussed offline, I believe there is a confusion on what ExprConstant::RETURN_TYPE_SIZE really means. One can interpret it as say the return type of UDA Update() function or it could be the return type of a UDA. Please provide detailed explanation to avoid confusion. -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Michael Ho has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 11: (1 comment) Can you please address the question in agg-fn-evaluator.cc ? I can +1 afterwards. http://gerrit.cloudera.org:8080/#/c/5161/11/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: PS11, Line 546: AnyValUtil::ColumnTypeToTypeDesc(output_slot_desc_->type()) Is that actually right / useful for Update() / Merge() functions ? That should work for Finalize() only, right ? Will passing NULL make a difference ? -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has uploaded a new patch set (#11). Change subject: IMPALA-1430: enable codegen for native UDAs .. IMPALA-1430: enable codegen for native UDAs This uses the existing infrastructure for codegening builtin UDAs and for codegening calls to UDFs. GetUdf() is refactored to support both UDFs and UDAs. IR UDAs are still not allowed by the frontend. It's unclear if we want to enable them going forward because of the difficulties in testing and supporting IR UDFs/UDAs. Testing: test_udfs.py tests UDAs with codegen enabled and disabled Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/partitioned-aggregation-node.cc M be/src/exprs/agg-fn-evaluator.cc M be/src/exprs/agg-fn-evaluator.h M be/src/exprs/anyval-util.cc M be/src/exprs/anyval-util.h M be/src/exprs/expr.cc M be/src/exprs/expr.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/exprs/timestamp-functions.cc M be/src/udf/udf-internal.h M be/src/udf/udf.h 14 files changed, 189 insertions(+), 138 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/5161/11 -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 10: (7 comments) http://gerrit.cloudera.org:8080/#/c/5161/10/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: PS10, Line 539: // The last (in/out) argument is exposed to UDAs via the FunctionContext as the : // return type, rather than as the last argument type. Constant inlining must : // therefore expose it in the same way. > May help to add a remark in aggregate-functions-ir.cc about the assumption I realised that FunctionContext is probably the right place to document this (since it's really the contract of GetReturnType() and friends). So I briefly augmented some of the comments there to mention UDA args. When reading those comments I realised that there was a bug here - GetReturnType() is documented to return the return type of Finalize() rather than the intermediate type. I fixed that here and filed IMPALA-4785 to add constant propagation for intermediate_type(). http://gerrit.cloudera.org:8080/#/c/5161/10/be/src/exprs/scalar-fn-call.h File be/src/exprs/scalar-fn-call.h: PS10, Line 53: or HDFS > or external module ? library? PS10, Line 54: UDF > It could also be the update/merge function of a UDA, right ? Not sure if it I reworded just to refer to it as a function. I could also move this to a different class - I didn't want to start off doing that since it would make the diff harder to review. PS10, Line 61: > nit: extra blank space. Done http://gerrit.cloudera.org:8080/#/c/5161/10/be/src/udf/udf-internal.h File be/src/udf/udf-internal.h: PS10, Line 51: UDAF > May be less confusing to say "UDA functions" UDAs seems more consistent. PS10, Line 52: an Expr node : /// generated by the Impala frontend > May be it helps to differentiate the ExprNode in the FE and BE b stating: Done PS10, Line 54: Backend code should not create FunctionContexts for other : /// purposes > FunctionContexts should entirely be allocated and managed by ExprContext. E Done -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Michael Ho has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 10: (7 comments) Looking good. Most comments about comments. http://gerrit.cloudera.org:8080/#/c/5161/10/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: PS10, Line 539: // The last (in/out) argument is exposed to UDAs via the FunctionContext as the : // return type, rather than as the last argument type. Constant inlining must : // therefore expose it in the same way. May help to add a remark in aggregate-functions-ir.cc about the assumption that the type of the last argument (the intermediate value) is treated as return type instead of argument type from the perspective of Expr::InlineConstants(). Otherwise, this can easily lead to confusion when writing built-in UDA. I think all use cases of Expr::InlineConstants() for built-in UDAs now are for substituting the arguments' types. On the other hand, if we don't consider the last argument as the return value but instead as a value which will be updated by the UDA, these logic can be hidden inside ScalarFnCall::GetUdf(). Not sure if we are getting much by using the current convention in the patch. http://gerrit.cloudera.org:8080/#/c/5161/10/be/src/exprs/scalar-fn-call.h File be/src/exprs/scalar-fn-call.h: PS10, Line 53: or HDFS or external module ? PS10, Line 54: UDF It could also be the update/merge function of a UDA, right ? Not sure if it makes sense to call it a UDF in that case. PS10, Line 61: nit: extra blank space. http://gerrit.cloudera.org:8080/#/c/5161/10/be/src/udf/udf-internal.h File be/src/udf/udf-internal.h: PS10, Line 51: UDAF May be less confusing to say "UDA functions" PS10, Line 52: an Expr node : /// generated by the Impala frontend May be it helps to differentiate the ExprNode in the FE and BE b stating: "TExprNode generated by the Impala frontend" PS10, Line 54: Backend code should not create FunctionContexts for other : /// purposes FunctionContexts should entirely be allocated and managed by ExprContext. Exprs shouldn't try to create FunctionContext themselves. -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has uploaded a new patch set (#9). Change subject: IMPALA-1430: enable codegen for native UDAs .. IMPALA-1430: enable codegen for native UDAs This uses the existing infrastructure for codegening builtin UDAs and for codegening calls to UDFs. GetUdf() is refactored to support both UDFs and UDAs. IR UDAs are still not allowed by the frontend. It's unclear if we want to enable them going forward because of the difficulties in testing and supporting IR UDFs/UDAs. Testing: test_udfs.py tests UDAs with codegen enabled and disabled Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/partitioned-aggregation-node.cc M be/src/exprs/agg-fn-evaluator.cc M be/src/exprs/agg-fn-evaluator.h M be/src/exprs/anyval-util.cc M be/src/exprs/anyval-util.h M be/src/exprs/expr.cc M be/src/exprs/expr.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/udf/udf-internal.h 12 files changed, 169 insertions(+), 124 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/5161/9 -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/5161/6/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: PS6, Line 542: Expr::InlineConstants(AnyValUtil::ColumnTypeToTypeDesc(intermediate_type()), : AnyValUtil::ColumnTypesToTypeDescs(arg_types), codegen, *uda_fn); > I agree. There is really no clear guideline on what's "valid" here. Even if I forgot to respond to this. Creating a FunctionContext like that seems like asking for trouble. I added a comment to the FunctionContextImpl header that makes the relationship between FunctionContexts and expressions more explicit. -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has uploaded a new patch set (#8). Change subject: IMPALA-1430: enable codegen for native UDAs .. IMPALA-1430: enable codegen for native UDAs This uses the existing infrastructure for codegening builtin UDAs and for codegening calls to UDFs. GetUdf() is refactored to support both UDFs and UDAs. IR UDAs are still not allowed by the frontend. It's unclear if we want to enable them going forward because of the difficulties in testing and supporting IR UDFs/UDAs. Testing: test_udfs.py tests UDAs with codegen enabled and disabled Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/partitioned-aggregation-node.cc M be/src/exprs/agg-fn-evaluator.cc M be/src/exprs/agg-fn-evaluator.h M be/src/exprs/anyval-util.cc M be/src/exprs/anyval-util.h M be/src/exprs/expr.cc M be/src/exprs/expr.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h 11 files changed, 162 insertions(+), 124 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/5161/8 -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/5161/6/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: PS6, Line 542: Expr::InlineConstants(AnyValUtil::ColumnTypeToTypeDesc(intermediate_type()), : AnyValUtil::ColumnTypesToTypeDescs(arg_types), codegen, *uda_fn); > My comment wasn't advocating for the change of UDA interface but instead qu It's not valid to call the UDF function directly with 'ctx', since it's not a FunctionContext that was set up for that UDF. The valid way to do that would be to have a helper function that was called by both CastToDecimalVal() and some_date() with the correct types. http://gerrit.cloudera.org:8080/#/c/5161/6/be/src/exprs/scalar-fn-call.cc File be/src/exprs/scalar-fn-call.cc: PS6, Line 478: codegen->void_type() : : CodegenAnyVal::GetLoweredType(codegen, *return_type); > Can you please keep it on a single line ? Done -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Michael Ho has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 6: (4 comments) http://gerrit.cloudera.org:8080/#/c/5161/6/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: PS6, Line 542: Expr::InlineConstants(AnyValUtil::ColumnTypeToTypeDesc(intermediate_type()), : AnyValUtil::ColumnTypesToTypeDescs(arg_types), codegen, *uda_fn); > This is preserving the pre-existing behaviour of the UDA interface, which t My comment wasn't advocating for the change of UDA interface but instead questioning the validity of calling Expr::InlineConstants() here. In particular, Expr::InlineConstants() was simply searching for call instructions in the UDA function and replacing those with callee with substring _ZN6impala4Expr14GetConstantInt with some constants. What if there is a new built-in UDA Update function likes the following: void some_update(FunctionContext* ctx, IntVal* arg1, StringVal *dst) { . DecimalVal d_val = DecimalOperator::CastToDecimalVal(ctx, arg1); ... *dst = } In the code above, will Expr::InlineConstants() cause incorrect replacement for Expr::GetConstantInt() called in CastToDecimalVal() ? http://gerrit.cloudera.org:8080/#/c/5161/6/be/src/exprs/scalar-fn-call.cc File be/src/exprs/scalar-fn-call.cc: PS6, Line 328: InlineConstants(AnyValUtil::ColumnTypeToTypeDesc(type_), : AnyValUtil::ColumnTypesToTypeDescs(arg_types), codegen, udf); > That was the initial path I took but it's not possible because UDAs treat t I will hold off from commenting on this until we resolve the question in AggFnEvaluator. PS6, Line 449: DCHECK(has_varargs || arg_types.size() == num_fixed_args); : DCHECK(!has_varargs || arg_types.size() > num_fixed_args); > We actually want to reject the case of passing 0 varargs into a varargs fun Ah ! This is really a subtle corner case. PS6, Line 478: codegen->void_type() : : CodegenAnyVal::GetLoweredType(codegen, *return_type); > clang-format seems to prefer it this way. I don't feel strongly Can you please keep it on a single line ? -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has uploaded a new patch set (#7). Change subject: IMPALA-1430: enable codegen for native UDAs .. IMPALA-1430: enable codegen for native UDAs This uses the existing infrastructure for codegening builtin UDAs and for codegening calls to UDFs. GetUdf() is refactored to support both UDFs and UDAs. IR UDAs are still not allowed by the frontend. It's unclear if we want to enable them going forward because of the difficulties in testing and supporting IR UDFs/UDAs. Testing: test_udfs.py tests UDAs with codegen enabled and disabled Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/partitioned-aggregation-node.cc M be/src/exprs/agg-fn-evaluator.cc M be/src/exprs/agg-fn-evaluator.h M be/src/exprs/anyval-util.cc M be/src/exprs/anyval-util.h M be/src/exprs/expr.cc M be/src/exprs/expr.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h 11 files changed, 163 insertions(+), 124 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/5161/7 -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 6: (7 comments) http://gerrit.cloudera.org:8080/#/c/5161/6/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: PS6, Line 542: Expr::InlineConstants(AnyValUtil::ColumnTypeToTypeDesc(intermediate_type()), : AnyValUtil::ColumnTypesToTypeDescs(arg_types), codegen, *uda_fn); > I am not sure I understand the intention of this. This is preserving the pre-existing behaviour of the UDA interface, which treats the last (in-out) argument as the return type in the function context. Changing how the types are exposed could break UDAs that were written against the old interface. I updated the comment to hopefully make this clearer. I don't feel strongly about whether this is the right API (on one hand, it doesn't match the C++ function signature, on the other the intermediate type is logically the return type) but I don't think it's worth breaking compatibility for. The constants aren't inlined into the input expressions, only to the UDA function itself, so the second scenario isn't possible. http://gerrit.cloudera.org:8080/#/c/5161/6/be/src/exprs/scalar-fn-call.cc File be/src/exprs/scalar-fn-call.cc: PS6, Line 327: !udf->isDeclaration() > May help to add a comment on why this can be a declaration only (i.e. no fu Done PS6, Line 328: InlineConstants(AnyValUtil::ColumnTypeToTypeDesc(type_), : AnyValUtil::ColumnTypesToTypeDescs(arg_types), codegen, udf); > Please see comments in AggFnEvaluator. It seems that we should be able to k That was the initial path I took but it's not possible because UDAs treat the final in/out argument as the return type. I could pass in an additional argument to determine which mode it should use if you prefer. PS6, Line 449: DCHECK(has_varargs || arg_types.size() == num_fixed_args); : DCHECK(!has_varargs || arg_types.size() > num_fixed_args); > DCHECK(arg_types.size() == num_fixed_args || (has_var_args && arg_types.si We actually want to reject the case of passing 0 varargs into a varargs function. The frontend currently rejects those cases and I added a comment to this function that specifically mentions this case. It would be reasonable to pass 0 args into varargs functions, but a bunch of code here doesn't handle that correctly - it assumes that having 0 varargs means that the function signature doesn't have a varargs argument. PS6, Line 478: codegen->void_type() : : CodegenAnyVal::GetLoweredType(codegen, *return_type); > nit: one line ? clang-format seems to prefer it this way. I don't feel strongly http://gerrit.cloudera.org:8080/#/c/5161/6/be/src/exprs/scalar-fn-call.h File be/src/exprs/scalar-fn-call.h: PS6, Line 59: 'cache_entry' > May help to have a comment about what 'cache_entry' is. Something like the Done Line 60: /// updated to point to the library (or its use count is incremented if already loaded). > Please add a comment that the caller is expected to call FinalizeFunction() Done -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has uploaded a new patch set (#3). Change subject: IMPALA-1430: enable codegen for native UDAs .. IMPALA-1430: enable codegen for native UDAs This uses the existing infrastructure for codegening builtin UDAs and for codegening calls to UDFs. GetUdf() is refactored to support both UDFs and UDAs. IR UDAs are still not allowed by the frontend. It's unclear if we want to enable them going forward because of the difficulties in testing and supporting IR UDFs/UDAs. Testing: test_udfs.py tests UDAs with codegen enabled and disabled Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/partitioned-aggregation-node.cc M be/src/exprs/agg-fn-evaluator.cc M be/src/exprs/agg-fn-evaluator.h M be/src/exprs/anyval-util.cc M be/src/exprs/anyval-util.h M be/src/exprs/expr.cc M be/src/exprs/expr.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h 11 files changed, 153 insertions(+), 124 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/5161/3 -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has uploaded a new patch set (#2). Change subject: IMPALA-1430: enable codegen for native UDAs .. IMPALA-1430: enable codegen for native UDAs This uses the existing infrastructure for codegening builtin UDAs and for codegening calls to UDFs. GetUdf() is refactored to support both UDFs and UDAs. IR UDAs are still not allowed by the frontend. It's unclear if we want to enable them going forward because of the difficulties in testing and supporting IR UDFs/UDAs. Testing: test_udfs.py tests UDAs with codegen enabled and disabled Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/partitioned-aggregation-node.cc M be/src/exprs/agg-fn-evaluator.cc M be/src/exprs/agg-fn-evaluator.h M be/src/exprs/anyval-util.cc M be/src/exprs/anyval-util.h M be/src/exprs/expr.cc M be/src/exprs/expr.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h 11 files changed, 154 insertions(+), 122 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/5161/2 -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has uploaded a new change for review. http://gerrit.cloudera.org:8080/5161 Change subject: IMPALA-1430: enable codegen for native UDAs .. IMPALA-1430: enable codegen for native UDAs This uses the existing infrastructure for codegening builtin UDAs and for codegening calls to UDFs. GetUdf() is refactored to support both UDFs and UDAs. IR UDAs are still not allowed by the frontend. It's unclear if we want to enable them going forward because of the difficulties in testing and supporting IR UDFs/UDAs. Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/partitioned-aggregation-node.cc M be/src/exprs/agg-fn-evaluator.cc M be/src/exprs/agg-fn-evaluator.h M be/src/exprs/anyval-util.cc M be/src/exprs/anyval-util.h M be/src/exprs/expr.cc M be/src/exprs/expr.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h 11 files changed, 154 insertions(+), 122 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/5161/1 -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong