Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/12797 )
Change subject: IMPALA-4356,IMPALA-7331: codegen all ScalarExprs ...................................................................... Patch Set 12: (12 comments) Thanks for the explanation. Can you please answer the question in scalar-fn-call.cc. Otherwise, I think this can be +2'ed. http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exec/grouping-aggregator.cc File be/src/exec/grouping-aggregator.cc: http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exec/grouping-aggregator.cc@119 PS12, Line 119: intermediate_row_desc_, /* is_entry_point */ false, state)); nit: indentation http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exprs/scalar-expr.h File be/src/exprs/scalar-expr.h: http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exprs/scalar-expr.h@110 PS12, Line 110: interpreted code. (e.g. an operator which doesn't support codegen yet). http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exprs/scalar-expr.h@117 PS12, Line 117: /// TODO: Fix subclasses which call GetCodegendComputeFnWrapper() to not call interpreted : /// functions. Stale ? http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exprs/scalar-expr.h@170 PS12, Line 170: /// compilation. Returns error status on failure. Can you please also add something similar to below: This function is invoked either by some other codegen functions (e.g. the codegen code of a join) or by RuntimeState::CodegenScalarExprs() which is called from FIS::Open() before LLVM compilation. These two call sites correspond to the two usage patterns in the class comment. http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exprs/scalar-expr.h@246 PS12, Line 246: for each to be overridden by http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exprs/scalar-expr.h@388 PS12, Line 388: If this is true, then after the expressions are codegen'd, : /// then 'codegend_compute_fn_' is non-NULL. If this is true, 'codegend_compute_fn_' will be set to the JIT'd function produced by GetCodegendComputeFn() after LLVM compilation. http://gerrit.cloudera.org:8080/#/c/12797/10/be/src/exprs/scalar-expr.cc File be/src/exprs/scalar-expr.cc: http://gerrit.cloudera.org:8080/#/c/12797/10/be/src/exprs/scalar-expr.cc@284 PS10, Line 284: > This assumption is false - GetCodegendComputeFnWrapper() generates > a codegen'd function that calls the interpreted path of its child. > We need to generate entry points there to avoid a perf cliff where > the whole subtree doesn't get codegen'd (which would be a > regression from the old behaviour too). > That's true. I suppose this mostly affects ScalarFnCall, right ? Before this change, if the expression is not codegen'd, its children are not codegen'd unless the children are ScalarFnCall. In that sense, that'll be a regression. Are there other cases which I may have missed ? > > ScalarExpr doesn't know the context in which it is used. E.g. if > it's the child of an AggregateExpr, then it's not an entry point > but doesn't know that based on local information. There are > actually a bunch more places where the expr is always called from > codegen'd code and we could save some overhead, but I didn't go > through and find them all. I see. I can see an alternate implementation is to revert the polarity of the "is_entry_point" flag to false by default and makes the owners of expressions to explicitly set 'is_entry_point' when creating / initialization the expressions. Those owners most likely include some operators which don't yet support codegen but I haven't looked through the entire code to see if there could be other non-trivial cases. On the other hand, this may make it more likely to regress if we miss some cases as ScalarFnCall currently codegen all the time regardless of the callers. I suppose we kind of need to go through all owners of expressions and mark those which don't need entry points when addressing the TODO in Create() above. http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exprs/scalar-expr.cc File be/src/exprs/scalar-expr.cc: http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exprs/scalar-expr.cc@86 PS12, Line 86: /// nit: // http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exprs/scalar-expr.inline.h File be/src/exprs/scalar-expr.inline.h: http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exprs/scalar-expr.inline.h@24 PS12, Line 24: // nit: /// http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exprs/scalar-expr.inline.h@28 PS12, Line 28: ScalarFnCall ScalarExpr http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exprs/scalar-fn-call.cc File be/src/exprs/scalar-fn-call.cc: http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exprs/scalar-fn-call.cc@95 PS12, Line 95: // CHAR or VARCHAR are not supported as input arguments or return values for UDFs. stale ? http://gerrit.cloudera.org:8080/#/c/12797/12/be/src/exprs/scalar-fn-call.cc@143 PS12, Line 143: // We're in the interpreted path (i.e. no JIT). Populate our FunctionContext's : // staging_input_vals, which will be reused across calls to scalar_fn_ on the : // interpreted code path.. : vector<AnyVal*>* input_vals = fn_ctx->impl()->staging_input_vals(); : for (int i = 0; i < NumFixedArgs(); ++i) { : AnyVal* input_val; : RETURN_IF_ERROR(AllocateAnyVal(state, eval->expr_perm_pool(), children_[i]->type(), : "Could not allocate expression value", &input_val)); : input_vals->push_back(input_val); : } Not sure I understand this part. Does it mean that we will do this initialization for the interpreted code path even if codegen succeeded ? -- To view, visit http://gerrit.cloudera.org:8080/12797 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I839d7a3a2f5e1309c33a1f66013ef11628c5dc11 Gerrit-Change-Number: 12797 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Michael Ho <k...@cloudera.com> Gerrit-Reviewer: Thomas Marshall <tmarsh...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Fri, 10 May 2019 23:07:04 +0000 Gerrit-HasComments: Yes