Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4192: Disentangle Expr and ExprContext ......................................................................
Patch Set 6: (38 comments) http://gerrit.cloudera.org:8080/#/c/5483/6/be/src/exprs/case-expr.cc File be/src/exprs/case-expr.cc: Line 76: fn_ctx->Free(reinterpret_cast<uint8_t*>(case_state)); is this really case-specific? doesn't look like it. (is there a need for an expr-specific 'close'? or do we simply free thread-local state?) http://gerrit.cloudera.org:8080/#/c/5483/6/be/src/exprs/expr-context.cc File be/src/exprs/expr-context.cc: Line 73: RETURN_IF_ERROR(root_->Init(state, row_desc)); init should happen independently of the exprctxs (and certainly not once per exprctx) http://gerrit.cloudera.org:8080/#/c/5483/6/be/src/exprs/expr-context.h File be/src/exprs/expr-context.h: Line 41: /// An ExprContext contains thread private states for the evaluation of an Expr Tree. 'thread private' sounds mysterious/vague. isn't this 'the state necessary for the evaluation of an expr tree' (ie, all of it)? Line 52: /// Allocates all FunctionContexts for this ExprContext. Also initialize the expr tree what does 'expr tree' refer to? the tree of expr objects? (if so, why would that be initialized here?) Line 58: /// Initializes the FunctionContexts in the ExprContext on all nodes in the Expr tree. is there a need to separate prepare() and open()? Line 66: /// originals but have their own MemPool and thread-local state. Clone() should be used this seems convoluted, and they shouldn't all need a separate mempool. unless you think this cloning stuff is a good idea, leave todo in class comment. Line 77: /// call is a no-op. The new ExprContexts are created in state->obj_pool(). do we really need this? it's bizarre. Line 124: void EvaluateWithoutRow(TColumnValue* col_val); awkward name. getconstantvalue? Line 200: /// Pool backing fn_contexts_. Counts against the runtime state's UDF mem tracker. is there any need to have a separate pool for each exprctx? if not, leave todo to remove. also leave todos for other specific clean-up items. http://gerrit.cloudera.org:8080/#/c/5483/6/be/src/exprs/expr.cc File be/src/exprs/expr.cc: Line 112: Expr::~Expr() { > I think we've been trying to move away from destructors releasing resources related note: we should figure out what differentiates prepare from open and whether we need both (and if the counterpart should be close or releaseresources) Line 119: void Expr::CloseContext(RuntimeState* state, ExprContext* context, this shouldn't be here Line 131: return Status("Expression tree only partially reconstructed. Not all thrift " \ why does this make sense? http://gerrit.cloudera.org:8080/#/c/5483/3/be/src/exprs/expr.h File be/src/exprs/expr.h: Line 41: /// The Expr superclass defines a virtual Get*Val() compute function for each possible a lot of this is outdated. http://gerrit.cloudera.org:8080/#/c/5483/6/be/src/exprs/expr.h File be/src/exprs/expr.h: Line 60: /// Only short-circuited operators (e.g. &&, ||) and other special functions like duplicated paragraph Line 79: /// Expr users should use the static Get*Val() wrapper functions to evaluate exprs, this still reflects the confusion between expr and expctx. please make a complete pass and clean this up. rewrite where needed, as opposed to just adding new text. this should read as if it were written from scratch, not pieced together over the years. Line 84: /// --- Relationship with ExprContext and FunctionContext this makes sense to do at the very top: explain what the class's abstraction is, and how it relates to others. Line 87: /// a query is represented as a tree of Expr whose states are static after initialization whose states are static: hard to follow, unless i already know what it means. exprs represent static information, such as types, etc. Line 88: /// by their Prepare() functions. init Line 90: /// ExprContext contains thread private states (e.g. evaluation result, FunctionContext). "thread private states": exprctx encapsulates the execution state, which is why it can't be shared between executing threads. Line 93: /// via the field 'root_'. do you gain anything here by referencing private members vars? Line 100: /// Expressions in exec nodes are expected to be initialized once per fragment and the "exprs in ..." (clearer to reference the class name directly) same for exec nodes, etc. Line 128: class BasicBlock; remove indentation, like below Line 159: virtual BooleanVal GetBooleanVal(ExprContext* context, const TupleRow*); these don't make sense here anymore, execution-related functions should be in exprctx Line 171: inline void AddChild(Expr* expr) { children_.push_back(expr); } why do these need to be constructed manually, as opposed to from a texpr/texprnode? Line 180: inline int fn_context_index() const { return fn_context_index_; } we use ctx elsewhere to abbreviate context Line 181: inline int NumFnContexts() const { Get... also, not entirely clear what this is unless you look at the implementation Line 209: /// them in 'exprs'. Returns an error if any ExprTree causes error, otherwise OK. causes an error Line 213: /// Creates vector of input Exprs of a function call expression 'texpr' Returns error input = parameter? also, this seems oddly specialized, is this really necessary? Line 302: static int GetConstantInt(const FunctionContext& ctx, ExprConstant c, int i = -1); this class shouldn't be dealing with functionctx. make this a non-static member of fnctx? Line 309: static int InlineConstants(const FunctionContext::TypeDesc& return_type, why is this in expr? this looks like a generic llvm utility. Line 338: /// Initializes this expr instance for execution. odd to say 'for execution' when the abstraction of this class is to contain the static/non-execution state Line 343: /// Initializes 'context' for execution. If scope if FRAGMENT_LOCAL, both fragment- and you should mention that this is used to allocate expr-specific state (and that there's no expr-specific logic in exprctx or fnctx). is this really want we want to do or should there be a hierarchy of fnctx subclasses that contain expr-specific state (just like we have a hierarchy of execnodes that contain plannode-specific execution state)? just a thought experiment, and certainly not meant to be addressed in this patch. Line 377: /// ['fn_context_index_start_', 'fn_context_index_end_') is the range of index 'range of indices' Line 383: int fn_context_index_start_; if fn_ctx_idx_ != -1, is it true that fn_ctx_idx_start_ == fn_ctx_idx_? Line 437: /// into the vector of nodes which this function should start constructing the tree. 'which this function should ...': unclear Line 462: /// Static wrappers around the virtual Get*Val() functions. Calls the appropriate move to exprctx http://gerrit.cloudera.org:8080/#/c/5483/6/be/src/exprs/hive-udf-call.h File be/src/exprs/hive-udf-call.h: Line 35: /// Executor for hive udfs using JNI. This works with the UdfExecutor on the update comment (exprs shouldn't execute). Line 94: /// Evalutes the UDF over row. Returns the result as an AnyVal. This function more execution logic, which could argue for splitting that off into expr-specific fn contexts -- To view, visit http://gerrit.cloudera.org:8080/5483 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iefdc9aeeba033355cb9497e3a5d2363627dcf2f3 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
