Tim Armstrong has posted comments on this change. Change subject: IMPALA-4192: Disentangle Expr and ExprContext ......................................................................
Patch Set 16: (6 comments) Looking good, just had a handful of comments. http://gerrit.cloudera.org:8080/#/c/5483/16/be/src/exec/aggregation-node.cc File be/src/exec/aggregation-node.cc: Line 93: pool_->Add(build_expr); Wrap pool_->Add() around the above line. Makes it harder to reintroduce the bug by adding a line in-between. http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: Line 165: new RowDescriptor(min_max_tuple_desc_, /* is_nullable */ false)); > Good point. I think heap allocated may be safer in case the expression keep I think we should be explicit in the ScalarExpr comment about whether Create() can hold onto a reference, or whether it has to copy. http://gerrit.cloudera.org:8080/#/c/5483/16/be/src/exec/partitioned-aggregation-node.cc File be/src/exec/partitioned-aggregation-node.cc: PS16, Line 163: tab? http://gerrit.cloudera.org:8080/#/c/5483/16/be/src/exec/partitioned-hash-join-node.cc File be/src/exec/partitioned-hash-join-node.cc: Line 87: Status status = ScalarExpr::Create(eq_join_conjunct.left, child(0)->row_desc(), It would be nice if we had a way to make this cleanup more automatic to make it harder to introduce bugs like this. Maybe ScalarExpr::Create() should close the expr and return NULL if it fails? Similarly maybe the vector version of ScalarExpr::Create() should close all the exprs and clear the vector on failure. http://gerrit.cloudera.org:8080/#/c/5483/16/be/src/exprs/anyval-util.h File be/src/exprs/anyval-util.h: Line 32: using impala_udf::FunctionContext; Using the whole namespace seems fine to me too so long as it's scoped to impala (this is also fine). http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/scalar-expr-evaluator.h File be/src/exprs/scalar-expr-evaluator.h: Line 157: /// or its sub-expressions. 'start_idx' and 'end_idx' correspond to the range > Yes, it'd work out but not sure the difference in overhead for not reusing I'm not too concerned about that. There's two possible levels of caching: * In the MemPool if we don't transfer and call Clear() * In TCMalloc or the new buffer pool, where there's thread-local caching or core-local caching respectively. TCMalloc's caching should be pretty effective for small allocations. -- To view, visit http://gerrit.cloudera.org:8080/5483 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iefdc9aeeba033355cb9497e3a5d2363627dcf2f3 Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
