Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4192: Disentangle Expr and ExprContext
......................................................................


Patch Set 14:

(32 comments)

Made it through, finally... Overall looks way saner and easier to understand 
than the previous state of things.

http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

Line 694:   AggFnEvaluator::ShallowClone(parent->pool_, agg_fn_pool.get(),
This should go into 'partition_pool_'. Otherwise if we have an Agg in a subplan 
these will accumulate every Reset(). E.g. there would be a bug like 
https://gerrit.cloudera.org/#/c/1141/


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/agg-fn-evaluator.cc
File be/src/exprs/agg-fn-evaluator.cc:

Line 84: AggFnEvaluator::~AggFnEvaluator() {
Can we add a DCHECK() that it was closed?


PS14, Line 87: inline
probably not needed


PS14, Line 91: inline
probably not needed


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/agg-fn-evaluator.h
File be/src/exprs/agg-fn-evaluator.h:

Line 250
Consider leaving these as inline functions in the header - AnalyticEvalNode is 
still interpreted so the function call overhead may actually affect perf there.


Line 72:   /// Convenience functions for creating evaluators for multiple 
aggregate functions.
Mention that the caller is responsible for closing the created evaluators 
regardless of success or failure. That is subtle. Or we could just have 
Create() handle that internally.

This comment may also apply to ScalarExprEvaluator.


Line 174:   /// True if this evaluator is created from a Clone() call.
ShallowClone()


PS14, Line 175: is_clone_
Maybe is_shallow_clone_ to be more precise - although there is only one type of 
cloning supported so arguably not necessary to disambiguate.


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/agg-fn.cc
File be/src/exprs/agg-fn.cc:

Line 87:   
RETURN_IF_ERROR(LibCache::instance()->GetSoFunctionPtr(fn_.hdfs_location,
Nit: I don't think the remainder of this function needs all the vertical 
whitespace - it's essentially double-spaced code now


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/agg-fn.h
File be/src/exprs/agg-fn.h:

PS14, Line 52: state
value?. "state" could be taken to mean that it initialises the evaluator or 
some such thing.


Line 81:   /// TODO: The aggregation node has custom codegen paths for a few of 
the builtins.
Consider removing the TODO - it's a bit vague and unclear what if anything we 
want to do.


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/expr.h
File be/src/exprs/expr.h:

Line 53: /// --- Expr overview
Nice overview


Line 81:   static Status CreateTree(const TExpr& texpr, ObjectPool* pool, Expr* 
root);
I think this should be protected since it's only called from ScalarExpr and 
AggFn.


Line 94:   bool is_constant() const { return is_constant_; }
I think is_constant() should maybe only be defined in the ScalarExpr 
sub-hierarchy. It doesn't really mean anything for AggFn.


Line 100:   /// Simple debug string that provides no expr subclass-specific 
information.
Comment seems stale since there's no implementation.


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/hive-udf-call.h
File be/src/exprs/hive-udf-call.h:

Line 28: using namespace impala_udf;
I guess some of these 'using namespace' instances were pre-existing - still 
seems like it contradicts our coding standards.


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/literal.h
File be/src/exprs/literal.h:

Line 28: using namespace impala_udf;
See other comment about using namespace in headers.


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/scalar-expr-evaluator.cc
File be/src/exprs/scalar-expr-evaluator.cc:

Line 154:   // Owner of mem_pool_ needs to call FreeAll() on it.
We don't really know what the owner wants to do. Maybe something like "Memory 
allocated by 'fn_ctx_' is still living in 'mem_pool_'. The owner of 'mem_pool_' 
is responsible for freeing it when appropriate."


Line 436:   // TODO: is there a better way to do this?
Consider removing TODO - I don't think we're likely to fix it.


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/scalar-expr-evaluator.h
File be/src/exprs/scalar-expr-evaluator.h:

Line 29: using namespace impala_udf;
I'm not sure if this works, but maybe if we just move this "using" into the 
impala namespace it won't pollute any other namespaces. That would still avoid 
having impala_udf:: everywhere.


PS14, Line 43: types
type


Line 122:   void* GetValue(const ScalarExpr& e, const TupleRow* row);
Maybe GetSubExprValue()? I don't feel strongly but might make the intent 
clearer.


Line 144:   Status GetError(int start_idx = 0, int end_idx = -1) const 
WARN_UNUSED_RESULT;
Arguments maybe need explanation.


Line 147:   void PrintValue(const TupleRow* row, std::string* str);
Does this evaluate the expr?


Line 153:   /// The last two are helper functions.
Not sure which "last two" refers to.


Line 157:   /// Frees all local allocations made by fn_ctxs_. This can be 
called when result
Thinking out loud about future work: I think to fix memory transfer I'll need a 
way to transfer local allocations out of the function contexts and attach them 
to a RowBatch. 

I think to do this we'd need to pass in two different MemPools - one to back 
the FreePool and one to back the local allocations. Then instead of 
FreeLocalAllocations() we'd just free or transfer things out of the MemPool.

We don't need to do this in this patch since it's a big behavioural change but 
it seems like that will work out, right?


PS14, Line 158:  last two
Same here - maybe out of sync with the code?


Line 183:   /// Users of private GetValue() or 'pool_'.
It might make sense to move the methods that the "friend class" is mean to 
apply to under "protected:", just so there's a clearer delineation between 
those methods and the truly private methods.


PS14, Line 204: the evaluation
maybe something like "Stores the result of the last evaluation of the the root 
expr"


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/scalar-expr.cc
File be/src/exprs/scalar-expr.cc:

PS14, Line 398: subclassed
I think this should be "overridden"


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/scalar-fn-call.cc
File be/src/exprs/scalar-fn-call.cc:

Line 164:   bool is_interpreted = scalar_fn_wrapper_ == nullptr;
Thank you! This is much clearer.


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/slot-ref.h
File be/src/exprs/slot-ref.h:

Line 24: using namespace impala_udf;
I think generally we try to avoid using entire namespaces in the headers. Maybe 
you can scope it to the SlotRef class?


-- 
To view, visit http://gerrit.cloudera.org:8080/5483
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iefdc9aeeba033355cb9497e3a5d2363627dcf2f3
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to