Tim Armstrong has posted comments on this change. Change subject: IMPALA-3637,IMPALA-3636: refactor codegen constant replacement ......................................................................
Patch Set 6: (19 comments) http://gerrit.cloudera.org:8080/#/c/3401/6/be/src/codegen/constant-replacement.h File be/src/codegen/constant-replacement.h: PS6, Line 35: struct ReplaceableSymbol { > I find the entire layout of classes a bit confusing and they can be simplif I think I'd rather avoid special-casing the zero argument case too much. Materializing all the possible constants upfront works fine for the zero-arg case but doesn't work in general for constant arguments - e.g. even if the function had 100 possible argument values, or you had two arguments with 10 possible values each, then you have to do a lot of work to build the data structure. I'm not sure if you were going down that path but I'd rather keep the interface for the zero-arg case analogous to the one-arg case. PS6, Line 35: Symbol > ReplaceableFunction seems to be a more precise description. I'll rename. PS6, Line 53: a set of symbols. > Doesn't it only handle replacement of functions ? "symbols" seems to imply Yep will change. PS6, Line 54: This uses the "Visitor" pattern, > IMHO, this is not important details. It would be more helpful to explain wh Done Line 55: /// to replace a function callsite with a symbol. > I think this class warrants a more detailed explanation. Essentially, the r Done PS6, Line 59: virtual std::vector<ReplaceableSymbol> HandledSymbols() = 0; > This seems to be implementation details which really shouldn't be exposed t It's part of the interface: the constant replacement pass needs to know which functions the replacer can operate on. PS6, Line 62: 'constant_arg_index' of -1 is found. > It may be easier to understand to say the replaced function has no argument That's not true though, the function may have arguments, but they're ignored for the purposes of constant replacement. E.g. most functions we're replacing have a 'this' argument. I clarified in the class comment PS6, Line 65: virtual llvm::Value* ReplaceNoArgs( > Following from the restructuring comments above, I don't see there is need As I mentioned above, it doesn't make sense to materialize all of the possible replacement values. It's feasible for some cases but that approach breaks down as soon as the domain of valid input arguments get larger than, say, 50 to 100. I want to make sure the interface is fairly future-proof so we can use it it more places without reinventing the mechanism. PS6, Line 70: /// This function is invoked when a call to one of the handled symbols with : /// a non-negative 'constant_arg_index' is found. > This function handles replacing functions with at least one constant argume Updated the comment. Not sure why it would be at least one? This can only handle a single constant argument. PS6, Line 76: int64_t arg) > Please document the arguments. Done PS6, Line 89: class SimpleNoArgConstantReplacer > Please see comments above. This class may not need to exist. I would rather keep them separate so that the ConstantReplacer interface is as minimal as possible and doesn't have logic for handling this special case mixed in with the generic interface. Putting the data structure into ConstantReplacer also makes every ConstantReplacer stateful, so I'd have to change CpuInfo to return a fresh object for every call. PS6, Line 91: a replacement > Add a replacement entry (i.e. function name 'symbol' and its replacement co Done PS6, Line 92: void AddReplacement(const std::string& symbol, llvm::Value* replacement) { > Please see comments above. This may fit in ConstantReplacer better. See replies above. PS6, Line 120: struct SymbolMapEntry { > Please see comments above. Done http://gerrit.cloudera.org:8080/#/c/3401/6/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: PS6, Line 651: symbol_map.find(callee->getName()) != symbol_map.end()) > Is it really necessary to expose the symbol_map to the caller ? Isn't it im I'd rather expose that information because it allows us to implement the pass more efficiently, rather than having to blindly call into every replacer for every function call in the IR. There's a bunch of processing that this short-circuits. It's not super-perf-critical but I think it's worth doing things efficiently upfront. It would also be nice to be able to efficiently do multiple replacements in a single pass in future. With this approach we know which replacer handles which functions, so we could route each function to the right place. I didn't actually implement that yet since there isn't a use case right now. http://gerrit.cloudera.org:8080/#/c/3401/6/be/src/exec/hash-table-constant-replacer.h File be/src/exec/hash-table-constant-replacer.h: PS6, Line 17: : #ifndef IMPALA_EXEC_HASH_TABLE_CONSTANT_REPLACER_H : #define IMPALA_EXEC_HASH_TABLE_CONSTANT_REPLACER_H > Shouldn't these lines be above the #include lines above ? Done http://gerrit.cloudera.org:8080/#/c/3401/6/be/src/exprs/expr.cc File be/src/exprs/expr.cc: PS6, Line 580: class ArgTypeConstantReplacer : public ConstantReplacer > This seems to be applicable to Decimal only so shouldn't this be moved to d I moved it to expr-constant-replacement.cc since it doesn't really need to be in this file. The fact that it is only used for decimal right now is incidental. E.g. we could use it for VARCHAR/CHAR length. I think it makes sense to name it more generically since we don't want to have to move everything if we make an obvious extension. PS6, Line 609: rn codegen->GetIntConstant(TYPE_INT, type.precision); > Please see comments for the ConstantReplacer class. I don't see why this ca See my comment there. Materializing a map of all possible values in general only works when 'arg' has a small number of possible values. It would be ok in this specific case but I'd rather stick with the more general-purpose (and efficient) pattern since the code is pretty straightforward. PS6, Line 616: const FunctionContext::TypeDesc& return_type_; : const vector<FunctionContext::TypeDesc>& arg_types_; > If we can insert all the mappings from function/symbol name to replacement See above. -- To view, visit http://gerrit.cloudera.org:8080/3401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6ba029ed8589698eb15dbfb0a20dd2a7ea752635 Gerrit-PatchSet: 6 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Jim Apple <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
