Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3637,IMPALA-3636: refactor codegen constant replacement
......................................................................


Patch Set 6:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/3401/6/be/src/codegen/constant-replacement.h
File be/src/codegen/constant-replacement.h:

PS6, Line 35: struct ReplaceableSymbol {
> I find the entire layout of classes a bit confusing and they can be simplif
I think I'd rather avoid special-casing the zero argument case too much. 
Materializing all the possible constants upfront works fine for the zero-arg 
case but doesn't work in general for constant arguments - e.g. even if the 
function had 100 possible argument values, or you had two arguments with 10 
possible values each, then you have to do a lot of work to build the data 
structure. 

I'm not sure if you were going down that path but I'd rather keep the interface 
for the zero-arg case analogous to the one-arg case.


PS6, Line 35: Symbol
> ReplaceableFunction seems to be a more precise description.
I'll rename.


PS6, Line 53: a set of symbols.
> Doesn't it only handle replacement of functions ? "symbols" seems to imply 
Yep will change.


PS6, Line 54: This uses the "Visitor" pattern,
> IMHO, this is not important details. It would be more helpful to explain wh
Done


Line 55: /// to replace a function callsite with a symbol.
> I think this class warrants a more detailed explanation. Essentially, the r
Done


PS6, Line 59: virtual std::vector<ReplaceableSymbol> HandledSymbols() = 0;
> This seems to be implementation details which really shouldn't be exposed t
It's part of the interface: the constant replacement pass needs to know which 
functions the replacer can operate on.


PS6, Line 62:  'constant_arg_index' of -1 is found.
> It may be easier to understand to say the replaced function has no argument
That's not true though, the function may have arguments, but they're ignored 
for the purposes of constant replacement. E.g. most functions we're replacing 
have a 'this' argument. I clarified in the class comment


PS6, Line 65: virtual llvm::Value* ReplaceNoArgs(
> Following from the restructuring comments above, I don't see there is need 
As I mentioned above, it doesn't make sense to materialize all of the possible 
replacement values. It's feasible for some cases but that approach breaks down 
as soon as the domain of valid input arguments get larger than, say, 50 to 100. 
I want to make sure the interface is fairly future-proof so we can use it it 
more places without reinventing the mechanism.


PS6, Line 70:  /// This function is invoked when a call to one of the handled 
symbols with
            :   /// a non-negative 'constant_arg_index' is found.
> This function handles replacing functions with at least one constant argume
Updated the comment. Not sure why it would be at least one? This can only 
handle a single constant argument.


PS6, Line 76: int64_t arg)
> Please document the arguments.
Done


PS6, Line 89: class SimpleNoArgConstantReplacer 
> Please see comments above. This class may not need to exist.
I would rather keep them separate so that the ConstantReplacer interface is as 
minimal as possible and doesn't have logic for handling this special case mixed 
in with the generic interface.

Putting the data structure into ConstantReplacer also makes every 
ConstantReplacer stateful, so I'd have to change CpuInfo to return a fresh 
object for every call.


PS6, Line 91: a replacement 
> Add a replacement entry (i.e. function name 'symbol' and its replacement co
Done


PS6, Line 92: void AddReplacement(const std::string& symbol, llvm::Value* 
replacement) {
> Please see comments above. This may fit in ConstantReplacer better.
See replies above.


PS6, Line 120: struct SymbolMapEntry {
> Please see comments above.
Done


http://gerrit.cloudera.org:8080/#/c/3401/6/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

PS6, Line 651:  symbol_map.find(callee->getName()) != symbol_map.end())
> Is it really necessary to expose the symbol_map to the caller ? Isn't it im
I'd rather expose that information because it allows us to implement the pass 
more efficiently, rather than having to blindly call into every replacer for 
every function call in the IR. There's a bunch of processing that this 
short-circuits. It's not super-perf-critical but I think it's worth doing 
things efficiently upfront.

It would also be nice to be able to efficiently do multiple replacements in a 
single pass in future. With this approach we know which replacer handles which 
functions, so we could route each function to the right place. I didn't 
actually implement that yet since there isn't a use case right now.


http://gerrit.cloudera.org:8080/#/c/3401/6/be/src/exec/hash-table-constant-replacer.h
File be/src/exec/hash-table-constant-replacer.h:

PS6, Line 17: 
            : #ifndef IMPALA_EXEC_HASH_TABLE_CONSTANT_REPLACER_H
            : #define IMPALA_EXEC_HASH_TABLE_CONSTANT_REPLACER_H
> Shouldn't these lines be above the #include lines above ?
Done


http://gerrit.cloudera.org:8080/#/c/3401/6/be/src/exprs/expr.cc
File be/src/exprs/expr.cc:

PS6, Line 580: class ArgTypeConstantReplacer : public ConstantReplacer 
> This seems to be applicable to Decimal only so shouldn't this be moved to d
I moved it to expr-constant-replacement.cc since it doesn't really need to be 
in this file.

The fact that it is only used for decimal right now is incidental. E.g. we 
could use it for VARCHAR/CHAR length. I think it makes sense to name it more 
generically since we don't want to have to move everything if we make an 
obvious extension.


PS6, Line 609: rn codegen->GetIntConstant(TYPE_INT, type.precision);
> Please see comments for the ConstantReplacer class. I don't see why this ca
See my comment there. Materializing a map of all possible values in general 
only works when 'arg' has a small number of possible values. It would be ok in 
this specific case but I'd rather stick with the more general-purpose (and 
efficient) pattern since the code is pretty straightforward.


PS6, Line 616:   const FunctionContext::TypeDesc& return_type_;
             :   const vector<FunctionContext::TypeDesc>& arg_types_;
> If we can insert all the mappings from function/symbol name to replacement 
See above.


-- 
To view, visit http://gerrit.cloudera.org:8080/3401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba029ed8589698eb15dbfb0a20dd2a7ea752635
Gerrit-PatchSet: 6
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to