[Impala-CR](cdh5-trunk) IMPALA-2548: Codegen Tuple::MaterializeExprs() and use in TopN node

Skye Wanderman-Milne (Code Review) Fri, 11 Mar 2016 15:40:22 -0800

Skye Wanderman-Milne has posted comments on this change.

Change subject: IMPALA-2548: Codegen Tuple::MaterializeExprs() and use in TopN 
node
......................................................................



Patch Set 6:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/1901/6/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

Line 152:  CodegenAnyVal(cg, builder, type, v, name);
> If result_ptr != NULL, CreateCall() will create NULL. So, what's the point 
Good catch. I'll remove the result_ptr argument from CreateCallWrapped(), we 
don't use it anywhere.


Line 507: Value* CodegenAnyVal::ToNativeValue(MemPool* pool) 
> Generated IR in the header comment will help.
I'm not sure it makes sense here, because it's different for every type, and 
each type only emits one or two instructions and/or calls to some other 
function.


Line 565: CodegenAnyVal::WriteToSlot
> May help readers if you post the generated IR in the header comment.
Done


http://gerrit.cloudera.org:8080/#/c/1901/6/be/src/codegen/codegen-anyval.h
File be/src/codegen/codegen-anyval.h:

Line 214: insert_before
> nit: 'insert_before'
Done


Line 217: WriteToSlot
> Can you please consider calling it CodegenWriteToSlot() ?
All of the functions in this class are "Codegen" functions (i.e. they produce 
IR to do the thing they describe), so I think if I changed this function, I 
would have to add "Codegen" to all of them. I could do this, but I don't think 
it's worth the verbosity (and the class name already has "Codegen" in it).


http://gerrit.cloudera.org:8080/#/c/1901/6/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

Line 432:  CodegenMemcpy
> Does this need to be exposed ? Does this even need to be a separate functio
This patch uses both signatures.


http://gerrit.cloudera.org:8080/#/c/1901/6/be/src/exec/topn-node.cc
File be/src/exec/topn-node.cc:

Line 84:   insert_batch_fn = codegen->ReplaceCallSites(insert_batch_fn, false,
> There are two calls to MaterializeExprs(), one with pool != NULL and the ot
No way to guarantee, it's just synchronized with the cross-compiled code. 

I tried defining new MaterializeExprsNoPool() functions, but it's a little 
awkward unfortunately... it results in duplicating the function that takes a 
vector, and the NoPool functions still have to take a pool argument since the 
replacement assumes the signatures match. It does make this replacement less 
brittle though. I'll post a patch with these changes and you can see which you 
like better (or maybe there's a better way to do this altogether).


http://gerrit.cloudera.org:8080/#/c/1901/6/be/src/runtime/tuple.cc
File be/src/runtime/tuple.cc:

Line 218: non_null_string_values++)
> I thought our coding style prefers pre-increment if possible.
It does, but here I actually need the post-increment. Personally I like it this 
way, but I can also put the assignment and increment on separate lines if you 
think that'll be clearer (and then I can use pre-increment).


Line 249: trunc i64 %src to i1
> Is trunc the right instruction to use here ? I believe this is generated by
We use trunc to truncate the val field of the IntVal %src, which is returned as 
an i64 instead of { i8, i32 }, leaving us only with the is_null field. We could 
probably use a mask or maybe some other instruction instead, although the 
optimizer can hopefully do this for us if it's faster. See codegen-anyval.h for 
more info on why we have to do this at all.


Line 252: // non_null:                                         ; preds = %entry
> Part of this IR is generated by functions in CodegenAnyVal, right ? Can you
I added comments showing what was generated by CodegenAnyVal::WriteToSlot(). 
Let me know if you have suggestions for how to make this more clear.


Line 297: Status Tuple::CodegenMaterializeExprs(RuntimeState* state, bool 
collect_string_vals,
> Just curious if you have considered refactoring Tuple::MaterializeExprs() s
You mean cross-compile MaterializeExprs(), but pull out the loop and replace it 
with a codegen'd loop function? The only non-loop code that'd be cross-compiled 
would be the memset, so that wouldn't gain us much. We could cross-compile the 
body of the loop and manually create calls for each loop iteration, but 
unfortunately we don't have the infrastructure to do all the specializations we 
do using the IRBuilder (e.g. injecting constants, replacing generic GetValue() 
calls).


Line 327: GetPtrType(ExprContext::LLVM_CLASS_NAME)->getPointerTo();
> Is there a reason why we mix the use getPointerTo() and GetPtrType() ? Acco
No good reason, I think no one knew about getPointerTo() when GetPtrType() was 
written. We should probably standardize on one or the other (I like 
getPointerTo(), so I'd be in favor of getting rid of GetPtrType()).


Line 348:   // Value* pool_arg = args[4]; // unused, we hardcode 'pool's ptr 
instead
        :   // Value* non_null_string_values_arg = args[5]; // unused
        :   // Value* total_string_arg = args[6]; // unused
> Can these unused arguments just be a single line of comment ?
I don't think they'd fit as-is? I like having these here because it helps not 
screw up the arg order.


Line 364:        
> nit: wrong indentation ?
Done


http://gerrit.cloudera.org:8080/#/c/1901/6/be/src/runtime/tuple.h
File be/src/runtime/tuple.h:

Line 122: std::vector<StringValue*>* non_null_string_values =
> The conversion of std::vector<StringValue*> to StringValue** seems dangerou
You mean passing in &materialize_expr_ctxs[0] below? AFAIK that's the idiomatic 
way to access a vector's underlying buffer. I'm not sure what you mean to not 
use vector here, wouldn't callers have to convert their vectors in the same way 
then?


-- 
To view, visit http://gerrit.cloudera.org:8080/1901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib422a8d50303c21c6a228675157bf867e8619444
Gerrit-PatchSet: 6
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Skye Wanderman-Milne <[email protected]>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-2548: Codegen Tuple::MaterializeExprs() and use in TopN node

Reply via email to