Skye Wanderman-Milne has posted comments on this change. Change subject: IMPALA-2548: Codegen Tuple::MaterializeExprs() and use in TopN node ......................................................................
Patch Set 10: (4 comments) http://gerrit.cloudera.org:8080/#/c/1901/10/be/src/runtime/descriptors.h File be/src/runtime/descriptors.h: Line 395: static const char* LLVM_CLASS_NAME; > does everybody need access to this? if not, make private and add friends Can you explain the underlying principle here? This is used by codegen functions to get the TupleDescriptor LLVM type (and similarly we have a LLVM_CLASS_NAME for every class that needs to be referenced in an IRBuilder codegen'd function). According to https://google.github.io/styleguide/cppguide.html#Friends, "Friends extend, but do not break, the encapsulation boundary of a class". My understanding is that LLVM_CLASS_NAME users don't qualify since they're not implementing TupleDescriptor functionality, but I'm not super familiar with friend class best practices. http://gerrit.cloudera.org:8080/#/c/1901/10/be/src/runtime/tuple.cc File be/src/runtime/tuple.cc: Line 208: SlotDescriptor* slot_desc = desc.slots()[i]; > instead of generating ir by hand below, could you do this by I implemented a quick prototype of the interpreted functionality so you can see what it looks like before I implement the rest and clean up what I don't need anymore. Here's how it works: 1. The loop body is broken out into a function for each type. We need this for cross-compiling so we can use the specific ExprContext::Get*Val() methods instead of the generic ExprContext::GetValue() method. This is done with a macro. So far I've only plumbed through int and string, but this would be extended to all types. This also requires having a RawValue::Write() function for each AnyVal subclass. 2. MaterializeExprs() has a switch statement in the for loop that calls the correct body function. 3. (NYI) I'm not sure how to write a generic codegen helper function to manually call the body function with the correct arguments, but maybe you can help me think of a way. I'm currently planning to write a new CodegenMaterializeExprs() function that loops over all the slot descriptors, converts the relevant slot parameters to constant LLVM values, and generates a call to the correct loop body function passing in the constant params. We'll have to generate constant ColumnType and Tuple::NullIndicatorOffset structs to pass in, but luckily I already have a patch to do ColumnType and NullIndicator should be similar. This function will also generate the initial memset call, which is the only non-loop-body work, so we don't have to cross-compile MaterializeExprs(). This function will also replace the Get*Val calls with the codegen'd versions of these functions. What do you think? Even though this makes the C++ code more verbose, I think it's an improvement since we'll end up cross-compiling the AnyVal::Write() calls instead of reimplementing them using the IRBuilder in CodegenAnyVal::WriteToSlot(). However, I'm not sure this will make CodegenMaterializeExprs() any simpler, and it may actually get a little longer and more nuanced. Feel free to grab me if you wanna discuss in person. http://gerrit.cloudera.org:8080/#/c/1901/10/be/src/runtime/tuple.h File be/src/runtime/tuple.h: Line 123: template <bool collect_string_vals, bool no_pool> > what is the point of templatizing this? if there's a codegen'd version of i It's to differentiate the pool vs. the no pool function calls for when we replace the function calls during codegen. Having the template param gives us two different symbols for the two different cases. I'll expand the comment to explain this more. Line 143: static const char* MATERIALIZE_EXPRS_SYMBOL; > move to private (and make whoever needs this a friend, hopefully that'll be See my other comment, I think this case is similar. -- To view, visit http://gerrit.cloudera.org:8080/1901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib422a8d50303c21c6a228675157bf867e8619444 Gerrit-PatchSet: 10 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Skye Wanderman-Milne <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Skye Wanderman-Milne <[email protected]> Gerrit-HasComments: Yes
