Dan Hecht has posted comments on this change. Change subject: IMPALA-3931: arbitrary fixed-size uda intermediate types ......................................................................
Patch Set 12: (9 comments) http://gerrit.cloudera.org:8080/#/c/7526/12/be/src/codegen/codegen-anyval.cc File be/src/codegen/codegen-anyval.cc: PS12, Line 555: the StringVal is already set up I'm not sure I follow this comment given that this function is setting the *Value not the *Val, right? Is it saying that the slot was already modified since the StringVal was referencing the destination slot itself and updates were done in place? http://gerrit.cloudera.org:8080/#/c/7526/12/be/src/codegen/codegen-anyval.h File be/src/codegen/codegen-anyval.h: PS12, Line 52: TYPE_FIXED_UDA_INTERMEDIATE why isn't this a byte array if that's how it is represented? I guess we want the pointer indirection to pass the thing around and so StringVal is convenient (even though runtime len shouldn't be needed)? I guess it's clearer once you read the comment in udf.h, but it's kinda hidden. PS12, Line 193: matching : /// native type isn't the native type for intermediate just char array? By "native" do we mean the slot representation? Should we say that instead, or do we use "native" to mean the slot type (as opposed to the C type or lowered type, etc)? PS12, Line 195: raw_val_ptr is this a pointer to the slot itself? if so, why not just specify it like that? (here and elsewhere) Isn't that actually required, given the assumption that StoreToNativePtr() makes about the fixed intermediate case being a no-op? http://gerrit.cloudera.org:8080/#/c/7526/12/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: PS12, Line 240: the delete http://gerrit.cloudera.org:8080/#/c/7526/12/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: Line 888 it might be a bit confusing that now StringVal's can have different management rules depending on the actual intermediate type. But maybe any alternative (e.g. a new FixedArrayVal that's just a 'char *' (and len is determined statically from the type) adds complexity that's not worth it? http://gerrit.cloudera.org:8080/#/c/7526/12/be/src/udf/udf.h File be/src/udf/udf.h: PS12, Line 81: TYPE_FIXED_BUFFER should we rename that to TYPE_CHAR, or would that break compat? Alternatively, is there a reason TYPE_FIXED_UDA_INTERMEDIATE has to be distinct from this? PS12, Line 346: For UDAs that need a complex data structure as the intermediate state, the : /// intermediate type should be string and the UDA can cast the ptr to the structure : /// it is using. : /// : /// Memory Management: For allocations that are not returned to Impala, the UDA should use : /// the FunctionContext::Allocate()/Free() methods. In general, Allocate() is called in : /// Init(), and then Free() must be called in both Serialize() and Finalize(), since : /// either of these functions may be called to clean up the state. For StringVal : /// allocations returned to Impala (e.g. returned by UdaSerialize()), the UDA should : /// allocate the result via StringVal(FunctionContext*, int) ctor or the function : /// StringVal::CopyFrom(FunctionContext*, const uint8_t*, size_t) and Impala will : /// automatically handle freeing it. does any of that need updating to account for the fact that these new StringVal's point at the slot itself? http://gerrit.cloudera.org:8080/#/c/7526/12/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java File fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java: Line 57: private static final int RANK_INTERMEDIATE_SIZE = 16; are these constants verified for consistency in any way? -- To view, visit http://gerrit.cloudera.org:8080/7526 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ife90cf27989f98ffb5ef5c39f1e09ce92e8cb87c Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
