Tim Armstrong has posted comments on this change. Change subject: IMPALA-3931: arbitrary fixed-size uda intermediate types ......................................................................
Patch Set 8: (3 comments) http://gerrit.cloudera.org:8080/#/c/7526/8/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: PS8, Line 467: // Represent this as an array of bytes. : return ArrayType::get(GetType(TYPE_TINYINT), type.len); > IMHO, this change in behavior makes GetType() an error prone interface. Giv I think the relationship is analogous to the existing cases: int64 <==> BIGINT StringValue <==> STRING TimestampValue <==> TIMESTAMP int128 <==> DECIMAL(38,7) int8[n] <==> FIXED_UDA_INTERMEDIATE[n] It's already a bug if you call this with TYPE_STRING when constructing an IR function argument since it returns StringValue (we should rename string_val_type and timestamp_val_type for sure though). http://gerrit.cloudera.org:8080/#/c/7526/11/be/src/exec/partitioned-aggregation-node.cc File be/src/exec/partitioned-aggregation-node.cc: PS11, Line 1540: bitcast { i64, i8* }* %dst_lowered_ptr to %"struct.impala_udf: > This seems unnecessary now but I guess DCE will eliminate it. Yeah, DCE will get it. I think it's simpler than modifying the codegen interfaces to avoid emitting the instruction in this special case. http://gerrit.cloudera.org:8080/#/c/7526/8/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: PS8, Line 478: TYPE_FIXED_UDA_INTERMEDIATE: > Is TYPE_FIXED_UDA_INTERMEIDATE limited to built-in UDA only ? stddev already uses it internally and there doesn't seem to have been anything preventing declaration of a UDA using CHAR(n). Codegen is disabled though. On master I can declare a UDA with a CHAR intermediate if I load the UDAs from this patch: $ MAKE_CMD=ninja ./testdata/bin/copy-udfs-udas.sh -build $ git checkout origin/master $ git rev-parse HEAD df9ecdc45a34b176f2ee6eda8e12d0195fba9ae7 [localhost:21000] > create aggregate function udasum(int) RETURNS int INTERMEDIATE CHAR(10) LOCATION '/test-warehouse/libTestUdas.so' UPDATE_FN='AggCharIntermediateUpdate' INIT_FN='AggCharIntermediateInit' MERGE_FN='AggCharIntermediateMerge' FINALIZE_FN='AggCharIntermediateFinalize'; Query: create aggregate function udasum(int) RETURNS int INTERMEDIATE CHAR(10) LOCATION '/test-warehouse/libTestUdas.so' UPDATE_FN='AggCharIntermediateUpdate' INIT_FN='AggCharIntermediateInit' MERGE_FN='AggCharIntermediateMerge' FINALIZE_FN='AggCharIntermediateFinalize' Fetched 0 row(s) in 0.17s [localhost:21000] > select year, month, day, udasum(int_col), sum(int_col) from functional.alltypesagg group by year, month, day; Query: select year, month, day, udasum(int_col), sum(int_col) from functional.alltypesagg group by year, month, day Query submitted at: 2017-08-14 10:54:10 (Coordinator: http://tarmstrong-box:25000) Query progress can be monitored at: http://tarmstrong-box:25000/query_plan?query_id=3546da729d693928:a81fcd1b00000000 Socket error 104: Connection reset by peer [Not connected] > Goodbye tarmstrong So I guess it's allowed but broken. Arguably it might be better to disallow it instead of fixing it. -- To view, visit http://gerrit.cloudera.org:8080/7526 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ife90cf27989f98ffb5ef5c39f1e09ce92e8cb87c Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
