Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17638 )
Change subject: WiP: IMPALA-9495: Support struct in select list for ORC tables ...................................................................... Patch Set 4: (14 comments) Went through a subset of BE code. Looks pretty solid. Will try to review more in this week. http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/exprs/scalar-expr-evaluator.h File be/src/exprs/scalar-expr-evaluator.h: http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/exprs/scalar-expr-evaluator.h@262 PS4, Line 262: children_ nit. Maybe rename as childEvaluators_? http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/exprs/scalar-expr-evaluator.cc File be/src/exprs/scalar-expr-evaluator.cc: http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/exprs/scalar-expr-evaluator.cc@96 PS4, Line 96: children_expr_evals nit. Could we use children_ directly here? http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/descriptors.h File be/src/runtime/descriptors.h: http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/descriptors.h@492 PS4, Line 492: /// tuple that holds the topmost struct slot. nit. May draw a diagram to help illustrate the data structure. http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/descriptors.cc File be/src/runtime/descriptors.cc: http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/descriptors.cc@158 PS4, Line 158: bool SlotDescriptor::IsChildOfStruct() nit. inline? http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/descriptors.cc@342 PS4, Line 342: than nit. then http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/raw-value.cc File be/src/runtime/raw-value.cc: http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/raw-value.cc@223 PS4, Line 223: nit. I wonder if we need to handle value being null here. See line of code at 258. http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/raw-value.cc@246 PS4, Line 246: children_tuple nit. May renamed as children_tuple_desc. http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/raw-value.cc@260 PS4, Line 260: } else { : Write(src_child, dst, child_slot->type(), pool); : if (child_slot->type().IsVarLenStringType()) { : string_values->push_back(reinterpret_cast<StringValue*>(dst)); : } nit this block of code duplicates the block at 227 - 230. http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/tuple.h File be/src/runtime/tuple.h: http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/tuple.h@253 PS4, Line 253: /// Assumes 'slot_desc' as a struct slot. Sets 'slot_desc' to null in this tuple and nit. 'slot_desc' describes a struct slot in this tuple. http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/tuple.cc File be/src/runtime/tuple.cc: http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/tuple.cc@246 PS4, Line 246: SetNull(slot_desc->null_indicator_offset()); : if (slot_desc->type().IsStructType()) SetStructToNull(slot_desc); nit. May distinguish the two cases and call SetNull() or SetStructToNull(). See duplicated call SetNull at line 255. http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/tuple.cc@252 PS4, Line 252: const SlotDescriptor* slot_desc nit. May typed as "const SlotDescriptor* const" so that we know slot_desc will not be altered. http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/types.h File be/src/runtime/types.h: http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/types.h@236 PS4, Line 236: static inline int GetDecimalByteSize(int precision) { : DCHECK_GT(precision, 0); : if (precision <= MAX_DECIMAL4_PRECISION) return 4; : if (precision <= MAX_DECIMAL8_PRECISION) return 8; : return 16; : } nit. May not be necessary to relocate the method. http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/types.h@262 PS4, Line 262: /// Helper function for GetSlotSize() so that struct size could be calculated : /// recursively. : inline int GetSlotSize(const ColumnType& col_type) const { : switch (col_type.type) { : case TYPE_STRUCT: { : int struct_size = 0; : for (ColumnType child_type : col_type.children) { : struct_size += GetSlotSize(child_type); : } : return struct_size; : } : case TYPE_STRING: : case TYPE_VARCHAR: : return 12; : case TYPE_CHAR: : case TYPE_FIXED_UDA_INTERMEDIATE: : return col_type.len; : case TYPE_ARRAY: : case TYPE_MAP: : return 12; : default: : return GetByteSize(col_type); : } : } nit. It may not be necessary to relocate this method. Please refer to GetDecimalByteSize() in the original version. http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/types.h@293 PS4, Line 293: GetByteSize(child_type); nit. I wonder if this call can become expensive with a deeply nested struct. Would it be possible to cache the size? -- To view, visit http://gerrit.cloudera.org:8080/17638 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0fbe56bdcd372b72e99c0195d87a818e7fa4bc3a Gerrit-Change-Number: 17638 Gerrit-PatchSet: 4 Gerrit-Owner: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Comment-Date: Wed, 28 Jul 2021 01:44:28 +0000 Gerrit-HasComments: Yes
