Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17638 )

Change subject: WiP: IMPALA-9495: Support struct in select list for ORC tables
......................................................................


Patch Set 4:

(14 comments)

Went through a subset of BE code. Looks pretty solid.

Will try to review more in this week.

http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/exprs/scalar-expr-evaluator.h
File be/src/exprs/scalar-expr-evaluator.h:

http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/exprs/scalar-expr-evaluator.h@262
PS4, Line 262: children_
nit. Maybe rename as childEvaluators_?


http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/exprs/scalar-expr-evaluator.cc
File be/src/exprs/scalar-expr-evaluator.cc:

http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/exprs/scalar-expr-evaluator.cc@96
PS4, Line 96: children_expr_evals
nit. Could we use children_ directly here?


http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/descriptors.h
File be/src/runtime/descriptors.h:

http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/descriptors.h@492
PS4, Line 492:   /// tuple that holds the topmost struct slot.
nit. May draw a diagram to help illustrate the data structure.


http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/descriptors.cc
File be/src/runtime/descriptors.cc:

http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/descriptors.cc@158
PS4, Line 158: bool SlotDescriptor::IsChildOfStruct()
nit. inline?


http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/descriptors.cc@342
PS4, Line 342: than
nit. then


http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/raw-value.cc
File be/src/runtime/raw-value.cc:

http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/raw-value.cc@223
PS4, Line 223:
nit. I wonder if we need to handle value being null here. See line of code at 
258.


http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/raw-value.cc@246
PS4, Line 246: children_tuple
nit. May renamed as children_tuple_desc.


http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/raw-value.cc@260
PS4, Line 260: } else {
             :       Write(src_child, dst, child_slot->type(), pool);
             :       if (child_slot->type().IsVarLenStringType()) {
             :         
string_values->push_back(reinterpret_cast<StringValue*>(dst));
             :       }
nit this block of code duplicates the block at 227 - 230.


http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/tuple.h
File be/src/runtime/tuple.h:

http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/tuple.h@253
PS4, Line 253:   /// Assumes 'slot_desc' as a struct slot. Sets 'slot_desc' to 
null in this tuple and
nit. 'slot_desc' describes a struct slot in this tuple.


http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/tuple.cc
File be/src/runtime/tuple.cc:

http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/tuple.cc@246
PS4, Line 246:  SetNull(slot_desc->null_indicator_offset());
             :       if (slot_desc->type().IsStructType()) 
SetStructToNull(slot_desc);
nit. May distinguish the two cases and call SetNull() or SetStructToNull(). See 
duplicated call SetNull at line 255.


http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/tuple.cc@252
PS4, Line 252: const SlotDescriptor* slot_desc
nit. May typed as "const SlotDescriptor* const" so that we know slot_desc will 
not be altered.


http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/types.h
File be/src/runtime/types.h:

http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/types.h@236
PS4, Line 236:  static inline int GetDecimalByteSize(int precision) {
             :     DCHECK_GT(precision, 0);
             :     if (precision <= MAX_DECIMAL4_PRECISION) return 4;
             :     if (precision <= MAX_DECIMAL8_PRECISION) return 8;
             :     return 16;
             :   }
nit. May not be necessary to relocate the method.


http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/types.h@262
PS4, Line 262: /// Helper function for GetSlotSize() so that struct size could 
be calculated
             :   /// recursively.
             :   inline int GetSlotSize(const ColumnType& col_type) const {
             :     switch (col_type.type) {
             :       case TYPE_STRUCT: {
             :         int struct_size = 0;
             :         for (ColumnType child_type : col_type.children) {
             :           struct_size += GetSlotSize(child_type);
             :         }
             :         return struct_size;
             :       }
             :       case TYPE_STRING:
             :       case TYPE_VARCHAR:
             :         return 12;
             :       case TYPE_CHAR:
             :       case TYPE_FIXED_UDA_INTERMEDIATE:
             :         return col_type.len;
             :       case TYPE_ARRAY:
             :       case TYPE_MAP:
             :         return 12;
             :       default:
             :         return GetByteSize(col_type);
             :     }
             :   }
nit. It may not be necessary to relocate this method. Please refer to 
GetDecimalByteSize() in the original version.


http://gerrit.cloudera.org:8080/#/c/17638/4/be/src/runtime/types.h@293
PS4, Line 293: GetByteSize(child_type);
nit. I wonder if this call can become expensive with a deeply nested struct. 
Would it be possible to cache the size?



--
To view, visit http://gerrit.cloudera.org:8080/17638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fbe56bdcd372b72e99c0195d87a818e7fa4bc3a
Gerrit-Change-Number: 17638
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Comment-Date: Wed, 28 Jul 2021 01:44:28 +0000
Gerrit-HasComments: Yes

Reply via email to