Tim Armstrong has posted comments on this change. Change subject: IMPALA-5347: Parquet scanner microoptimizations ......................................................................
Patch Set 2: (7 comments) http://gerrit.cloudera.org:8080/#/c/6950/2/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 937: if (template_tuple_ == nullptr && tuple_byte_size_ <= CACHE_LINE_SIZE) { > Might be worth putting this into an InitBatch() function. Factored this out along with the above Reset() call, since they really go together. http://gerrit.cloudera.org:8080/#/c/6950/2/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: Line 486: uint8_t val_buf[sizeof(T)]; // Uninitialized stack allocation for temporary value. > Comment just seems to describe what this line is doing, but not why. Mentio Done http://gerrit.cloudera.org:8080/#/c/6950/2/be/src/util/bit-stream-utils.inline.h File be/src/util/bit-stream-utils.inline.h: Line 91: ALWAYS_INLINE inline bool BitReader::GetValue(int num_bits, T* v) { > remove inline? I tried initially but it turns out the the always_inline attribute doesn't imply one-definition-rule linkage in the same way that "inline" - there was a compiler warning. However, defining the function inside a class body implies "inline", so "inline" can be safely omitted in those cases. http://gerrit.cloudera.org:8080/#/c/6950/2/be/src/util/dict-encoding.h File be/src/util/dict-encoding.h: Line 291: ALWAYS_INLINE inline bool DictDecoder<T>::GetNextValue(T* value) { > remove inline? The definition is outside the class so it's not actually redundant (see earlier comment). Line 304: ALWAYS_INLINE inline bool DictDecoder<Decimal16Value>::GetNextValue( > remove inline? The definition is outside the class so it's not actually redundant (see earlier comment). http://gerrit.cloudera.org:8080/#/c/6950/2/be/src/util/rle-encoding.h File be/src/util/rle-encoding.h: Line 251: ALWAYS_INLINE inline bool RleDecoder::Get(T* val) { > remove inline? The definition is outside the class so it's not actually redundant (see earlier comment). Line 269: if (UNLIKELY(!bit_reader_.GetValue(bit_width_, val))) return false; > Really a critical change? It changes the behavior, so might be dangerous / It made a small but measurable difference, but probably not enough to worry about and there is an edge case where this is less correct so I reverted it. I think it's generally safe - it doesn't change the external behaviour except where the input data is malformed and the client repeatedly calls Get() after it returns false. Then maybe this GetValue() call could return false, but the one on line 280 will later return true and return some RLE-encoded data. -- To view, visit http://gerrit.cloudera.org:8080/6950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I49ec523a65542fdbabd53fbcc4a8901d769e5cd5 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: anujphadke <[email protected]> Gerrit-HasComments: Yes
