Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/7822 )
Change subject: IMPALA-2494: Support for byte array encoded decimals in Parquet scanner ...................................................................... Patch Set 5: (7 comments) http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/hdfs-parquet-scanner.h File be/src/exec/hdfs-parquet-scanner.h: http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/hdfs-parquet-scanner.h@45 PS5, Line 45: INTERNAL_TYPE > I should have mentioned this earlier, but I think we mostly use CamelCase f Done http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-common.h File be/src/exec/parquet-common.h: http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-common.h@235 PS5, Line 235: template <> : inline int ParquetPlainEncoder::ByteSize(const Decimal4Value&) { : DCHECK(false); : return -1; : } : template <> : inline int ParquetPlainEncoder::ByteSize(const Decimal8Value&) { : DCHECK(false); : return -1; : } : template <> : inline int ParquetPlainEncoder::ByteSize(const Decimal16Value&) { : DCHECK(false); : return -1; : } fixed code duplication here too. http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-common.h@284 PS5, Line 284: template <> : inline int ParquetPlainEncoder::Encode( : const int8_t& v, int fixed_len_size, uint8_t* buffer) { : int32_t val = v; : memcpy(buffer, &val, sizeof(int32_t)); : return ByteSize(v); : } : : template <> : inline int ParquetPlainEncoder::Encode( : const int16_t& v, int fixed_len_size, uint8_t* buffer) { : int32_t val = v; : memcpy(buffer, &val, sizeof(int32_t)); : return ByteSize(v); : } Also, duplicated methods for both int8 and int16, since both are written out as int32 http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-common.h@328 PS5, Line 328: inline int ParquetPlainEncoder::Encode( > It looks like we could also deduplicate the Encode/Decode methods for FIXED Done http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-common.h@350 PS5, Line 350: Decode<Decimal4Value,parquet::Type::FIXED_LEN_BYTE_ARRAY>(const uint8_t* buffer, > nit: missing space Done http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-plain-test.cc File be/src/exec/parquet-plain-test.cc: http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-plain-test.cc@112 PS5, Line 112: // Decimal4Value: General test case, > I find this a bit confusing still because the values being encoded aren't i As discussed, moving these test cases back to its original form http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-plain-test.cc@200 PS5, Line 200: sizeof(int32_t) > Is there any reason to keep the encoding overhead separate from the encoded see comment above. -- To view, visit http://gerrit.cloudera.org:8080/7822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2c0e881045109f337fecba53fec21f9cfb9e619e Gerrit-Change-Number: 7822 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Tue, 24 Oct 2017 00:29:53 +0000 Gerrit-HasComments: Yes
