[Impala-ASF-CR] IMPALA-2494: Support for byte array encoded decimals in Parquet scanner

Bikramjeet Vig (Code Review) Mon, 23 Oct 2017 17:30:08 -0700

Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7822 )


Change subject: IMPALA-2494: Support for byte array encoded decimals in Parquet 
scanner
......................................................................


Patch Set 5:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/hdfs-parquet-scanner.h
File be/src/exec/hdfs-parquet-scanner.h:

http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/hdfs-parquet-scanner.h@45
PS5, Line 45: INTERNAL_TYPE
> I should have mentioned this earlier, but I think we mostly use CamelCase f
Done


http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-common.h
File be/src/exec/parquet-common.h:

http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-common.h@235
PS5, Line 235: template <>
             : inline int ParquetPlainEncoder::ByteSize(const Decimal4Value&) {
             :   DCHECK(false);
             :   return -1;
             : }
             : template <>
             : inline int ParquetPlainEncoder::ByteSize(const Decimal8Value&) {
             :   DCHECK(false);
             :   return -1;
             : }
             : template <>
             : inline int ParquetPlainEncoder::ByteSize(const Decimal16Value&) {
             :   DCHECK(false);
             :   return -1;
             : }
fixed code duplication here too.


http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-common.h@284
PS5, Line 284: template <>
             : inline int ParquetPlainEncoder::Encode(
             :     const int8_t& v, int fixed_len_size, uint8_t* buffer) {
             :   int32_t val = v;
             :   memcpy(buffer, &val, sizeof(int32_t));
             :   return ByteSize(v);
             : }
             :
             : template <>
             : inline int ParquetPlainEncoder::Encode(
             :     const int16_t& v, int fixed_len_size, uint8_t* buffer) {
             :   int32_t val = v;
             :   memcpy(buffer, &val, sizeof(int32_t));
             :   return ByteSize(v);
             : }
Also, duplicated methods for both int8 and int16, since both are written out as 
int32


http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-common.h@328
PS5, Line 328: inline int ParquetPlainEncoder::Encode(
> It looks like we could also deduplicate the Encode/Decode methods for FIXED
Done


http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-common.h@350
PS5, Line 350: Decode<Decimal4Value,parquet::Type::FIXED_LEN_BYTE_ARRAY>(const 
uint8_t* buffer,
> nit: missing space
Done


http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-plain-test.cc
File be/src/exec/parquet-plain-test.cc:

http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-plain-test.cc@112
PS5, Line 112: //    Decimal4Value: General test case,
> I find this a bit confusing still because the values being encoded aren't i
As discussed, moving these test cases back to its original form


http://gerrit.cloudera.org:8080/#/c/7822/5/be/src/exec/parquet-plain-test.cc@200
PS5, Line 200: sizeof(int32_t)
> Is there any reason to keep the encoding overhead separate from the encoded
see comment above.



--
To view, visit http://gerrit.cloudera.org:8080/7822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2c0e881045109f337fecba53fec21f9cfb9e619e
Gerrit-Change-Number: 7822
Gerrit-PatchSet: 5
Gerrit-Owner: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Tue, 24 Oct 2017 00:29:53 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-2494: Support for byte array encoded decimals in Parquet scanner

Reply via email to