Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/11000 )
Change subject: IMPALA-5542: Impala cannot scan Parquet decimal stored as int64_t/int32_t ...................................................................... Patch Set 2: (4 comments) http://gerrit.cloudera.org:8080/#/c/11000/2/be/src/exec/parquet-metadata-utils.cc File be/src/exec/parquet-metadata-utils.cc: http://gerrit.cloudera.org:8080/#/c/11000/2/be/src/exec/parquet-metadata-utils.cc@199 PS2, Line 199: right precision this might be a bit confusing since the term "precision" when used in the context of parquet refers to the explicit "precision" field defined in the metadata http://gerrit.cloudera.org:8080/#/c/11000/2/be/src/exec/parquet-metadata-utils.cc@203 PS2, Line 203: slot_desc->type().GetByteSize() can you add check for schema_element.type_length, since this section of the code deals more with verifying the column metadata. I am not sure if the check at L216 is the same. http://gerrit.cloudera.org:8080/#/c/11000/2/be/src/exec/parquet-metadata-utils.cc@234 PS2, Line 234: // The other decimal metadata should be there but we don't need it. : if (!schema_element.__isset.precision) { : ErrorMsg msg(TErrorCode::PARQUET_MISSING_PRECISION, filename, schema_element.name); : RETURN_IF_ERROR(state->LogOrReturnError(msg)); : } else { : if (schema_element.precision != slot_desc->type().precision) { : // TODO: we could allow a mismatch and do a conversion at this step. : ErrorMsg msg(TErrorCode::PARQUET_WRONG_PRECISION, filename, schema_element.name, : schema_element.precision, slot_desc->type().precision); : RETURN_IF_ERROR(state->LogOrReturnError(msg)); : } : } can you also add similar checks for precision as well. Use RETURN_IF_ERROR(state->LogOrReturnError(msg)); to make sure it is consistent with other precision checks http://gerrit.cloudera.org:8080/#/c/11000/2/be/src/exec/parquet-plain-test.cc File be/src/exec/parquet-plain-test.cc: http://gerrit.cloudera.org:8080/#/c/11000/2/be/src/exec/parquet-plain-test.cc@38 PS2, Line 38: Handle special case of encoding decimal types stored as BYTE_ARRAY update comment -- To view, visit http://gerrit.cloudera.org:8080/11000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib8c41bfc7c1664bdba5099d3893dc8dbe4304794 Gerrit-Change-Number: 11000 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 31 Jul 2018 18:09:45 +0000 Gerrit-HasComments: Yes
