Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11000 )
Change subject: IMPALA-5542: Impala cannot scan Parquet decimal stored as int64_t/int32_t
......................................................................

IMPALA-5542: Impala cannot scan Parquet decimal stored as int64_t/int32_t

The Decimal type in Parquet is a logical type. That means the Parquet
file stores some physical/primitive type that is annotated by the
DECIMAL tag to make it behave like decimals.

The allowed physical types for decimals are INT32, INT64, FIXED, and
BINARY. Before this commit Impala could only read decimals stored as
FIXED or BINARY.

Spark decided to write decimals as INT32 or INT64 when their precision
allows it:
  (1 <= precision <= 9)   ==> INT32
  (10 <= precision <= 18) ==> INT64

I updated our column readers to accept INT32 and INT64 as valid
physical types for decimals.

Testing:
 * extended parquet-plain-test.cc
 * added Parquet files generated by Spark 2.3.1 and updated
   test_scanners.py

Change-Id: Ib8c41bfc7c1664bdba5099d3893dc8dbe4304794
Reviewed-on: http://gerrit.cloudera.org:8080/11000
Reviewed-by: Zoltan Borok-Nagy <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-common.h
M be/src/exec/parquet-metadata-utils.cc
M be/src/exec/parquet-plain-test.cc
M testdata/data/README
A testdata/data/decimal_stored_as_int32.parquet
A testdata/data/decimal_stored_as_int64.parquet
M testdata/workloads/functional-query/queries/QueryTest/parquet-decimal-formats.test
M tests/query_test/test_scanners.py
9 files changed, 109 insertions(+), 15 deletions(-)

Approvals:
  Zoltan Borok-Nagy: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/11000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib8c41bfc7c1664bdba5099d3893dc8dbe4304794
Gerrit-Change-Number: 11000
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
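
For readers of this notification, the precision-to-physical-type mapping quoted in the commit message can be illustrated with a minimal Python sketch. It is not the Impala C++ change itself. The helper names `physical_type_for_precision` and `decode_unscaled` are hypothetical, and the `"FIXED_OR_BINARY"` return value for precisions above 18 is an assumption that merely reflects the other two physical types the commit message lists. The bounds 9 and 18 work because every 9-digit value fits in a signed 32-bit integer and every 18-digit value fits in a signed 64-bit integer.

```python
from decimal import Decimal

INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1


def physical_type_for_precision(precision: int) -> str:
    """Hypothetical helper mirroring the mapping in the commit message.

    1..9  -> INT32, 10..18 -> INT64. Anything larger cannot be held in a
    64-bit integer, so (assumption) it falls back to FIXED or BINARY.
    """
    if precision < 1:
        raise ValueError("decimal precision must be at least 1")
    if precision <= 9:
        return "INT32"
    if precision <= 18:
        return "INT64"
    return "FIXED_OR_BINARY"


def decode_unscaled(unscaled: int, scale: int) -> Decimal:
    """Turn a stored unscaled integer into a decimal value.

    A decimal column stores the unscaled integer; the logical value is
    unscaled * 10**(-scale).
    """
    return Decimal(unscaled).scaleb(-scale)


if __name__ == "__main__":
    for p in (1, 9, 10, 18, 19):
        print(p, physical_type_for_precision(p))
    print(decode_unscaled(12345, 2))
```

The sketch only shows why the two integer widths are sufficient for the stated precision ranges; how the column readers dispatch on the physical type is defined in be/src/exec/parquet-column-readers.cc and is not reproduced here.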
