Henry Robinson has uploaded a new patch set (#3).

Change subject: IMPALA-2494: Support for byte array-encoded decimals in Parquet scanner
......................................................................

IMPALA-2494: Support for byte array-encoded decimals in Parquet scanner

* Extend metadata checks to allow more than one possible physical type
  for a given logical type.
* Change decimal decoding to handle non-fixed-length format in same path
  as fixed-length encoding.

Testing:
* Query test that decodes both plain and dictionary-encoded decimals
  using binary encoding.

Perf:
* Tested computing SUM(col) for 1 billion distinct dictionary-encoded
  decimal(12,2) values using FIXED_BYTE_ARRAY physical type encoding.
* The overhead of decoding with the extra branch was measured at 1s;
  i.e. the per-decode overhead is 1ns.

Change-Id: If95171e65aa48f08b08b8e87f4555dc75e867977
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-common.h
M be/src/exec/parquet-metadata-utils.cc
A testdata/data/binary_decimal_dictionary.parquet
A testdata/data/binary_decimal_no_dictionary.parquet
A testdata/workloads/functional-query/queries/QueryTest/decimal-encodings.test
M tests/query_test/test_scanners.py
7 files changed, 135 insertions(+), 58 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/5115/3
--
To view, visit http://gerrit.cloudera.org:8080/5115
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If95171e65aa48f08b08b8e87f4555dc75e867977
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
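
For readers unfamiliar with the encoding mentioned in the commit message above, here is a minimal sketch of how a byte array-encoded decimal might be decoded through one path regardless of whether the length is fixed or per-value. It is not the code from this patch. It assumes only what the Parquet format specifies: the unscaled decimal value is stored as big-endian two's complement bytes. The function name DecodeBigEndianDecimal and the 64-bit limit are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not Impala's actual API). Decodes the unscaled value
// of a Parquet decimal stored as big-endian two's complement bytes.
// The same routine serves both physical types: for a fixed-length column
// 'len' is constant, for a variable-length column it is read per value.
// Returns false if 'len' is zero or the value cannot fit in 64 bits.
inline bool DecodeBigEndianDecimal(const uint8_t* bytes, size_t len, int64_t* out) {
  if (len == 0 || len > sizeof(int64_t)) return false;
  // Start from all ones when the sign bit is set so that the bytes not
  // covered by the input end up sign-extended.
  uint64_t v = (bytes[0] & 0x80) ? ~uint64_t{0} : uint64_t{0};
  for (size_t i = 0; i < len; ++i) {
    v = (v << 8) | bytes[i];
  }
  // Copy the bit pattern into the signed result without a narrowing cast.
  std::memcpy(out, &v, sizeof(v));
  return true;
}
```

With a decimal(12,2) column, the two bytes 0x30 0x39 decode to the unscaled value 12345, which represents 123.45 once the scale is applied. A real scanner would also need to validate the decoded value against the declared precision, which this sketch leaves out.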
