Henry Robinson has uploaded a new patch set (#2).

Change subject: IMPALA-2494: Support for byte array-encoded decimals in Parquet 
scanner
......................................................................

IMPALA-2494: Support for byte array-encoded decimals in Parquet scanner

 * Extend metadata checks to allow more than one possible physical type
   for a given logical type.
 * Change decimal decoding to handle the non-fixed-length (BYTE_ARRAY) format
   in the same path as the fixed-length encoding.
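
As a rough illustration of the second point, the sketch below decodes a
plain-encoded decimal from either physical type through one shared path. It is
not the code in this patch: every name here (PhysicalType, DecodeUnscaled,
DecodePlainDecimal) is hypothetical, and it assumes values that fit in 8 bytes.
It relies only on two facts from the Parquet format: byte-array decimals store
the unscaled value as big-endian two's complement, and plain-encoded BYTE_ARRAY
values carry a 4-byte little-endian length prefix that FIXED_LEN_BYTE_ARRAY
values lack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout; not taken from the Impala source.
enum class PhysicalType { FIXED_LEN_BYTE_ARRAY, BYTE_ARRAY };

// Decodes a big-endian two's-complement unscaled decimal of 1..8 bytes.
inline int64_t DecodeUnscaled(const uint8_t* bytes, size_t len) {
  assert(len >= 1 && len <= sizeof(int64_t));
  // Sign-extend: start from all ones when the top bit of the first byte is set.
  uint64_t v = (bytes[0] & 0x80) ? ~uint64_t{0} : uint64_t{0};
  for (size_t i = 0; i < len; ++i) v = (v << 8) | bytes[i];
  // Assumes a two's-complement target (guaranteed from C++20 onwards).
  return static_cast<int64_t>(v);
}

// Returns the number of bytes consumed, or -1 if the buffer is too short or
// the value is wider than this sketch supports. 'fixed_len' is only consulted
// for FIXED_LEN_BYTE_ARRAY.
inline int DecodePlainDecimal(const uint8_t* buf, size_t buf_len,
                              PhysicalType type, size_t fixed_len,
                              int64_t* out) {
  size_t header = 0;
  size_t len = fixed_len;
  // The single extra branch: variable-length values read their own length.
  if (type == PhysicalType::BYTE_ARRAY) {
    if (buf_len < 4) return -1;
    len = static_cast<uint32_t>(buf[0]) |
          (static_cast<uint32_t>(buf[1]) << 8) |
          (static_cast<uint32_t>(buf[2]) << 16) |
          (static_cast<uint32_t>(buf[3]) << 24);
    header = 4;
  }
  if (len < 1 || len > sizeof(int64_t) || buf_len - header < len) return -1;
  *out = DecodeUnscaled(buf + header, len);
  return static_cast<int>(header + len);
}
```

With this shape, a decimal(12,2) value such as 123.45 is held as the unscaled
integer 12345 regardless of which physical type carried it; only the length
lookup differs between the two encodings.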

Testing:

 * Query test that decodes dictionary-encoded decimals using binary
   encoding.

Perf:

 * Tested computing SUM(col) for 1 billion distinct dictionary-encoded
   decimal(12,2) values using FIXED_LEN_BYTE_ARRAY physical type encoding.
 * The overhead of decoding with the extra branch was measured at 1s in
   total; i.e. the per-decode overhead is about 1ns (1s / 10^9 values).

Change-Id: If95171e65aa48f08b08b8e87f4555dc75e867977
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-common.h
M be/src/exec/parquet-metadata-utils.cc
A testdata/data/byte_array_decimal_dict_encoded.parquet
A testdata/workloads/functional-query/queries/QueryTest/decimal-encodings.test
M tests/query_test/test_scanners.py
6 files changed, 119 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/5115/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5115
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If95171e65aa48f08b08b8e87f4555dc75e867977
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
