Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9799 )

Change subject: IMPALA-4123 (prep): Parquet column reader cleanup
......................................................................

IMPALA-4123 (prep): Parquet column reader cleanup

Some miscellaneous cleanup to make it easier to understand and make
future changes to the Parquet scanner.

A lot of the refactoring is about more cleanly separating functions
so that they have clearer purpose, e.g.:
* Functions that strictly do decoding, i.e. materialize values, convert
  and validate them. These are changed to operate on single values, not
  tuples.
* Functions that are used for the non-batched decoding path (i.e. driven
  by CollectionColumnReader or BoolColumnReader).
* Functions that dispatch to a templated implementation based on one or
  more runtime values.

Other misc changes:
* Move large functions out of class bodies.
* Use parquet::Encoding instead of bool to indicate encoding.
* Add some additional DCHECKs.

Testing:
* Ran exhaustive tests
* Ran fuzz test in a loop

Change-Id: Ibc00352df3a0b2d605f872ae7e43db2dc90faab1
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/util/bit-stream-utils.h
M be/src/util/rle-encoding.h
5 files changed, 226 insertions(+), 166 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/9799/1
--
To view, visit http://gerrit.cloudera.org:8080/9799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibc00352df3a0b2d605f872ae7e43db2dc90faab1
Gerrit-Change-Number: 9799
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
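
For readers unfamiliar with the pattern named in the commit message, here is a minimal sketch of a function that dispatches to a templated implementation based on a runtime value, using an enum rather than a bool to indicate the encoding. It is not the code from this change. Every name here (Encoding, DecodeValue, DecodeValueDispatch) is hypothetical, the enum only stands in for parquet::Encoding, assert stands in for DCHECK, and the one-byte dictionary index is a simplification (real Parquet dictionary indices are RLE/bit-packed).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for parquet::Encoding; the real enum has more members.
enum class Encoding { PLAIN, PLAIN_DICTIONARY };

// Hypothetical single-value decoder. The encoding is a template parameter, so
// the branch on it is a compile-time constant within each instantiation.
// Returns false if the input is truncated or fails validation.
template <Encoding ENCODING>
bool DecodeValue(const uint8_t** data, const uint8_t* end,
                 const int32_t* dict, int dict_size, int32_t* out) {
  if (ENCODING == Encoding::PLAIN_DICTIONARY) {
    // Simplifying assumption: each dictionary index occupies exactly one byte.
    if (*data >= end) return false;
    uint8_t idx = **data;
    ++*data;
    if (idx >= dict_size) return false;  // Validate the index before use.
    *out = dict[idx];
    return true;
  }
  // PLAIN: a 4-byte little-endian integer.
  if (end - *data < 4) return false;
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i) v |= static_cast<uint32_t>((*data)[i]) << (8 * i);
  *data += 4;
  *out = static_cast<int32_t>(v);
  return true;
}

// Dispatcher: converts the runtime encoding into a template argument once,
// so callers do not pass a bool whose meaning is unclear at the call site.
bool DecodeValueDispatch(Encoding encoding, const uint8_t** data,
                         const uint8_t* end, const int32_t* dict,
                         int dict_size, int32_t* out) {
  switch (encoding) {
    case Encoding::PLAIN:
      return DecodeValue<Encoding::PLAIN>(data, end, dict, dict_size, out);
    case Encoding::PLAIN_DICTIONARY:
      return DecodeValue<Encoding::PLAIN_DICTIONARY>(data, end, dict,
                                                     dict_size, out);
  }
  assert(false && "unexpected encoding");  // Stand-in for a DCHECK.
  return false;
}
```

A switch over an enum also lets the compiler warn when a newly added encoding is not handled, which a bool flag cannot do.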