Amogh Margoor has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/17860 )

Change subject: [WIP] IMPALA-9873: Avoid materialization of columns for filtered 
out rows in Parquet table.
......................................................................

[WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in 
Parquet table.

Currently, the entire row is materialized before it is filtered during
the scan. Cost can be saved by first materializing only the columns
required for filtering, and then materializing the remaining columns
only for the rows that survive the filter.
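
The two-step idea described above can be sketched roughly as follows.
This is an illustration only, not the scanner code in this patch: the
names scan_late_materialization and CountingColumn are hypothetical, and
a column is modelled as a plain in-memory list whose reads are counted
to stand in for decoding cost.

```python
from typing import Callable, Dict, List


class CountingColumn:
    """Hypothetical stand-in for a column reader; counts value reads."""

    def __init__(self, values):
        self._values = list(values)
        self.reads = 0

    def __len__(self):
        return len(self._values)

    def __getitem__(self, i):
        self.reads += 1  # each read models decoding one value
        return self._values[i]


def scan_late_materialization(
    columns: Dict[str, CountingColumn],
    filter_cols: List[str],
    predicate: Callable[[dict], bool],
) -> List[dict]:
    """Return rows passing `predicate`, reading non-filter columns
    only for the rows that survive."""
    num_rows = len(next(iter(columns.values()))) if columns else 0

    # Step 1: materialize only the columns the predicate needs.
    surviving = []
    for i in range(num_rows):
        partial = {c: columns[c][i] for c in filter_cols}
        if predicate(partial):
            surviving.append((i, partial))

    # Step 2: materialize the remaining columns for survivors only.
    other_cols = [c for c in columns if c not in filter_cols]
    rows = []
    for i, partial in surviving:
        row = dict(partial)
        for c in other_cols:
            row[c] = columns[c][i]
        rows.append(row)
    return rows
```

For example, with a four-row batch where the predicate on l_quantity
keeps two rows, the l_quantity column is read four times while a
non-filter column such as l_comment is read only twice. The saving grows
with the selectivity of the filter and the number of non-filter columns.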

Performance:
Performance was measured with a single-daemon, single-threaded impalad
on the TPCH scale 42 lineitem table with 252 million rows of
unsorted data. Up to 2.5x improvement was seen for queries without
page index and up to 4x with page index. The page index queries were
borrowed from the blog:
https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/
More details:
https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing

Testing: TBD

Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/hdfs-columnar-scanner-ir.cc
M be/src/exec/hdfs-columnar-scanner.cc
M be/src/exec/hdfs-columnar-scanner.h
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exec/parquet/parquet-collection-column-reader.cc
M be/src/exec/parquet/parquet-collection-column-reader.h
M be/src/exec/parquet/parquet-column-chunk-reader.cc
M be/src/exec/parquet/parquet-column-chunk-reader.h
M be/src/exec/parquet/parquet-column-readers.cc
M be/src/exec/parquet/parquet-column-readers.h
M be/src/exec/scratch-tuple-batch.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
18 files changed, 774 insertions(+), 121 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/4
--
To view, visit http://gerrit.cloudera.org:8080/17860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Gerrit-Change-Number: 17860
Gerrit-PatchSet: 4
Gerrit-Owner: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
