Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9693 )

Change subject: IMPALA-5842: Write page index in Parquet files
......................................................................

IMPALA-5842: Write page index in Parquet files

This commit builds on the previous work of
Pooja Nilangekar: https://gerrit.cloudera.org/#/c/7464/

The commit implements the write path of PARQUET-922:
"Add column indexes to parquet.thrift". As specified in the
parquet-format, Impala writes the page indexes just before
the footer. This allows much more efficient page filtering
than using the same information from the 'statistics' field
of DataPageHeader.

I updated Pooja's python tests as well.

Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Reviewed-on: http://gerrit.cloudera.org:8080/9693
Reviewed-by: Zoltan Borok-Nagy <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/parquet-column-stats.h
M be/src/exec/parquet-column-stats.inline.h
M be/src/util/CMakeLists.txt
A be/src/util/string-util-test.cc
A be/src/util/string-util.cc
A be/src/util/string-util.h
M common/thrift/parquet.thrift
M testdata/bin/load-dependent-tables.sql
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M tests/query_test/test_chars.py
A tests/query_test/test_parquet_page_index.py
M tests/util/get_parquet_metadata.py
14 files changed, 1,002 insertions(+), 113 deletions(-)

Approvals:
  Zoltan Borok-Nagy: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Gerrit-Change-Number: 9693
Gerrit-PatchSet: 21
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Anonymous Coward #248
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>

Reply via email to