Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12065 )
Change subject: IMPALA-5843: Use page index in Parquet files to skip pages
......................................................................

Patch Set 14:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12065/14//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12065/14//COMMIT_MSG@30
PS14, Line 30: Testing
I looked into test_scanners_fuzz.py and noticed that it has no query with a WHERE clause at all. This means we can be sure that some parts of the page index logic are not tested with corrupted Parquet files. It also leaves holes in the testing of existing logic: for example, row-group-level min/max stats are not exercised either.

I am OK with moving this task to a follow-up Jira.

http://gerrit.cloudera.org:8080/#/c/12065/14/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/12065/14/be/src/exec/parquet/hdfs-parquet-scanner.cc@639
PS14, Line 639: if (state_->query_options().parquet_read_page_index) {
It is not useful to read the page index if there are no suitable predicates for min/max filtering (that is, if min_max_conjunct_evals_ is empty).
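A minimal, standalone sketch of the check suggested in the comment on line 639. It is not the scanner's actual code: QueryOptions, ScannerState and ShouldReadPageIndex are hypothetical stand-ins, and the vector of strings only models min_max_conjunct_evals_ for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the query options consulted by the scanner.
struct QueryOptions {
  bool parquet_read_page_index = true;
};

// Hypothetical stand-in for the scanner state that matters for this decision.
struct ScannerState {
  QueryOptions query_options;
  // Models min_max_conjunct_evals_: one entry per predicate that can be
  // evaluated against min/max statistics. Strings are used only for illustration.
  std::vector<std::string> min_max_conjunct_evals;
};

// Returns true only when the option is enabled AND at least one min/max
// predicate exists. Without such a predicate the page index cannot filter
// any pages, so reading it would be wasted work.
inline bool ShouldReadPageIndex(const ScannerState& state) {
  return state.query_options.parquet_read_page_index &&
         !state.min_max_conjunct_evals.empty();
}
```

With this shape, a scan that has no predicates usable for min/max filtering skips the page index even when parquet_read_page_index is enabled.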
--
To view, visit http://gerrit.cloudera.org:8080/12065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a
Gerrit-Change-Number: 12065
Gerrit-PatchSet: 14
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Pooja Nilangekar <pooja.nilange...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Apr 2019 10:27:42 +0000
Gerrit-HasComments: Yes