Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12065 )

Change subject: IMPALA-5843: Use page index in Parquet files to skip pages
......................................................................


Patch Set 14:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12065/14//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12065/14//COMMIT_MSG@30
PS14, Line 30: Testing
I looked into test_scanners_fuzz.py and noticed that none of its queries has a 
WHERE clause. This means that the predicate-driven parts of the page index 
logic are certainly not tested with corrupted Parquet files. It also reveals a 
gap in the testing of existing logic: row group level min/max stats are not 
exercised either.

I am ok with moving this task to a follow-up Jira.


http://gerrit.cloudera.org:8080/#/c/12065/14/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/12065/14/be/src/exec/parquet/hdfs-parquet-scanner.cc@639
PS14, Line 639:     if (state_->query_options().parquet_read_page_index) {
Reading the page index is not useful when there are no suitable predicates for 
min/max filtering (i.e. when min_max_conjunct_evals_ is empty).
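
A minimal sketch of such a guard, under stated assumptions: ScannerSketch, 
QueryOptions, MinMaxConjunctEval and ShouldReadPageIndex are hypothetical 
stand-ins for illustration, not Impala's actual classes or methods. Only the 
names parquet_read_page_index and min_max_conjunct_evals_ come from the code 
under review.

```cpp
#include <vector>

// Hypothetical stand-in for the query options; only the flag name is taken
// from the reviewed code.
struct QueryOptions {
  bool parquet_read_page_index = true;
};

// Hypothetical placeholder for a min/max predicate evaluator.
struct MinMaxConjunctEval {};

// Hypothetical, heavily simplified scanner state.
struct ScannerSketch {
  QueryOptions query_options;
  std::vector<MinMaxConjunctEval> min_max_conjunct_evals_;

  // Read the page index only when the option is enabled AND there is at
  // least one min/max predicate that could use it to skip pages.
  bool ShouldReadPageIndex() const {
    return query_options.parquet_read_page_index &&
           !min_max_conjunct_evals_.empty();
  }
};
```

With no min/max predicates the function returns false even when the query 
option is on, so the extra reads for the page index would be avoided.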



--
To view, visit http://gerrit.cloudera.org:8080/12065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a
Gerrit-Change-Number: 12065
Gerrit-PatchSet: 14
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Pooja Nilangekar <pooja.nilange...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Apr 2019 10:27:42 +0000
Gerrit-HasComments: Yes
