Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17478 )
Change subject: IMPALA-10709: Min/max filters should be enabled for joins on sorted columns in Parquet tables
......................................................................


Patch Set 33: Code-Review+1

(11 comments)

The code looks great, only found a few nits.

http://gerrit.cloudera.org:8080/#/c/17478/33/be/src/exec/parquet/hdfs-parquet-scanner.h
File be/src/exec/parquet/hdfs-parquet-scanner.h:

http://gerrit.cloudera.org:8080/#/c/17478/33/be/src/exec/parquet/hdfs-parquet-scanner.h@365
PS33, Line 365: end_age_idx
end_page_idx


http://gerrit.cloudera.org:8080/#/c/17478/33/be/src/exec/parquet/hdfs-parquet-scanner.h@366
PS33, Line 366: find not-null pages to be skipped
This part might suggest that this function filters out not-null pages, while IIUC that is a precondition of this function, i.e. there are no null pages in the given range.


http://gerrit.cloudera.org:8080/#/c/17478/33/be/src/exec/parquet/hdfs-parquet-scanner.h@366
PS33, Line 366: define a
              : /// range R for pages to be retained
Probably mention that this function returns the complement of R.


http://gerrit.cloudera.org:8080/#/c/17478/33/be/src/exec/parquet/hdfs-parquet-scanner.h@377
PS33, Line 377: are.
are retained.


http://gerrit.cloudera.org:8080/#/c/17478/32/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/17478/32/be/src/exec/parquet/hdfs-parquet-scanner.cc@1330
PS32, Line 1330: (3) << "Use fast code path to
> In the case of a single page, we should still check against the filter, rig
Sorry, it was my bad.

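
To illustrate the Line 366 comment in hdfs-parquet-scanner.h about returning the complement of R, here is a minimal sketch of the behaviour the doc comment could describe. It is not the code in the patch: `PageRange`, `ComplementOfRetained`, `retain_first` and `retain_last` are hypothetical names, the inclusive-bounds convention is an assumption, and plain `assert` stands in for DCHECK.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a page range; inclusive [first, second].
using PageRange = std::pair<int, int>;

// Given the inclusive input range [start_page_idx, end_page_idx] and a
// retained sub-range R = [retain_first, retain_last] inside it, returns the
// complement of R within the input range, i.e. the pages to be skipped.
// An empty R is expressed as retain_first > retain_last (assumption).
std::vector<PageRange> ComplementOfRetained(int start_page_idx,
    int end_page_idx, int retain_first, int retain_last) {
  assert(start_page_idx >= 0);
  assert(start_page_idx <= end_page_idx);
  std::vector<PageRange> skipped;
  if (retain_first > retain_last) {
    // Nothing is retained, so the whole input range is skipped.
    skipped.emplace_back(start_page_idx, end_page_idx);
    return skipped;
  }
  assert(retain_first >= start_page_idx);
  assert(retain_last <= end_page_idx);
  // Pages before R, if any.
  if (retain_first > start_page_idx) {
    skipped.emplace_back(start_page_idx, retain_first - 1);
  }
  // Pages after R, if any.
  if (retain_last < end_page_idx) {
    skipped.emplace_back(retain_last + 1, end_page_idx);
  }
  return skipped;
}
```

With pages 0..9 and R = [3, 6], this sketch yields the skipped ranges [0, 2] and [7, 9]; when R covers the whole input range, nothing is skipped.
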

http://gerrit.cloudera.org:8080/#/c/17478/33/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/17478/33/be/src/exec/parquet/hdfs-parquet-scanner.cc@a1059
PS33, Line 1059:
nit: old formatting could be retained


http://gerrit.cloudera.org:8080/#/c/17478/33/be/src/exec/parquet/hdfs-parquet-scanner.cc@1119
PS33, Line 1119: MinMaxFilter*
Could be const?


http://gerrit.cloudera.org:8080/#/c/17478/33/be/src/exec/parquet/hdfs-parquet-scanner.cc@1235
PS33, Line 1235: i));
nit: fits prev line


http://gerrit.cloudera.org:8080/#/c/17478/33/be/src/exec/parquet/hdfs-parquet-scanner.cc@1253
PS33, Line 1253: i));
nit: fits prev line


http://gerrit.cloudera.org:8080/#/c/17478/33/be/src/exec/parquet/hdfs-parquet-scanner.cc@1326
PS33, Line 1326: int sz = min_vals.size();
               : DCHECK_EQ(sz, max_vals.size());
nit: 'sz' is not used elsewhere, so it could be
DCHECK_EQ(min_vals.size(), max_vals.size());


http://gerrit.cloudera.org:8080/#/c/17478/33/be/src/exec/parquet/hdfs-parquet-scanner.cc@1328
PS33, Line 1328: DCHECK(skipped_ranges && skipped_ranges->size() == 0);
nit: maybe also add DCHECKs for start_page_idx >= 0, and end_page_idx < max_vals.size()


-- 
To view, visit http://gerrit.cloudera.org:8080/17478
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I28c19c4b39b01ffa7d275fb245be85c28e9b2963
Gerrit-Change-Number: 17478
Gerrit-PatchSet: 33
Gerrit-Owner: Qifan Chen <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 08 Jun 2021 10:24:18 +0000
Gerrit-HasComments: Yes
