Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17295 )
Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early ...................................................................... Patch Set 27: (12 comments) Found a few nits, but looks good overall. http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.h File be/src/exec/filter-context.h: http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.h@155 PS27, Line 155: static bool ShouldRejectFilterBasedOnColumnStats( nit: I'm OK with reformatting, but I'm not sure if it was intended http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.cc File be/src/exec/filter-context.cc: http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.cc@231 PS27, Line 231: example Could you please update this example? http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@984 PS27, Line 984: scalar_reader->offset_index_ nit: simply 'offset_index' from at L978? http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@997 PS27, Line 997: DCHECK nit: DCHECK_GE could be used. It has the advantage that in case of failure it prints the actual values. http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@1010 PS27, Line 1010: Expected Any reason why we don't want the error message to be logged here? http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@1107 PS27, Line 1107: nit: probably unintended space http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.cc File be/src/exec/parquet/parquet-column-stats.cc: http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.cc@281 PS27, Line 281: const int remainder = num_values % batch; nit: do we need to calculate remainder? In the second for-loop we could have for (int i = pos; i < num_values; ++i) Or, use pos itself: for (; pos < num_values; ++pos) http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.cc@321 PS27, Line 321: const int remainder = num_values % batch; nit: same as above http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.inline.h File be/src/exec/parquet/parquet-column-stats.inline.h: http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.inline.h@143 PS27, Line 143: DCHECK(buffer.size() == sizeof(int32_t)); : DCHECK(parquet_type == parquet::Type::INT32); nit: DCHECK_EQ http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.inline.h@159 PS27, Line 159: DCHECK(buffer.size() == sizeof(int32_t)); : DCHECK(parquet_type == parquet::Type::INT32); nit: DCHECK_EQ http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc File be/src/exec/partitioned-hash-join-builder.cc: http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc@345 PS27, Line 345: not_useful = false; nit: I think it'd be a bit more readable if we decrease the negations, i.e. only call the variable 'useful'. http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/util/min-max-filter-ir.cc File be/src/util/min-max-filter-ir.cc: http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/util/min-max-filter-ir.cc@114 PS27, Line 114: predicion nit: prediction -- To view, visit http://gerrit.cloudera.org:8080/17295 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183 Gerrit-Change-Number: 17295 Gerrit-PatchSet: 27 Gerrit-Owner: Qifan Chen <qc...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Tue, 18 May 2021 15:23:17 +0000 Gerrit-HasComments: Yes