Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17295 )

Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early
......................................................................


Patch Set 27:

(12 comments)

Found a few nits, but looks good overall.

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.h
File be/src/exec/filter-context.h:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.h@155
PS27, Line 155:   static bool ShouldRejectFilterBasedOnColumnStats(
nit: I'm OK with reformatting, but I'm not sure if it was intended


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.cc
File be/src/exec/filter-context.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.cc@231
PS27, Line 231: example
Could you please update this example?


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@984
PS27, Line 984: scalar_reader->offset_index_
nit: simply 'offset_index' from at L978?


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@997
PS27, Line 997: DCHECK
nit: DCHECK_GE could be used. It has the advantage that in case of failure it 
prints the actual values.


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@1010
PS27, Line 1010: Expected
Any reason why we don't want the error message to be logged here?


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@1107
PS27, Line 1107:
nit: probably unintended space


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.cc
File be/src/exec/parquet/parquet-column-stats.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.cc@281
PS27, Line 281:   const int remainder = num_values % batch;
nit: do we need to calculate remainder? In the second for-loop we could have

 for (int i = pos; i < num_values; ++i)

Or, use pos itself:

 for (; pos < num_values; ++pos)


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.cc@321
PS27, Line 321:   const int remainder = num_values % batch;
nit: same as above


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.inline.h
File be/src/exec/parquet/parquet-column-stats.inline.h:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.inline.h@143
PS27, Line 143:   DCHECK(buffer.size() == sizeof(int32_t));
              :   DCHECK(parquet_type == parquet::Type::INT32);
nit: DCHECK_EQ


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.inline.h@159
PS27, Line 159:   DCHECK(buffer.size() == sizeof(int32_t));
              :   DCHECK(parquet_type == parquet::Type::INT32);
nit: DCHECK_EQ


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc@345
PS27, Line 345: not_useful = false;
nit: I think it'd be a bit more readable if we decrease the negations, i.e. 
only call the variable 'useful'.


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/util/min-max-filter-ir.cc
File be/src/util/min-max-filter-ir.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/util/min-max-filter-ir.cc@114
PS27, Line 114: predicion
nit: prediction



--
To view, visit http://gerrit.cloudera.org:8080/17295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 27
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Tue, 18 May 2021 15:23:17 +0000
Gerrit-HasComments: Yes

Reply via email to