Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/15683 )
Change subject: IMPALA-3741 [part 2]: Push runtime bloom filter to Kudu ...................................................................... Patch Set 9: (17 comments) http://gerrit.cloudera.org:8080/#/c/15683/9/be/CMakeLists.txt File be/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/15683/9/be/CMakeLists.txt@234 PS9, Line 234: "-DKUDU_HEADERS_NO_STUBS" > nit formatting: trailing whitespace and you can wrap this into the rest of will fix as suggested. http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/filter-context.cc File be/src/exec/filter-context.cc: http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/filter-context.cc@91 PS9, Line 91: //local_bloom_filter->Insert(val, expr_eval->root().type()); > I guess this was left by accident? Right, will remove these two lines. http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/filter-context.cc@106 PS9, Line 106: // An example of the generated code for TPCH-Q2: RF002 -> n_regionkey > This needs to be updated (though of course it will only change slightly). I will add LlvmCodeGen::Print(*fn) at the end of function to dump the codegen functions, then update the sample in the functions' comments. http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/kudu-scanner.cc File be/src/exec/kudu-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/kudu-scanner.cc@232 PS9, Line 232: } : else if > nit: formatting ('else if' should go on the same line as prior '}') Will fix it. http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/kudu-scanner.cc@234 PS9, Line 234: continue; > Brief comment, eg. 'This filter won't actually remove any rows so we don't will add comments http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/kudu-scanner.cc@236 PS9, Line 236: else > nit: it doesn't actually change how the code works, but it would be better will remove "else" http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/kudu-scanner.cc@238 PS9, Line 238: if (filter != nullptr) { > I think we can save ourselves some duplication and an indention level by ch Agree, will change the code. http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/kudu-scanner.cc@239 PS9, Line 239: auto it = ctx.filter->filter_desc().planid_to_target_ndx.find( : scan_node_->id()); : const TRuntimeFilterTargetDesc &target_desc = : ctx.filter->filter_desc().targets[it->second]; : const string &col_name = target_desc.kudu_col_name; : DCHECK(col_name != ""); > This is duplicated below, I think we can eliminate the duplication by movin That's right. Will remove the duplication. http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.h File be/src/util/bloom-filter.h: http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.h@53 PS9, Line 53: class BloomFilterBufferAllocator; > I don't think this is necessary will remove it. http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.h@73 PS9, Line 73: BloomFilterBufferAllocator > nit: maybe name this ImpalaBloomFilterBufferAllocator to ensure there's no Will rename it as suggested http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.h@91 PS9, Line 91: std::shared_ptr<kudu::BlockBloomFilterBufferAllocatorIf> Clone() const override { > If this is something that's really only used in Kudu's internal testing and Will add LOG(FATAL) and simplify the logic of allocate/close functions. http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.h@111 PS9, Line 111: /// A BloomFilter stores sets of items and offers a query operation indicating whether or > This class comment could probably use some updating to reflect the fact tha Will update the comments. http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.h@171 PS9, Line 171: void Insert(void* val, const ColumnType& col_type) noexcept; > Is this version of Insert() used anywhere? Same with the equivalent Find() Will remove them http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.h@241 PS9, Line 241: HashAlgorithm hash_algorithm_; > As far as I can tell, this isn't actually getting used anywhere, and I don' Will remove it. http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.cc File be/src/util/bloom-filter.cc: http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.cc@205 PS9, Line 205: if (is_allocated_) Close(); > I suspect its probably supposed to be guaranteed that FreeBuffer() is alway Will add LOG(DFATAL). http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.cc@226 PS9, Line 226: Close(); // Ensure that any previously allocated memory is released. > I wonder if this is really necessary - I would hope that Kudu provides a gu Will LOG(DFALTAL) if is allocated. http://gerrit.cloudera.org:8080/#/c/15683/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java: http://gerrit.cloudera.org:8080/#/c/15683/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@723 PS9, Line 723: if (slotRef == null || slotRef.getDesc().getColumn() == null : || (targetExpr instanceof CastExpr) : || filter.getExprCompOp() == Operator.NOT_DISTINCT) { : continue; : } > This is almost identical to the 'if' below. I think we can reduce code dupl They are not identical. Min-max filter support integer casting, Bloom filter don't allow any casting. -- To view, visit http://gerrit.cloudera.org:8080/15683 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9100076f68ea299ddb6ec8bc027cac7a47f5d754 Gerrit-Change-Number: 15683 Gerrit-PatchSet: 9 Gerrit-Owner: Wenzhe Zhou <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Bankim Bhavsar <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Thu, 07 May 2020 18:13:37 +0000 Gerrit-HasComments: Yes
