Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15683 )

Change subject: IMPALA-3741 [part 2]: Push runtime bloom filter to Kudu
......................................................................


Patch Set 9:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/15683/9/be/CMakeLists.txt
File be/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/15683/9/be/CMakeLists.txt@234
PS9, Line 234:   "-DKUDU_HEADERS_NO_STUBS"
> nit formatting: trailing whitespace and you can wrap this into the rest of
will fix as suggested.


http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/filter-context.cc
File be/src/exec/filter-context.cc:

http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/filter-context.cc@91
PS9, Line 91:     //local_bloom_filter->Insert(val, expr_eval->root().type());
> I guess this was left by accident?
Right, will remove these two lines.


http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/filter-context.cc@106
PS9, Line 106: // An example of the generated code for TPCH-Q2: RF002 -> 
n_regionkey
> This needs to be updated (though of course it will only change slightly). I
will add LlvmCodeGen::Print(*fn) at the end of function to dump the codegen 
functions, then update the sample in the functions' comments.


http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/kudu-scanner.cc
File be/src/exec/kudu-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/kudu-scanner.cc@232
PS9, Line 232: }
             :       else if
> nit: formatting ('else if' should go on the same line as prior '}')
Will fix it.


http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/kudu-scanner.cc@234
PS9, Line 234:         continue;
> Brief comment, eg. 'This filter won't actually remove any rows so we don't
will add comments


http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/kudu-scanner.cc@236
PS9, Line 236: else
> nit: it doesn't actually change how the code works, but it would be better
will remove "else"


http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/kudu-scanner.cc@238
PS9, Line 238: if (filter != nullptr) {
> I think we can save ourselves some duplication and an indention level by ch
Agree, will change the code.


http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/exec/kudu-scanner.cc@239
PS9, Line 239: auto it = ctx.filter->filter_desc().planid_to_target_ndx.find(
             :               scan_node_->id());
             :           const TRuntimeFilterTargetDesc &target_desc =
             :               ctx.filter->filter_desc().targets[it->second];
             :           const string &col_name = target_desc.kudu_col_name;
             :           DCHECK(col_name != "");
> This is duplicated below, I think we can eliminate the duplication by movin
That's right. Will remove the duplication.


http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.h
File be/src/util/bloom-filter.h:

http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.h@53
PS9, Line 53: class BloomFilterBufferAllocator;
> I don't think this is necessary
will remove it.


http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.h@73
PS9, Line 73: BloomFilterBufferAllocator
> nit: maybe name this ImpalaBloomFilterBufferAllocator to ensure there's no
Will rename it as suggested


http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.h@91
PS9, Line 91:   std::shared_ptr<kudu::BlockBloomFilterBufferAllocatorIf> 
Clone() const override {
> If this is something that's really only used in Kudu's internal testing and
Will add LOG(FATAL) and simplify the logic of allocate/close functions.


http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.h@111
PS9, Line 111: /// A BloomFilter stores sets of items and offers a query 
operation indicating whether or
> This class comment could probably use some updating to reflect the fact tha
Will update the comments.


http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.h@171
PS9, Line 171:   void Insert(void* val, const ColumnType& col_type) noexcept;
> Is this version of Insert() used anywhere? Same with the equivalent Find()
Will remove them


http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.h@241
PS9, Line 241: HashAlgorithm hash_algorithm_;
> As far as I can tell, this isn't actually getting used anywhere, and I don'
Will remove it.


http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.cc
File be/src/util/bloom-filter.cc:

http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.cc@205
PS9, Line 205:   if (is_allocated_) Close();
> I suspect its probably supposed to be guaranteed that FreeBuffer() is alway
Will add LOG(DFATAL).


http://gerrit.cloudera.org:8080/#/c/15683/9/be/src/util/bloom-filter.cc@226
PS9, Line 226:   Close(); // Ensure that any previously allocated memory is 
released.
> I wonder if this is really necessary - I would hope that Kudu provides a gu
Will LOG(DFALTAL) if is allocated.


http://gerrit.cloudera.org:8080/#/c/15683/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/15683/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@723
PS9, Line 723: if (slotRef == null || slotRef.getDesc().getColumn() == null
             :               || (targetExpr instanceof CastExpr)
             :               || filter.getExprCompOp() == 
Operator.NOT_DISTINCT) {
             :             continue;
             :           }
> This is almost identical to the 'if' below. I think we can reduce code dupl
They are not identical. Min-max filter support integer casting, Bloom filter 
don't  allow any casting.



--
To view, visit http://gerrit.cloudera.org:8080/15683
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9100076f68ea299ddb6ec8bc027cac7a47f5d754
Gerrit-Change-Number: 15683
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Thu, 07 May 2020 18:13:37 +0000
Gerrit-HasComments: Yes

Reply via email to