Bankim Bhavsar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15373 )

Change subject: [util] Import "Or" function to BlockBloomFilter from Impala
......................................................................


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter-test.cc
File src/kudu/util/block_bloom_filter-test.cc:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter-test.cc@304
PS1, Line 304:   for (int i = 11; i < 50; ++i) ASSERT_TRUE(bf1->Find(i));
> if this fails, how to know what 'i' it was?
Good catch. This one was missed.


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h
File src/kudu/util/block_bloom_filter.h:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h@153
PS1, Line 153: Or
> For more elegant syntax alternative, maybe introduce
I implemented operator |= after your feedback. However after incorporating 
other feedback related to invalid input arguments, changed return datatype of 
Or() function from void to Status. So implementing operator |= won't be 
possible unless we throw exception for invalid input case. Hence not 
implementing operator |=.


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h@161
PS1, Line 161:   static constexpr BlockBloomFilter* const kAlwaysTrueFilter = 
nullptr;
> Does this need to be public? Why?
Yes.
Impala uses special case of an always true Bloom filter.
Some examples:
https://github.com/apache/impala/blob/master/be/src/exec/partitioned-hash-join-builder.cc#L805
https://github.com/apache/impala/blob/master/be/src/runtime/runtime-filter-bank.cc#L340


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h@229
PS1, Line 229: AVX
> AVX2 ?
Done


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc
File src/kudu/util/block_bloom_filter.cc:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc@50
PS1, Line 50: constexpr BlockBloomFilter* const 
BlockBloomFilter::kAlwaysTrueFilter;
> Does this need to be initialized to something meaningful? In the header it'
Since the variable is constexpr it needs to be initialized in the class 
declaration in the header and since it's a static it needs to be defined here 
in the .cc file.
This puzzled me as well but not initializing to nullptr in the class 
declaration led to a warning and initializing to nullptr in .cc file also led 
to a warning.
Eventually I just followed the code in Impala and no more warnings :).

https://github.com/apache/impala/blob/master/be/src/util/bloom-filter.h#L102
https://github.com/apache/impala/blob/master/be/src/util/bloom-filter.cc#L41


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc@287
PS1, Line 287: DCHECK_NE(&other, kAlwaysTrueFilter);
> Why is this disallowed?  Could you add a short explanation?
Done


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc@291
PS1, Line 291: DCHECK_EQ
> BTW, is this Or() method protected from inputs from the wild that might not
My bad, I copied the Impala implementation without giving more thought. I've 
changed the return type Status to return back an error in such cases.


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter_avx2.cc
File src/kudu/util/block_bloom_filter_avx2.cc:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter_avx2.cc@86
PS1, Line 86: DCHECK_EQ
> Is it guaranteed that debug builds provide enough coverage to make sure n %
Yes, checking in debug provides enough coverage. This is checking that the size 
of the directory structure is multiple of 32 bytes.
Directory consists of multiple buckets (at least 1) where each bucket comprises 
of 8 BucketWords and each BucketWord is 4 bytes.
https://github.com/apache/kudu/blob/master/src/kudu/util/block_bloom_filter.h#L157



--
To view, visit http://gerrit.cloudera.org:8080/15373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
Gerrit-Change-Number: 15373
Gerrit-PatchSet: 1
Gerrit-Owner: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Mon, 09 Mar 2020 22:47:42 +0000
Gerrit-HasComments: Yes

Reply via email to