Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15373 )
Change subject: [util] Import "Or" function to BlockBloomFilter from Impala ...................................................................... Patch Set 1: (8 comments) http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter-test.cc File src/kudu/util/block_bloom_filter-test.cc: http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter-test.cc@304 PS1, Line 304: for (int i = 11; i < 50; ++i) ASSERT_TRUE(bf1->Find(i)); > if this fails, how to know what 'i' it was? Good catch. This one was missed. http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h File src/kudu/util/block_bloom_filter.h: http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h@153 PS1, Line 153: Or > For more elegant syntax alternative, maybe introduce I implemented operator |= after your feedback. However after incorporating other feedback related to invalid input arguments, changed return datatype of Or() function from void to Status. So implementing operator |= won't be possible unless we throw exception for invalid input case. Hence not implementing operator |=. http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h@161 PS1, Line 161: static constexpr BlockBloomFilter* const kAlwaysTrueFilter = nullptr; > Does this need to be public? Why? Yes. Impala uses special case of an always true Bloom filter. Some examples: https://github.com/apache/impala/blob/master/be/src/exec/partitioned-hash-join-builder.cc#L805 https://github.com/apache/impala/blob/master/be/src/runtime/runtime-filter-bank.cc#L340 http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h@229 PS1, Line 229: AVX > AVX2 ? Done http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc File src/kudu/util/block_bloom_filter.cc: http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc@50 PS1, Line 50: constexpr BlockBloomFilter* const BlockBloomFilter::kAlwaysTrueFilter; > Does this need to be initialized to something meaningful? In the header it' Since the variable is constexpr it needs to be initialized in the class declaration in the header and since it's a static it needs to be defined here in the .cc file. This puzzled me as well but not initializing to nullptr in the class declaration led to a warning and initializing to nullptr in .cc file also led to a warning. Eventually I just followed the code in Impala and no more warnings :). https://github.com/apache/impala/blob/master/be/src/util/bloom-filter.h#L102 https://github.com/apache/impala/blob/master/be/src/util/bloom-filter.cc#L41 http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc@287 PS1, Line 287: DCHECK_NE(&other, kAlwaysTrueFilter); > Why is this disallowed? Could you add a short explanation? Done http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc@291 PS1, Line 291: DCHECK_EQ > BTW, is this Or() method protected from inputs from the wild that might not My bad, I copied the Impala implementation without giving more thought. I've changed the return type Status to return back an error in such cases. http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter_avx2.cc File src/kudu/util/block_bloom_filter_avx2.cc: http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter_avx2.cc@86 PS1, Line 86: DCHECK_EQ > Is it guaranteed that debug builds provide enough coverage to make sure n % Yes, checking in debug provides enough coverage. This is checking that the size of the directory structure is multiple of 32 bytes. Directory consists of multiple buckets (at least 1) where each bucket comprises of 8 BucketWords and each BucketWord is 4 bytes. https://github.com/apache/kudu/blob/master/src/kudu/util/block_bloom_filter.h#L157 -- To view, visit http://gerrit.cloudera.org:8080/15373 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f Gerrit-Change-Number: 15373 Gerrit-PatchSet: 1 Gerrit-Owner: Bankim Bhavsar <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Bankim Bhavsar <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Mon, 09 Mar 2020 22:47:42 +0000 Gerrit-HasComments: Yes
