[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side This change switches the implementation of the ColumnPredicate to use the BlockBloomFilter for the BloomFilter predicate on the server side. Earlier implementation was still experimental and didn't provide public client APIs that actually use this BloomFilter predicate so taken the liberty to make incompatible wire protocol changes. Updated BlockBloomFilter to take hash_algorithm and hash_seed. This make serializing and deserializing the BlockBloomFilter convenient and removes the need of BloomFilterInner in ColumnPredicate. Added overloaded Insert()/Find() functions to BlockBloomFilter that take Slice parameter and hashes the key before insertion/lookup. Most of the change involves refactoring the implementation including the unit tests. Currently only FAST_HASH algorithm is supported since 32-bit versions of MURMUR2 and CITY_HASH are not currently implemented. Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Reviewed-on: http://gerrit.cloudera.org:8080/15034 Reviewed-by: Adar Dembo Tested-by: Adar Dembo Reviewed-by: helifu --- M src/kudu/common/column_predicate-test.cc M src/kudu/common/column_predicate.cc M src/kudu/common/column_predicate.h M src/kudu/common/common.proto M src/kudu/common/wire_protocol-test.cc M src/kudu/common/wire_protocol.cc M src/kudu/tablet/cfile_set-test.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/block_bloom_filter-test.cc M src/kudu/util/block_bloom_filter.cc M src/kudu/util/block_bloom_filter.h A src/kudu/util/block_bloom_filter.proto M src/kudu/util/hash.proto M src/kudu/util/hash_util-test.cc M src/kudu/util/hash_util.h 15 files changed, 488 insertions(+), 373 deletions(-) Approvals: Adar Dembo: Looks good to me, approved; Verified helifu: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 14 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 13: Code-Review+2 > Leaving open in case Lifu (or others) have more comments. Thanks. It looks good to me apart from the impact of the evaluation (it's very CPU intensive). It looks good to me -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 13 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Fri, 14 Feb 2020 03:11:51 + Gerrit-HasComments: No
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Adar Dembo has removed Kudu Jenkins from this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Removed reviewer Kudu Jenkins with the following votes: * Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 13 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 13: Verified+1 Code-Review+2 Leaving open in case Lifu (or others) have more comments. -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 13 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Fri, 14 Feb 2020 01:51:16 + Gerrit-HasComments: No
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 13: Could the reviewers take a look at the latest patchset please? -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 13 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Thu, 13 Feb 2020 19:20:08 + Gerrit-HasComments: No
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 13: > Patch Set 13: Verified-1 > > Build Failed > > http://jenkins.kudu.apache.org/job/kudu-gerrit/20374/ : FAILURE Unrelated test failure due to a RPC timeout. /home/jenkins-slave/workspace/kudu-master/3/src/kudu/integration-tests/location_assignment-itest.cc:187 Failed Bad status: Timed out: ListTabletServers RPC failed: ListTabletServers RPC to 127.1.75.62:41813 timed out after 0.736s (SENT) -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 13 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Wed, 12 Feb 2020 19:49:18 + Gerrit-HasComments: No
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 13: > > In general, the bloom filter is used as run-time filter, > especially in a hash-join case. For a computer engine, such as > impala, it will wait 1 second(default) to produce the bloom filter > and then push it down. So, I have another suggestion: maybe we can > push it down in the middle of the scan while the filter is not > arrived within the specified interval. It should be in the next > patch^_^ > > > > computer engine -> kudu client -> kudu server > >[1][2][3] > >throw away > >[0] > > That's interesting: are you suggesting we need the ability to add a > bloom filter predicate to an ongoing scan, rather than just when > starting a new scan? There's a lot of existing machinery that > treats predicates as just another aspect of scan configuration > (along with e.g. projections), to be applied to a new scanner but > immutable after that. I think it'd be a fair amount of work to > change that. > > As an alternative, would it be possible to delay the onset of a > Kudu scan from Impala until the bloom filter can be constructed? Or > will doing that stall the entire query pipeline? Nope, I mean that adding a bloom filter predicate to an ongoing scan is a new independent feature, and we can implement it in the next patch. No, it's impossible to delay the onset of a kudu scan. For example, if the time to wait for the bloom filter is greater than the time to scan all of the data, the waiting would not be the best option. -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 13 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Mon, 10 Feb 2020 01:49:09 + Gerrit-HasComments: No
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 13: > In general, the bloom filter is used as run-time filter, especially in a > hash-join case. For a computer engine, such as impala, it will wait 1 > second(default) to produce the bloom filter and then push it down. So, I have > another suggestion: maybe we can push it down in the middle of the scan while > the filter is not arrived within the specified interval. It should be in the > next patch^_^ > > computer engine -> kudu client -> kudu server >[1][2][3] >throw away >[0] That's interesting: are you suggesting we need the ability to add a bloom filter predicate to an ongoing scan, rather than just when starting a new scan? There's a lot of existing machinery that treats predicates as just another aspect of scan configuration (along with e.g. projections), to be applied to a new scanner but immutable after that. I think it'd be a fair amount of work to change that. As an alternative, would it be possible to delay the onset of a Kudu scan from Impala until the bloom filter can be constructed? Or will doing that stall the entire query pipeline? -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 13 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Sat, 08 Feb 2020 05:53:43 + Gerrit-HasComments: No
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Hello Kudu Jenkins, helifu, Yao Xu, Adar Dembo, ZhangYao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15034 to look at the new patch set (#13). Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side This change switches the implementation of the ColumnPredicate to use the BlockBloomFilter for the BloomFilter predicate on the server side. Earlier implementation was still experimental and didn't provide public client APIs that actually use this BloomFilter predicate so taken the liberty to make incompatible wire protocol changes. Updated BlockBloomFilter to take hash_algorithm and hash_seed. This make serializing and deserializing the BlockBloomFilter convenient and removes the need of BloomFilterInner in ColumnPredicate. Added overloaded Insert()/Find() functions to BlockBloomFilter that take Slice parameter and hashes the key before insertion/lookup. Most of the change involves refactoring the implementation including the unit tests. Currently only FAST_HASH algorithm is supported since 32-bit versions of MURMUR2 and CITY_HASH are not currently implemented. Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a --- M src/kudu/common/column_predicate-test.cc M src/kudu/common/column_predicate.cc M src/kudu/common/column_predicate.h M src/kudu/common/common.proto M src/kudu/common/wire_protocol-test.cc M src/kudu/common/wire_protocol.cc M src/kudu/tablet/cfile_set-test.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/block_bloom_filter-test.cc M src/kudu/util/block_bloom_filter.cc M src/kudu/util/block_bloom_filter.h A src/kudu/util/block_bloom_filter.proto M src/kudu/util/hash.proto M src/kudu/util/hash_util-test.cc M src/kudu/util/hash_util.h 15 files changed, 488 insertions(+), 373 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/15034/13 -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 13 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 12: (4 comments) http://gerrit.cloudera.org:8080/#/c/15034/12/src/kudu/common/column_predicate.h File src/kudu/common/column_predicate.h: http://gerrit.cloudera.org:8080/#/c/15034/12/src/kudu/common/column_predicate.h@379 PS12, Line 379: std::vector bloom_filters_; > I have discuss with helifu online a long time ago. A column contains more t Clients supplying multiple Bloom filters would be very rare in my opinion too. As Zhang mentioned, different hash algorithms can be used for the supplied Bloom filters which theoretically improves the false positive probability rate as the element in the table must be found in ALL the Bloom filters which effectively translates to AND operation. Providing multiple Bloom filters keeps the client more future-proof. I don't have a strong opinion on accepting multiple Bloom filters in the predicate. I'm basically keeping that part same as the original implementation. http://gerrit.cloudera.org:8080/#/c/15034/11/src/kudu/common/column_predicate.cc File src/kudu/common/column_predicate.cc: http://gerrit.cloudera.org:8080/#/c/15034/11/src/kudu/common/column_predicate.cc@113 PS11, Line 113: std:: > remove "std::" Done http://gerrit.cloudera.org:8080/#/c/15034/12/src/kudu/common/column_predicate.cc File src/kudu/common/column_predicate.cc: http://gerrit.cloudera.org:8080/#/c/15034/12/src/kudu/common/column_predicate.cc@119 PS12, Line 119: pred.Simplify(); > I think we need to check the size(GetSpaceUsed) of bloom filter because the The RPC transfer layer in such a case will return an error. I don't think adding such RPC payload size validation in the bloom filter predicate seems right. http://gerrit.cloudera.org:8080/#/c/15034/12/src/kudu/util/hash_util.h File src/kudu/util/hash_util.h: http://gerrit.cloudera.org:8080/#/c/15034/12/src/kudu/util/hash_util.h@122 PS12, Line 122: case FAST_HASH: > Why do you select FAST_HASH? Does it work faster than murmur or cityhash? A See this change: https://gerrit.cloudera.org/c/14934/ I took a look at results published by smhasher and compared the results with hashing functions with no quality problems and ones Kudu implements currently namely Murmur2, Cityhash, FastHash(Impala uses it), new ones Spooky, t1ha2, wyhash. https://github.com/rurban/smhasher#smhasher Columns: https://github.com/rurban/smhasher#columns Linked doc: https://drive.google.com/open?id=16jXDZWFADpt9MJ0pXmqOIweMaIZBi1F2 - Murmur2 has a bunch of quality problems as noted. - CityHash can be twice as fast as Fasthash but has quality problems except the one where only lower 32-bits of 64-bit hash function are considered. However size of CityHash is nearly 2x. Not sure I understand the "size" of the object here but the lower the better. So might affect fitting in cache? - Spooky is among the fastest but has among the largest size. - t1ha2_atonce (odd-name) is one of the fastest and lower on size. One caveat is implementation looks architecture specific and will give high performance only on little-endian machines. Not sure we support any big-endian architectures? Itt might be tricky to import and use. - wyhash is fast but has larger size. So overall FastHash and Cityhash look good choices, with Fasthash edging out Cityhash due to its size? -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 12 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Sat, 08 Feb 2020 00:02:23 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 12: (2 comments) http://gerrit.cloudera.org:8080/#/c/15034/12/src/kudu/common/column_predicate.h File src/kudu/common/column_predicate.h: http://gerrit.cloudera.org:8080/#/c/15034/12/src/kudu/common/column_predicate.h@379 PS12, Line 379: std::vector bloom_filters_; > In my understanding, there should be at most one bloom filter in a predicat I have discuss with helifu online a long time ago. A column contains more than one bloomfilter is rare. At the very beginning, I use std::vector because it can support more than one hash algorithm and sometimes it will be useful. Will we support more hash algorithms in the future? http://gerrit.cloudera.org:8080/#/c/15034/12/src/kudu/util/hash_util.h File src/kudu/util/hash_util.h: http://gerrit.cloudera.org:8080/#/c/15034/12/src/kudu/util/hash_util.h@122 PS12, Line 122: case FAST_HASH: Why do you select FAST_HASH? Does it work faster than murmur or cityhash? And do we have FAST_HASH in java client? -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 12 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Thu, 06 Feb 2020 15:00:58 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 12: (3 comments) Good job, Bankim! In general, the bloom filter is used as run-time filter, especially in a hash-join case. For a computer engine, such as impala, it will wait 1 second(default) to produce the bloom filter and then push it down. So, I have another suggestion: maybe we can push it down in the middle of the scan while the filter is not arrived within the specified interval. It should be in the next patch^_^ computer engine -> kudu client -> kudu server [1][2][3] throw away [0] http://gerrit.cloudera.org:8080/#/c/15034/12/src/kudu/common/column_predicate.h File src/kudu/common/column_predicate.h: http://gerrit.cloudera.org:8080/#/c/15034/12/src/kudu/common/column_predicate.h@379 PS12, Line 379: std::vector bloom_filters_; In my understanding, there should be at most one bloom filter in a predicate. Even if in a particular case, someone wants to push down some bloom filters on the same column at the same time, these bloom filters should have the same "ndv" and "log_space_bytes" for the current tablet/table, and all of them could be merged into one(AND operation, OR operation is not supported yet.). But in fact, nobody will do that since the bloom filter will take lots of CPU cycles while evaluating. So, could you please share your thoughts? http://gerrit.cloudera.org:8080/#/c/15034/11/src/kudu/common/column_predicate.cc File src/kudu/common/column_predicate.cc: http://gerrit.cloudera.org:8080/#/c/15034/11/src/kudu/common/column_predicate.cc@113 PS11, Line 113: std:: remove "std::" http://gerrit.cloudera.org:8080/#/c/15034/12/src/kudu/common/column_predicate.cc File src/kudu/common/column_predicate.cc: http://gerrit.cloudera.org:8080/#/c/15034/12/src/kudu/common/column_predicate.cc@119 PS12, Line 119: pred.Simplify(); I think we need to check the size(GetSpaceUsed) of bloom filter because the default value of 'rpc_max_message_size' is 50Mib. -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 12 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Thu, 06 Feb 2020 12:31:40 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 12: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 12 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Wed, 05 Feb 2020 22:48:41 + Gerrit-HasComments: No
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Hello Kudu Jenkins, helifu, Yao Xu, Adar Dembo, ZhangYao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15034 to look at the new patch set (#12). Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side This change switches the implementation of the ColumnPredicate to use the BlockBloomFilter for the BloomFilter predicate on the server side. Earlier implementation was still experimental and didn't provide public client APIs that actually use this BloomFilter predicate so taken the liberty to make incompatible wire protocol changes. Updated BlockBloomFilter to take hash_algorithm and hash_seed. This make serializing and deserializing the BlockBloomFilter convenient and removes the need of BloomFilterInner in ColumnPredicate. Added overloaded Insert()/Find() functions to BlockBloomFilter that take Slice parameter and hashes the key before insertion/lookup. Most of the change involves refactoring the implementation including the unit tests. Currently only FAST_HASH algorithm is supported since 32-bit versions of MURMUR2 and CITY_HASH are not currently implemented. Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a --- M src/kudu/common/column_predicate-test.cc M src/kudu/common/column_predicate.cc M src/kudu/common/column_predicate.h M src/kudu/common/common.proto M src/kudu/common/wire_protocol-test.cc M src/kudu/common/wire_protocol.cc M src/kudu/tablet/cfile_set-test.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/block_bloom_filter-test.cc M src/kudu/util/block_bloom_filter.cc M src/kudu/util/block_bloom_filter.h A src/kudu/util/block_bloom_filter.proto M src/kudu/util/hash.proto M src/kudu/util/hash_util-test.cc M src/kudu/util/hash_util.h 15 files changed, 488 insertions(+), 373 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/15034/12 -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 12 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/15034/11/src/kudu/common/wire_protocol.cc File src/kudu/common/wire_protocol.cc: http://gerrit.cloudera.org:8080/#/c/15034/11/src/kudu/common/wire_protocol.cc@623 PS11, Line 623: auto allocator = arena->NewObject(arena); > Oh, this should be "auto*" then; I assumed 'allocator' was a stack allocate Done -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 11 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Wed, 05 Feb 2020 22:35:35 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/15034/11/src/kudu/common/wire_protocol.cc File src/kudu/common/wire_protocol.cc: http://gerrit.cloudera.org:8080/#/c/15034/11/src/kudu/common/wire_protocol.cc@623 PS11, Line 623: auto allocator = arena->NewObject(arena); > "allocator" is allocated in the arena like on heap and isn't deleted on exi Oh, this should be "auto*" then; I assumed 'allocator' was a stack allocated object. -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 11 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Mon, 03 Feb 2020 19:35:41 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/15034/11/src/kudu/common/wire_protocol.cc File src/kudu/common/wire_protocol.cc: http://gerrit.cloudera.org:8080/#/c/15034/11/src/kudu/common/wire_protocol.cc@623 PS11, Line 623: auto allocator = arena->NewObject(arena); > Looking at the code again, I'm a little confused as to how this works, beca "allocator" is allocated in the arena like on heap and isn't deleted on exiting the scope. So the "allocator" like the bloom filters allocated in the arena will be valid till the arena is destructed. -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 11 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Mon, 03 Feb 2020 18:29:16 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/15034/11/src/kudu/common/wire_protocol.cc File src/kudu/common/wire_protocol.cc: http://gerrit.cloudera.org:8080/#/c/15034/11/src/kudu/common/wire_protocol.cc@623 PS11, Line 623: auto allocator = arena->NewObject(arena); > IIUC, it's OK to "orphan" the allocator object (previously we would faithfu Looking at the code again, I'm a little confused as to how this works, because the BBF destructor leads to a call to buffer_allocator_->FreeBuffer(). If 'allocator' here has popped off the stack, why doesn't the FreeBuffer() call corrupt memory or crash the process? -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 11 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Fri, 31 Jan 2020 23:17:55 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/15034/11/src/kudu/common/wire_protocol.cc File src/kudu/common/wire_protocol.cc: http://gerrit.cloudera.org:8080/#/c/15034/11/src/kudu/common/wire_protocol.cc@623 PS11, Line 623: auto allocator = arena->NewObject(arena); IIUC, it's OK to "orphan" the allocator object (previously we would faithfully preserve it alongside the BlockBloomFilters) because: 1. The arena used to allocate the bfs is preserved for the lifetime of the scan, which de facto exceeds the lifetime of the bfs themselves. 2. We don't need to make any further allocations in the bfs themselves, so it's OK if their allocator pointers are garbage. That last point gives me pause; it seems somewhat error prone to have live objects associated to garbage data. Perhaps we could mutate the bfs at this point to nullify their allocator pointers, so that any future allocations trigger a DCHECK? -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 11 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Fri, 31 Jan 2020 23:15:14 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Hello Kudu Jenkins, helifu, Yao Xu, Adar Dembo, ZhangYao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15034 to look at the new patch set (#11). Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side This change switches the implementation of the ColumnPredicate to use the BlockBloomFilter for the BloomFilter predicate on the server side. Earlier implementation was still experimental and didn't provide public client APIs that actually use this BloomFilter predicate so taken the liberty to make incompatible wire protocol changes. Updated BlockBloomFilter to take hash_algorithm and hash_seed. This make serializing and deserializing the BlockBloomFilter convenient and removes the need of BloomFilterInner in ColumnPredicate. Added overloaded Insert()/Find() functions to BlockBloomFilter that take Slice parameter and hashes the key before insertion/lookup. Most of the change involves refactoring the implementation including the unit tests. Currently only FAST_HASH algorithm is supported since 32-bit versions of MURMUR2 and CITY_HASH are not currently implemented. Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a --- M src/kudu/common/column_predicate-test.cc M src/kudu/common/column_predicate.cc M src/kudu/common/column_predicate.h M src/kudu/common/common.proto M src/kudu/common/wire_protocol-test.cc M src/kudu/common/wire_protocol.cc M src/kudu/tablet/cfile_set-test.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/block_bloom_filter-test.cc M src/kudu/util/block_bloom_filter.cc M src/kudu/util/block_bloom_filter.h A src/kudu/util/block_bloom_filter.proto M src/kudu/util/hash.proto M src/kudu/util/hash_util-test.cc M src/kudu/util/hash_util.h 15 files changed, 488 insertions(+), 373 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/15034/11 -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 11 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 8: Verified+1 Code-Review+2 Looks good to me, though not merging yet since I think you wanted to collect additional reviews? -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 8 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Tue, 21 Jan 2020 22:33:04 + Gerrit-HasComments: No
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Adar Dembo has removed Kudu Jenkins from this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Removed reviewer Kudu Jenkins with the following votes: * Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 8 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 8: > Patch Set 8: Verified-1 > > Build Failed > > http://jenkins.kudu.apache.org/job/kudu-gerrit/20217/ : FAILURE Bad status: Timed out: ListTabletServers RPC failed: ListTabletServers RPC to 127.1.100.62:38687 timed out after 0.430s (SENT) Unrelated test failure: http://jenkins.kudu.apache.org/job/kudu-gerrit/20217/BUILD_TYPE=RELEASE/testReport/junit/(root)/TsLocationAssignmentITest/Basic_3/ -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 8 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Tue, 21 Jan 2020 22:20:09 + Gerrit-HasComments: No
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Hello Kudu Jenkins, helifu, Yao Xu, Adar Dembo, ZhangYao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15034 to look at the new patch set (#8). Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side This change switches the implementation of the ColumnPredicate to use the BlockBloomFilter for the BloomFilter predicate on the server side. Earlier implementation was still experimental and didn't provide public client APIs that actually use this BloomFilter predicate so taken the liberty to make incompatible wire protocol changes. Updated BlockBloomFilter to take hash_algorithm and hash_seed. This make serializing and deserializing the BlockBloomFilter convenient and removes the need of BloomFilterInner in ColumnPredicate. Added overloaded Insert()/Find() functions to BlockBloomFilter that take Slice parameter and hashes the key before insertion/lookup. Most of the change involves refactoring the implementation including the unit tests. Currently only FAST_HASH algorithm is supported since 32-bit versions of MURMUR2 and CITY_HASH are not currently implemented. Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a --- M src/kudu/common/column_predicate-test.cc M src/kudu/common/column_predicate.cc M src/kudu/common/column_predicate.h M src/kudu/common/common.proto M src/kudu/common/wire_protocol-test.cc M src/kudu/common/wire_protocol.cc M src/kudu/tablet/cfile_set-test.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/block_bloom_filter-test.cc M src/kudu/util/block_bloom_filter.cc M src/kudu/util/block_bloom_filter.h A src/kudu/util/block_bloom_filter.proto M src/kudu/util/hash.proto M src/kudu/util/hash_util-test.cc M src/kudu/util/hash_util.h 15 files changed, 514 insertions(+), 390 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/15034/8 -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 8 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/15034/7/src/kudu/util/block_bloom_filter.cc File src/kudu/util/block_bloom_filter.cc: http://gerrit.cloudera.org:8080/#/c/15034/7/src/kudu/util/block_bloom_filter.cc@74 PS7, Line 74: DCHECK(directory_ == nullptr) << : "Close() should have been called before the object is destroyed."; > If the destructor is going to call Close(), doesn't seem like we need this Done http://gerrit.cloudera.org:8080/#/c/15034/7/src/kudu/util/block_bloom_filter.cc@220 PS7, Line 220: void BlockBloomFilter::Insert(const Slice& key) noexcept { : Insert(HashUtil::ComputeHash32(key, hash_algorithm_, hash_seed_)); : } > This and Find() could be inlined in the header. Done -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 7 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Tue, 21 Jan 2020 21:42:08 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/15034/7/src/kudu/util/block_bloom_filter.cc File src/kudu/util/block_bloom_filter.cc: http://gerrit.cloudera.org:8080/#/c/15034/7/src/kudu/util/block_bloom_filter.cc@74 PS7, Line 74: DCHECK(directory_ == nullptr) << : "Close() should have been called before the object is destroyed."; If the destructor is going to call Close(), doesn't seem like we need this DHECK anymore. http://gerrit.cloudera.org:8080/#/c/15034/7/src/kudu/util/block_bloom_filter.cc@220 PS7, Line 220: void BlockBloomFilter::Insert(const Slice& key) noexcept { : Insert(HashUtil::ComputeHash32(key, hash_algorithm_, hash_seed_)); : } This and Find() could be inlined in the header. -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 7 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Sat, 18 Jan 2020 00:28:34 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Hello Kudu Jenkins, helifu, Yao Xu, Adar Dembo, ZhangYao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15034 to look at the new patch set (#7). Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side This change switches the implementation of the ColumnPredicate to use the BlockBloomFilter for the BloomFilter predicate on the server side. Earlier implementation was still experimental and didn't provide public client APIs that actually use this BloomFilter predicate so taken the liberty to make incompatible wire protocol changes. Updated BlockBloomFilter to take hash_algorithm and hash_seed. This make serializing and deserializing the BlockBloomFilter convenient and removes the need of BloomFilterInner in ColumnPredicate. Added overloaded Insert()/Find() functions to BlockBloomFilter that take Slice parameter and hashes the key before insertion/lookup. Most of the change involves refactoring the implementation including the unit tests. Currently only FAST_HASH algorithm is supported since 32-bit versions of MURMUR2 and CITY_HASH are not currently implemented. Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a --- M src/kudu/common/column_predicate-test.cc M src/kudu/common/column_predicate.cc M src/kudu/common/column_predicate.h M src/kudu/common/common.proto M src/kudu/common/wire_protocol-test.cc M src/kudu/common/wire_protocol.cc M src/kudu/tablet/cfile_set-test.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/block_bloom_filter-test.cc M src/kudu/util/block_bloom_filter.cc M src/kudu/util/block_bloom_filter.h A src/kudu/util/block_bloom_filter.proto M src/kudu/util/hash.proto M src/kudu/util/hash_util.h 14 files changed, 480 insertions(+), 384 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/15034/7 -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 7 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Hello Kudu Jenkins, helifu, Yao Xu, Adar Dembo, ZhangYao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15034 to look at the new patch set (#6). Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side This change switches the implementation of the ColumnPredicate to use the BlockBloomFilter for the BloomFilter predicate on the server side. Earlier implementation was still experimental and didn't provide public client APIs that actually use this BloomFilter predicate so taken the liberty to make incompatible wire protocol changes. Updated BlockBloomFilter to take hash_algorithm and hash_seed. This make serializing and deserializing the BlockBloomFilter convenient and removes the need of BloomFilterInner in ColumnPredicate. Added overloaded Insert()/Find() functions to BlockBloomFilter that take Slice parameter and hashes the key before insertion/lookup. Most of the change involves refactoring the implementation including the unit tests. Currently only FAST_HASH algorithm is supported since 32-bit versions of MURMUR2 and CITY_HASH are not currently implemented. Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a --- M src/kudu/common/column_predicate-test.cc M src/kudu/common/column_predicate.cc M src/kudu/common/column_predicate.h M src/kudu/common/common.proto M src/kudu/common/wire_protocol-test.cc M src/kudu/common/wire_protocol.cc M src/kudu/tablet/cfile_set-test.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/block_bloom_filter-test.cc M src/kudu/util/block_bloom_filter.cc M src/kudu/util/block_bloom_filter.h A src/kudu/util/block_bloom_filter.proto M src/kudu/util/hash.proto M src/kudu/util/hash_util.h 14 files changed, 503 insertions(+), 368 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/15034/6 -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 6 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/15034/4/src/kudu/common/column_predicate.h File src/kudu/common/column_predicate.h: http://gerrit.cloudera.org:8080/#/c/15034/4/src/kudu/common/column_predicate.h@152 PS4, Line 152: std::vector&& bfs, > I see. That runs afoul of the Google Style Guide's rules on rvalue referenc Agreeing a bit reluctantly. Though one benefit I see is that it makes writing unit tests easier and avoids creating copies :) http://gerrit.cloudera.org:8080/#/c/15034/5/src/kudu/common/wire_protocol.cc File src/kudu/common/wire_protocol.cc: http://gerrit.cloudera.org:8080/#/c/15034/5/src/kudu/common/wire_protocol.cc@633 PS5, Line 633: if (!bf_src.has_log_space_bytes() || !bf_src.has_bloom_data() || : !bf_src.has_hash_algorithm() || bf_src.hash_algorithm() == UNKNOWN_HASH || : !bf_src.has_always_false()) { : return Status::InvalidArgument("Invalid in bloom filter predicate on column: " : "missing bloom filter details", col.name()); : } > Should this be moved into BlockBloomFilter::InitFromPB? Thanks for bringing this up. BlockBloomFilter doesn't contain the hash_algorithm and hash_seed fields and hence these checks were outside in the caller here. I refactored the code to include hash_algorithm and hash_seed inside BlockBloomFilter. This way BlockBloomFilter can be completely initialized from the protobuf message and makes it clean. Also removes the need of BloomFilterInner which looked like a crutch. Now that hash_algorithm and hash_seed are known to BlockBloomFilter, added Insert() and Find() convenience functions in BlockBloomFilter that takes a Slice and hashes prior to insertion/lookup. To keep BlockBloomFilter usable from Impala, direct insertion and lookup using 32-bit hash values is still made available. Also there is a slight performance optimization in case hash of the key is already known. There is possible risk of caller using direct 32-bit hash values but specifies different hash_algorithm to the BlockBloomFilter. To reduce that risk, made the constructor explicitly specify the hash function and seed. http://gerrit.cloudera.org:8080/#/c/15034/5/src/kudu/util/CMakeLists.txt File src/kudu/util/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/15034/5/src/kudu/util/CMakeLists.txt@21 PS5, Line 21: > Can you fix the indentation for the continuation lines? Check out how the o Done http://gerrit.cloudera.org:8080/#/c/15034/5/src/kudu/util/block_bloom_filter.cc File src/kudu/util/block_bloom_filter.cc: http://gerrit.cloudera.org:8080/#/c/15034/5/src/kudu/util/block_bloom_filter.cc@113 PS5, Line 113: memcpy(directory_, bf_src.bloom_data().data(), bf_src.bloom_data().size()); > Shouldn't we just do this copy to faithfully respect the state of bf_src, r Done -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 5 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Fri, 17 Jan 2020 23:15:14 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/15034/4/src/kudu/common/column_predicate.h File src/kudu/common/column_predicate.h: http://gerrit.cloudera.org:8080/#/c/15034/4/src/kudu/common/column_predicate.h@152 PS4, Line 152: std::vector&& bfs, > I understand but using rvalue reference as type for bfs forces caller to pa I see. That runs afoul of the Google Style Guide's rules on rvalue references though: https://google.github.io/styleguide/cppguide.html#Rvalue_references If you grep across the Kudu codebase, all of the rvalue reference argument types are either for move constructors, move assignment operators, or perfect forwarding. http://gerrit.cloudera.org:8080/#/c/15034/5/src/kudu/common/wire_protocol.cc File src/kudu/common/wire_protocol.cc: http://gerrit.cloudera.org:8080/#/c/15034/5/src/kudu/common/wire_protocol.cc@633 PS5, Line 633: if (!bf_src.has_log_space_bytes() || !bf_src.has_bloom_data() || : !bf_src.has_hash_algorithm() || bf_src.hash_algorithm() == UNKNOWN_HASH || : !bf_src.has_always_false()) { : return Status::InvalidArgument("Invalid in bloom filter predicate on column: " : "missing bloom filter details", col.name()); : } Should this be moved into BlockBloomFilter::InitFromPB? http://gerrit.cloudera.org:8080/#/c/15034/5/src/kudu/util/CMakeLists.txt File src/kudu/util/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/15034/5/src/kudu/util/CMakeLists.txt@21 PS5, Line 21: Can you fix the indentation for the continuation lines? Check out how the other PROTOBUF_GENERATE_CPP do it in this file. http://gerrit.cloudera.org:8080/#/c/15034/5/src/kudu/util/block_bloom_filter.cc File src/kudu/util/block_bloom_filter.cc: http://gerrit.cloudera.org:8080/#/c/15034/5/src/kudu/util/block_bloom_filter.cc@113 PS5, Line 113: memcpy(directory_, bf_src.bloom_data().data(), bf_src.bloom_data().size()); Shouldn't we just do this copy to faithfully respect the state of bf_src, regardless of the value of always_false? -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 5 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Fri, 17 Jan 2020 00:31:31 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Hello Kudu Jenkins, helifu, Yao Xu, Adar Dembo, ZhangYao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15034 to look at the new patch set (#5). Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side This change switches the implementation of the ColumnPredicate to use the BlockBloomFilter for the BloomFilter predicate on the server side. Earlier implementation was still experimental and didn't provide public client APIs that actually use this BloomFilter predicate so taken the liberty to make incompatible wire protocol changes. Most of the change involves refactoring the implementation including the unit tests. Currently only FAST_HASH algorithm is supported since 32-bit versions of MURMUR2 and CITY_HASH are not currently implemented. Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a --- M src/kudu/common/column_predicate-test.cc M src/kudu/common/column_predicate.cc M src/kudu/common/column_predicate.h M src/kudu/common/common.proto M src/kudu/common/wire_protocol-test.cc M src/kudu/common/wire_protocol.cc M src/kudu/tablet/cfile_set-test.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/block_bloom_filter.cc M src/kudu/util/block_bloom_filter.h A src/kudu/util/block_bloom_filter.proto M src/kudu/util/hash.proto M src/kudu/util/hash_util.h 13 files changed, 476 insertions(+), 314 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/15034/5 -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 5 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/15034/4/src/kudu/common/column_predicate.h File src/kudu/common/column_predicate.h: http://gerrit.cloudera.org:8080/#/c/15034/4/src/kudu/common/column_predicate.h@152 PS4, Line 152: std::vector&& bfs, > Here and below you don't need the rvalue reference; could just pass std::ve I understand but using rvalue reference as type for bfs forces caller to pass rvalue(using std::move or directly) and avoid the costly copy of a vector by mistake. Hence this is deliberate. http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.h File src/kudu/util/block_bloom_filter.h: http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.h@78 PS3, Line 78: // Initialize the internal data structures using the supplied arguments. : // Useful for de-serializing the BlockBloomFilter. : Status Init(int log_space_bytes, const void* src_data, size_t src_len, bool always_false); > Good point. Thanks for the tip. Done. http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.h@121 PS3, Line 121: // Returns amount of space used in log2 bytes. : int GetLogSpaceBytes() const { : return log_num_buckets_ + kLogBucketByteSize; : } > Same as above. Done http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.cc File src/kudu/util/block_bloom_filter.cc: http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.cc@108 PS3, Line 108: > Ths Init variant calls InitInternal (memset on L91) and then calls memcpy ( Intent was to move the memset() outside of InitInternal() but missed removing that line from InitInternal(). Fixed. -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 4 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Fri, 17 Jan 2020 00:08:51 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15034 ) Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/15034/4/src/kudu/common/column_predicate.h File src/kudu/common/column_predicate.h: http://gerrit.cloudera.org:8080/#/c/15034/4/src/kudu/common/column_predicate.h@152 PS4, Line 152: std::vector&& bfs, Here and below you don't need the rvalue reference; could just pass std::vector<...> as-is and still move it. http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.h File src/kudu/util/block_bloom_filter.h: http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.h@78 PS3, Line 78: // Initialize the internal data structures using the supplied arguments. : // Useful for de-serializing the BlockBloomFilter. : Status Init(int log_space_bytes, const void* src_data, size_t src_len, bool always_false); > Passing the entire kudu::ColumnPredicatePB_BlockBloomFilter would require i Good point. What do you think about moving the PB representation of BlockBloomFilter into kudu_util? There's some precedence for that (see hash.proto). http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.cc File src/kudu/util/block_bloom_filter.cc: http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.cc@108 PS3, Line 108: > Done Ths Init variant calls InitInternal (memset on L91) and then calls memcpy (L115). Likewise, the other Init variant calls InitInternal (memset) and does another memset on L98. So it doesn't seem like we've gained here; if anything, we're duplicating work more now. Am I missing something? -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 4 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu Gerrit-Comment-Date: Thu, 16 Jan 2020 04:35:30 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side
Hello Kudu Jenkins, helifu, Yao Xu, Adar Dembo, ZhangYao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15034 to look at the new patch set (#4). Change subject: KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side .. KUDU-2483 Integrate BlockBloomFilter with ColumnPredicate on server side This change switches the implementation of the ColumnPredicate to use the BlockBloomFilter for the BloomFilter predicate on the server side. Earlier implementation was still experimental and didn't provide public client APIs that actually use this BloomFilter predicate so taken the liberty to make incompatible wire protocol changes. Most of the change involves refactoring the implementation including the unit tests. Currently only FAST_HASH algorithm is supported since 32-bit versions of MURMUR2 and CITY_HASH are not currently implemented. Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a --- M src/kudu/common/column_predicate-test.cc M src/kudu/common/column_predicate.cc M src/kudu/common/column_predicate.h M src/kudu/common/common.proto M src/kudu/common/wire_protocol-test.cc M src/kudu/common/wire_protocol.cc M src/kudu/tablet/cfile_set-test.cc M src/kudu/util/block_bloom_filter.cc M src/kudu/util/block_bloom_filter.h M src/kudu/util/hash.proto M src/kudu/util/hash_util.h 11 files changed, 441 insertions(+), 308 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/15034/4 -- To view, visit http://gerrit.cloudera.org:8080/15034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a Gerrit-Change-Number: 15034 Gerrit-PatchSet: 4 Gerrit-Owner: Bankim Bhavsar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu Gerrit-Reviewer: ZhangYao Gerrit-Reviewer: helifu