[kudu-CR] Integrate BlockBloomFilter with ColumnPredicate on server side

2020-01-15 Thread Bankim Bhavsar (Code Review)
Bankim Bhavsar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15034 )

Change subject: Integrate BlockBloomFilter with ColumnPredicate on server side
..


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/15034/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15034/3//COMMIT_MSG@14
PS3, Line 14: didn't ship a client
> Nit: didn't provide public client APIs
Done


http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/common/column_predicate.cc
File src/kudu/common/column_predicate.cc:

http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/common/column_predicate.cc@78
PS3, Line 78:   bloom_filters_.swap(*bfs);
> Could maybe pass bfs by value and std::move() into bloom_filters_ as part o
Done


http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/common/common.proto
File src/kudu/common/common.proto:

http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/common/common.proto@374
PS3, Line 374: optional HashAlgorithm hash_algorithm = 3 [default = 
FAST_HASH];
 : optional uint32 hash_seed = 4;
 : optional bool always_false = 5;
> Could you doc these fields too?
Done


http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.h
File src/kudu/util/block_bloom_filter.h:

http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.h@78
PS3, Line 78:   // Initialize the internal data structures using the supplied 
arguments.
:   // Useful for de-serializing the BlockBloomFilter.
:   Status Init(int log_space_bytes, const void* src_data, size_t 
src_len, bool always_false);
> It'd be more clear if you called this FromPB and passed the entire BlockBlo
Passing the entire kudu::ColumnPredicatePB_BlockBloomFilter would require 
including kudu/common/common.pb.h from the generated code in this kudu util 
directory. That'll make importing kudu util to other projects like Impala 
difficult/not possible.


http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.h@118
PS3, Line 118:   int GetLogSpaceBytes() const {
> Doc?
Done


http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.h@121
PS3, Line 121:
 :   // Assign the internal directory data structure to the out 
parameter "bloom_data".
 :   // Useful for serializing the BlockBloomFilter.
 :   void AssignDirectory(std::string* bloom_data) const;
> Likewise, maybe model as ToPB()?
Same as above.

For readability, moved all functions related to serialization together and 
added a comment.


http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.cc
File src/kudu/util/block_bloom_filter.cc:

http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.cc@108
PS3, Line 108: memcpy(directory_, src_data, src_len);
> Would be nice to avoid the memset() in the other Init() overload if possibl
Done



--
To view, visit http://gerrit.cloudera.org:8080/15034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a
Gerrit-Change-Number: 15034
Gerrit-PatchSet: 3
Gerrit-Owner: Bankim Bhavsar 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yao Xu 
Gerrit-Reviewer: ZhangYao 
Gerrit-Reviewer: helifu 
Gerrit-Comment-Date: Thu, 16 Jan 2020 00:03:16 +
Gerrit-HasComments: Yes


[kudu-CR] Integrate BlockBloomFilter with ColumnPredicate on server side

2020-01-15 Thread Bankim Bhavsar (Code Review)
Bankim Bhavsar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15034 )

Change subject: Integrate BlockBloomFilter with ColumnPredicate on server side
..


Patch Set 3:

Adding original authors/reviewers of KUDU-2483.


--
To view, visit http://gerrit.cloudera.org:8080/15034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a
Gerrit-Change-Number: 15034
Gerrit-PatchSet: 3
Gerrit-Owner: Bankim Bhavsar 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yao Xu 
Gerrit-Reviewer: ZhangYao 
Gerrit-Reviewer: helifu 
Gerrit-Comment-Date: Wed, 15 Jan 2020 18:33:59 +
Gerrit-HasComments: No


[kudu-CR] Integrate BlockBloomFilter with ColumnPredicate on server side

2020-01-14 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15034 )

Change subject: Integrate BlockBloomFilter with ColumnPredicate on server side
..


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/15034/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15034/3//COMMIT_MSG@14
PS3, Line 14: didn't ship a client
Nit: didn't provide public client APIs


http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/common/column_predicate.cc
File src/kudu/common/column_predicate.cc:

http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/common/column_predicate.cc@78
PS3, Line 78:   bloom_filters_.swap(*bfs);
Could maybe pass bfs by value and std::move() into bloom_filters_ as part of 
the constructor's argument list?


http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/common/common.proto
File src/kudu/common/common.proto:

http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/common/common.proto@374
PS3, Line 374: optional HashAlgorithm hash_algorithm = 3 [default = 
FAST_HASH];
 : optional uint32 hash_seed = 4;
 : optional bool always_false = 5;
Could you doc these fields too?


http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.h
File src/kudu/util/block_bloom_filter.h:

http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.h@78
PS3, Line 78:   // Initialize the internal data structures using the supplied 
arguments.
:   // Useful for de-serializing the BlockBloomFilter.
:   Status Init(int log_space_bytes, const void* src_data, size_t 
src_len, bool always_false);
It'd be more clear if you called this FromPB and passed the entire 
BlockBloomFilter PB message into it as its sole argument.


http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.h@118
PS3, Line 118:   int GetLogSpaceBytes() const {
Doc?


http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.h@121
PS3, Line 121:
 :   // Assign the internal directory data structure to the out 
parameter "bloom_data".
 :   // Useful for serializing the BlockBloomFilter.
 :   void AssignDirectory(std::string* bloom_data) const;
Likewise, maybe model as ToPB()?


http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.cc
File src/kudu/util/block_bloom_filter.cc:

http://gerrit.cloudera.org:8080/#/c/15034/3/src/kudu/util/block_bloom_filter.cc@108
PS3, Line 108: memcpy(directory_, src_data, src_len);
Would be nice to avoid the memset() in the other Init() overload if possible.



--
To view, visit http://gerrit.cloudera.org:8080/15034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a
Gerrit-Change-Number: 15034
Gerrit-PatchSet: 3
Gerrit-Owner: Bankim Bhavsar 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 15 Jan 2020 01:18:11 +
Gerrit-HasComments: Yes


[kudu-CR] Integrate BlockBloomFilter with ColumnPredicate on server side

2020-01-14 Thread Bankim Bhavsar (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15034

to look at the new patch set (#3).

Change subject: Integrate BlockBloomFilter with ColumnPredicate on server side
..

Integrate BlockBloomFilter with ColumnPredicate on server side

This change switches the implementation of the ColumnPredicate
to use the BlockBloomFilter for the BloomFilter predicate
on the server side.

Earlier implementation was still experimental
and didn't ship a client that actually uses this BloomFilter
predicate so taken the liberty to make incompatible wire protocol changes.

Most of the change involves refactoring the implementation
including the unit tests.

Currently only FAST_HASH algorithm is supported since
32-bit versions of MURMUR2 and CITY_HASH are not currently
implemented.

Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a
---
M src/kudu/common/column_predicate-test.cc
M src/kudu/common/column_predicate.cc
M src/kudu/common/column_predicate.h
M src/kudu/common/common.proto
M src/kudu/common/wire_protocol-test.cc
M src/kudu/common/wire_protocol.cc
M src/kudu/tablet/cfile_set-test.cc
M src/kudu/util/block_bloom_filter.cc
M src/kudu/util/block_bloom_filter.h
M src/kudu/util/hash.proto
M src/kudu/util/hash_util.h
11 files changed, 412 insertions(+), 300 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/15034/3
--
To view, visit http://gerrit.cloudera.org:8080/15034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a
Gerrit-Change-Number: 15034
Gerrit-PatchSet: 3
Gerrit-Owner: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] Integrate BlockBloomFilter with ColumnPredicate on server side

2020-01-14 Thread Bankim Bhavsar (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15034

to look at the new patch set (#2).

Change subject: Integrate BlockBloomFilter with ColumnPredicate on server side
..

Integrate BlockBloomFilter with ColumnPredicate on server side

This change switches the implementation of the ColumnPredicate
to use the BlockBloomFilter for the BloomFilter predicate
on the server side.

Earlier implementation was still experimental
and didn't ship a client that actually uses this BloomFilter
predicate so taken the liberty to make incompatible wire protocol changes.

Most of the change involves refactoring the implementation
including the unit tests.

Currently only FAST_HASH algorithm is supported since
32-bit versions of MURMUR2 and CITY_HASH are not currently
implemented.

Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a
---
M src/kudu/common/column_predicate-test.cc
M src/kudu/common/column_predicate.cc
M src/kudu/common/column_predicate.h
M src/kudu/common/common.proto
M src/kudu/common/wire_protocol-test.cc
M src/kudu/common/wire_protocol.cc
M src/kudu/tablet/cfile_set-test.cc
M src/kudu/util/block_bloom_filter-test.cc
M src/kudu/util/block_bloom_filter.cc
M src/kudu/util/block_bloom_filter.h
M src/kudu/util/hash.proto
M src/kudu/util/hash_util.h
12 files changed, 412 insertions(+), 301 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/15034/2
--
To view, visit http://gerrit.cloudera.org:8080/15034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a
Gerrit-Change-Number: 15034
Gerrit-PatchSet: 2
Gerrit-Owner: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] Integrate BlockBloomFilter with ColumnPredicate on server side

2020-01-14 Thread Bankim Bhavsar (Code Review)
Bankim Bhavsar has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15034


Change subject: Integrate BlockBloomFilter with ColumnPredicate on server side
..

Integrate BlockBloomFilter with ColumnPredicate on server side

This change switches the implementation of the ColumnPredicate
to use the BlockBloomFilter for the BloomFilter predicate
on the server side.

Earlier implementation was still experimental
and didn't ship a client that actually uses this BloomFilter
predicate so taken the liberty to make incompatible wire protocol changes.

Most of the change involves refactoring the implementation
including the unit tests.

Currently only FAST_HASH algorithm is supported since
32-bit versions of MURMUR2 and CITY_HASH are not currently
implemented.

Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a
---
M src/kudu/common/column_predicate-test.cc
M src/kudu/common/column_predicate.cc
M src/kudu/common/column_predicate.h
M src/kudu/common/common.proto
M src/kudu/common/wire_protocol-test.cc
M src/kudu/common/wire_protocol.cc
M src/kudu/tablet/cfile_set-test.cc
M src/kudu/util/block_bloom_filter-test.cc
M src/kudu/util/block_bloom_filter.cc
M src/kudu/util/block_bloom_filter.h
M src/kudu/util/hash.proto
M src/kudu/util/hash_util.h
12 files changed, 413 insertions(+), 300 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/15034/1
--
To view, visit http://gerrit.cloudera.org:8080/15034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7ecfd67e9c5fbe459c5b4aed91e0be2a194d433a
Gerrit-Change-Number: 15034
Gerrit-PatchSet: 1
Gerrit-Owner: Bankim Bhavsar