Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13703 )

Change subject: KUDU-2483 (part 1/3) Add bloom filter predicate for Java API
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13703/1/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java
File java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java:

http://gerrit.cloudera.org:8080/#/c/13703/1/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@31
PS1, Line 31: space-efficient filter
Unrelated to this patch, but for best performance, maybe we should consider one 
of the cache-line-aware Bloom filter alternatives? 
http://www.vldb.org/pvldb/vol12/p502-lang.pdf is a recent reference on this 
topic that specifically pertains to this scenario (Bloom filter pushdown across 
joins).

Given that we haven't made use of the Bloom filter code yet in Impala or Spark 
(AFAIK, outside of experimental branches), maybe it's not too late to change this?
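
To make the cache-line idea concrete, here is a minimal sketch of a classic
"blocked" Bloom filter: every key sets all of its bits inside one 512-bit
(64-byte) block, so a lookup touches at most one cache line's worth of data.
This is not the Kudu implementation and not one of the specific variants
evaluated in the paper; the class name, the 64-byte block size, and the
splitmix64-style mixer are assumptions made for illustration. Note also that
the JVM does not guarantee that a long[] is aligned to a cache-line boundary.

```java
// Hypothetical sketch of a blocked Bloom filter; not part of Kudu.
final class BlockedBloomSketch {
  // 8 longs * 64 bits = 512 bits = 64 bytes, a common cache-line size (assumption).
  private static final int WORDS_PER_BLOCK = 8;
  private static final int BITS_PER_BLOCK = WORDS_PER_BLOCK * Long.SIZE;

  private final long[] words;
  private final int blockMask;   // numBlocks - 1; valid because numBlocks is a power of two
  private final int numHashes;

  BlockedBloomSketch(int numBlocks, int numHashes) {
    if (numBlocks <= 0 || Integer.bitCount(numBlocks) != 1 || numBlocks > (1 << 27)) {
      throw new IllegalArgumentException("numBlocks must be a power of two in [1, 2^27]");
    }
    if (numHashes <= 0) {
      throw new IllegalArgumentException("numHashes must be positive");
    }
    this.words = new long[numBlocks * WORDS_PER_BLOCK];
    this.blockMask = numBlocks - 1;
    this.numHashes = numHashes;
  }

  // splitmix64-style finalizer, used here only to spread the input hash bits.
  private static long mix(long z) {
    z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
    z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
    return z ^ (z >>> 31);
  }

  /** Adds a key, identified by its 64-bit hash, to the filter. */
  void add(long hash) {
    long a = mix(hash);
    long b = mix(a);
    // One block per key: all bits for this key land in words[base .. base + 7].
    int base = ((int) (a >>> 32) & blockMask) * WORDS_PER_BLOCK;
    int h1 = (int) b;
    int h2 = (int) (b >>> 32) | 1;  // odd step for double hashing
    for (int i = 0; i < numHashes; i++) {
      int bit = (h1 + i * h2) & (BITS_PER_BLOCK - 1);  // 0..511 within the block
      words[base + (bit >>> 6)] |= 1L << (bit & 63);
    }
  }

  /** Returns false if the key is definitely absent, true if it may be present. */
  boolean mightContain(long hash) {
    long a = mix(hash);
    long b = mix(a);
    int base = ((int) (a >>> 32) & blockMask) * WORDS_PER_BLOCK;
    int h1 = (int) b;
    int h2 = (int) (b >>> 32) | 1;
    for (int i = 0; i < numHashes; i++) {
      int bit = (h1 + i * h2) & (BITS_PER_BLOCK - 1);
      if ((words[base + (bit >>> 6)] & (1L << (bit & 63))) == 0) {
        return false;
      }
    }
    return true;
  }
}
```

The trade-off is a somewhat higher false-positive rate than an unblocked
filter of the same size, since each key's bits are confined to a single block,
in exchange for fewer cache misses per probe.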


http://gerrit.cloudera.org:8080/#/c/13703/1/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@292
PS1, Line 292:       long bitPos = h % nBits;
nit: can we constrain the number of bits to be a power of two, so that we can 
optimize this into a bit mask operation (h & (nBits - 1))? Or is that not 
feasible?
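
A minimal sketch of what the power-of-two constraint could look like. The
helper class and method names are hypothetical and not part of the patch. One
caveat: for a power-of-two nBits the mask always yields a value in
[0, nBits), which matches Math.floorMod but differs from Java's % when h is
negative; whether h can be negative in the patch is not something this sketch
assumes either way.

```java
// Hypothetical helper for illustration; not part of Kudu's BloomFilter class.
final class PowerOfTwoBits {
  private PowerOfTwoBits() {}

  /** Rounds n up to the nearest power of two; n must be in [1, 2^62]. */
  static long roundUpToPowerOfTwo(long n) {
    if (n < 1 || n > (1L << 62)) {
      throw new IllegalArgumentException("n out of range: " + n);
    }
    long highest = Long.highestOneBit(n);
    return highest == n ? n : highest << 1;
  }

  /** Computes the bit position with a mask; nBits must be a positive power of two. */
  static long bitPosMasked(long h, long nBits) {
    if (nBits <= 0 || Long.bitCount(nBits) != 1) {
      throw new IllegalArgumentException("nBits must be a positive power of two: " + nBits);
    }
    // Equivalent to Math.floorMod(h, nBits) when nBits is a power of two.
    return h & (nBits - 1);
  }
}
```

The cost of this approach is memory: rounding the requested size up to a power
of two can nearly double the number of bits in the worst case.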



--
To view, visit http://gerrit.cloudera.org:8080/13703
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If96ec164e86091f8fb3b7f92b3623ca3e728edfd
Gerrit-Change-Number: 13703
Gerrit-PatchSet: 1
Gerrit-Owner: Yao Xu <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Yao Xu <[email protected]>
Gerrit-Comment-Date: Tue, 25 Jun 2019 06:05:08 +0000
Gerrit-HasComments: Yes
