Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/13703 )
Change subject: KUDU-2483 (part 1/3) Add bloom filter predicate for Java API
......................................................................

Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13703/1/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java
File java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java:

http://gerrit.cloudera.org:8080/#/c/13703/1/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@31
PS1, Line 31: space-efficient filter
unrelated to this patch, but for best performance, maybe we should consider one of the cacheline-aware bloom alternatives? http://www.vldb.org/pvldb/vol12/p502-lang.pdf is a recent reference on this topic that specifically pertains to this scenario (bloom pushdown across joins)

Given we haven't made use of the bloom stuff yet in Impala or Spark (afaik outside of experimental branches) maybe it's not too late to change this?

http://gerrit.cloudera.org:8080/#/c/13703/1/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@292
PS1, Line 292: long bitPos = h % nBits;
nit: can we constrain the number of bits to be a power of two, so that we can optimize this to be a bit mask operation? or is that not feasible?

--
To view, visit http://gerrit.cloudera.org:8080/13703
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If96ec164e86091f8fb3b7f92b3623ca3e728edfd
Gerrit-Change-Number: 13703
Gerrit-PatchSet: 1
Gerrit-Owner: Yao Xu <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Yao Xu <[email protected]>
Gerrit-Comment-Date: Tue, 25 Jun 2019 06:05:08 +0000
Gerrit-HasComments: Yes
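
For reference, the nit on Line 292 (replacing `h % nBits` with a bit mask) could look roughly like the sketch below. This is not the patch's code: the class `PowerOfTwoBitIndex` and the helpers `roundUpToPowerOfTwo`, `bitPosModulo`, and `bitPosMask` are hypothetical names used only for illustration, and the sketch assumes the filter is free to round its bit count up to the next power of two.

```java
public class PowerOfTwoBitIndex {

    // Hypothetical helper: rounds a requested bit count up to the next
    // power of two. Assumes n is small enough that the result fits in a long.
    static long roundUpToPowerOfTwo(long n) {
        if (n <= 1) {
            return 1;
        }
        return Long.highestOneBit(n - 1) << 1;
    }

    // Modulo-based position. floorMod is used so that a negative hash
    // still yields a non-negative position.
    static long bitPosModulo(long h, long nBits) {
        return Math.floorMod(h, nBits);
    }

    // Mask-based position. Only valid when nBits is a power of two;
    // it then gives the same result as floorMod, including for negative h.
    static long bitPosMask(long h, long nBits) {
        return h & (nBits - 1);
    }

    public static void main(String[] args) {
        long nBits = roundUpToPowerOfTwo(1000); // rounds up to 1024
        long[] hashes = {0L, 1L, 1023L, 1024L, 123456789L, -1L, Long.MIN_VALUE};
        for (long h : hashes) {
            long viaModulo = bitPosModulo(h, nBits);
            long viaMask = bitPosMask(h, nBits);
            System.out.println(h + " -> modulo=" + viaModulo + ", mask=" + viaMask);
        }
    }
}
```

The trade-off is that rounding up can nearly double the filter's size for an unlucky requested bit count, so whether the constraint is feasible depends on how tightly the filter size needs to track the target false-positive rate.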
