Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11333 )

Change subject: [KUDU-2521] Java Implementation for BloomFilter
......................................................................


Patch Set 5:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java
File java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java:

http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@30
PS5, Line 30: public class BloomFilter {
Could you add Javadocs to the class as well as to all public 
non-visiblefortesting methods, explaining how they work and how to use them?


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@33
PS5, Line 33:   private byte[] bitmap;
Curious why you chose to implement your own bitmap instead of reusing 
java.util.BitSet?


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@36
PS5, Line 36:   private HashFunction hashFunction;
Some of these fields can be marked final because they're not modified after the 
constructor call, right?


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@39
PS5, Line 39:     this.nBits = nBits;
Should we precondition that this is at least 8?


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@40
PS5, Line 40:     this.bitmap = bitmap;
Likewise, should we precondition that this has space for at least 8 bits?


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@65
PS5, Line 65: nBytes
Shouldn't this be nBits? I'm surprised this worked as-is; could you add a unit 
test that would fail or malfunction when this is nBytes?


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@74
PS5, Line 74:     if (data) {
            :       byteBuffer[0] = 1;
            :     } else {
            :       byteBuffer[0] = 0;
            :     }
Nit: byteBuffer[0] = data ? 1 : 0;


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@143
PS5, Line 143:   private void updateBitmap(byte[] byteBuffer, int length) {
Should we precondition that bitmap.size >= length?


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@151
PS5, Line 151:       tmp = tmp + h2;
Nit: tmp += h2;


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@155
PS5, Line 155:   // This is for test.
Methods whose visibility has been increased for testing should be annotated 
with:

  @InterfaceAudience.LimitedPrivate("Test”)

Then you don't need the "This is for test" comments either.

See 
https://github.com/apache/kudu/commit/abbde75e12f2275e1a286df60d788e9c5b411bcb 
for more details.


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@225
PS5, Line 225:   private boolean checkIfContains(byte[] bytes) {
Likewise, precondition on bytes.length vs. bitmap size here?


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@233
PS5, Line 233: bitpos
Nit: bitPos (camel-case)


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@237
PS5, Line 237:       tmp = tmp + h2;
Nit: tmp += h2;


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@243
PS5, Line 243:   private static double kNaturalLog2= 0.69314;
Nit: reformat as:

  private static double kNaturalLog2 = 0.69314;

(separate the assignment operator '=' from the variable name 'kNaturalLog2' 
with a space)


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@257
PS5, Line 257: n_bits,
Nit: camel-case in Java, so 'nBits'


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@271
PS5, Line 271:   private static long pickBit(long hash, int nBits) {
Nit: maybe rename to GetBitIndex or some such, to emphasize that what's 
returned is an index into a bitmap?


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@276
PS5, Line 276:   public String toString() {
Nit: mind implementing this via MoreObjects.toStringHelper(), or even just a 
StringBuilder? Easier to follow, I think.


http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/test/java/org/apache/kudu/client/TestBloomFilter.java
File java/kudu-client/src/test/java/org/apache/kudu/client/TestBloomFilter.java:

http://gerrit.cloudera.org:8080/#/c/11333/5/java/kudu-client/src/test/java/org/apache/kudu/client/TestBloomFilter.java@29
PS5, Line 29:   private int kRandomSeed = 0xdeadbeef;
By reusing the same seed, you'll get the exact same PRNG sequencing with every 
test run, so the test coverage will remain static. Why not seed using the 
system time or something like that?



--
To view, visit http://gerrit.cloudera.org:8080/11333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32673c008f9958088d281c2c198c543bfea2eb8e
Gerrit-Change-Number: 11333
Gerrit-PatchSet: 5
Gerrit-Owner: jinxing6...@126.com
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: jinxing6...@126.com
Gerrit-Comment-Date: Tue, 11 Sep 2018 22:06:12 +0000
Gerrit-HasComments: Yes

Reply via email to