Todd Lipcon has posted comments on this change. Change subject: [java-client] implement scan token API ......................................................................
Patch Set 6: (12 comments) http://gerrit.cloudera.org:8080/#/c/2592/6/java/kudu-client/src/main/java/org/kududb/client/Bytes.java File java/kudu-client/src/main/java/org/kududb/client/Bytes.java: Line 1105: byte[] buf = new byte[b.limit() - b.position()]; is that equivalent to b.remaining()? Line 1108: writeByteArray(dataOutput, buf); is this perf-sensitive? If so, you might want to optimize this by checking b.hasArray() and then using the underlying array in the case that it does (saves the allocation and copy) http://gerrit.cloudera.org:8080/#/c/2592/6/java/kudu-client/src/main/java/org/kududb/client/ScanToken.java File java/kudu-client/src/main/java/org/kududb/client/ScanToken.java: Line 48: * method. This allows scan tokens to be created in a context separate of the "context separate of the scan execution" is a little vague sounding. I think a specific example would be better -- eg "This allows use cases such as generating scan tokens in the planner component of a query engine, then sending the tokens to execution nodes based on locality, and then instantiating the scanners on those nodes." Line 87: public ByteBuffer serialize() throws IOException { I generally find ByteBuffers harder to work with than byte[]. Is there a particular reason you've gone with it here? Line 103: public static KuduScanner deserializeIntoScanner(ByteBuffer buf, One thing that strikes me here is that if you return a Scanner and not a ScannerBuilder, the "execution node" can't make any changes to the spec before starting the scan, right? I don't see any immediate use case for it, but worth considering in terms of locking down the API. Line 104: KuduClient client) throws Exception { throwing Exception is pretty evil for a public API Line 176: // TODO can you add a bit more to this and the above TODO about what's missing and why it's not implemented yet? Line 193: public static class ScanTokenBuilder needs an interface annotation (inner classes don't inherit it from their containing class) http://gerrit.cloudera.org:8080/#/c/2592/6/java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java File java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java: Line 321: public void testScanTokens() throws Exception { any test for serializing and deserializing a token? Line 364: count.addAndGet(localCount); would it be correct to assert that localCount > 0? ie that all of the scanners did some work? Otherwise you could have a bug where 15 of the 16 tokens are empty and one has all the data http://gerrit.cloudera.org:8080/#/c/2592/6/java/kudu-mapreduce/src/main/java/org/kududb/mapreduce/KuduTableInputFormat.java File java/kudu-mapreduce/src/main/java/org/kududb/mapreduce/KuduTableInputFormat.java: Line 101: "kudu.mapreduce.encoded.predicates"; we need to make sure we release-note this change (it's incompatible) Line 281: private byte[] partitionKey; should specify that this is the 'start' key -- To view, visit http://gerrit.cloudera.org:8080/2592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I20eff9bf51e893226fc3bc47726565ca62c054e3 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert <[email protected]> Gerrit-Reviewer: Dan Burkert <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
