Adar Dembo has posted comments on this change. Change subject: KUDU-1309: [java client] support tables with non-covering partition-key ranges ......................................................................
Patch Set 2: (35 comments) http://gerrit.cloudera.org:8080/#/c/3388/2/java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java File java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java: PS2, Line 139: This map : * is always the first to be updated, because that's the map from : * which all the lookups are done in the fast-path of the requests : * that need to locate a tablet. The second map to be updated is : * tablet2client, because it comes second in the fast-path : * of every requests that need to locate a tablet. The third map : * is only used to handle TabletServer disconnections gracefully. Nit: reading this again (I reviewed it eons ago), I think this part of the comment should broken up and spread across the different map's comments. Line 155: * Currently once a non-covered range is added to the cache, it is never Nit: prefix this part of the comment with "TODO:" since it is expected to change soon. Line 576: Deferred<AsyncKuduScanner.Response> openScanner(final AsyncKuduScanner scanner) { Do we need to preserve this indirection due to class/method visibility? If not, perhaps we can remove it? Line 685: // Otherwise fall through to below where a GetTableLocations lookup will occur Nit: end with a period. Line 1148: Pair<byte[], byte[]> nonCoveredRange = getNonCoveredRange(table.getTableId(), key); Nit: if you want, you can call getTableId() once outside of the loop, store it in a local variable, and use it here and on L1141. PS2, Line 1355: NonCoveredRangeCache nonCoveredRanges = nonCoveredRangeCaches.get(tableId); : if (nonCoveredRanges == null) { : nonCoveredRanges = new NonCoveredRangeCache(); : NonCoveredRangeCache oldCache = nonCoveredRangeCaches.putIfAbsent(tableId, nonCoveredRanges); : if (oldCache != null) { : nonCoveredRanges = oldCache; : } : } How about just: NonCoveredRangeCache nonCoveredRanges = nonCoveredRangeCaches.putIfAbsent(tableId, new NonCoveredRangeCache()); Line 1378: nonCoveredRanges.addNonCoveredRange(EMPTY_ARRAY, firstStartKey); Don't we need all of these calls to addNonCoveredRange() to be atomic? Without that, isn't it possible for the non-covered range view of a table to become inconsistent? Imagine two callers in this function. One has an up-to-date locations list, and the other has a stale one (i.e. some add/drop range calls happened in between). All of these addNonCoveredRange() calls get interspersed. The resulting view is totally wacky, no? http://gerrit.cloudera.org:8080/#/c/3388/2/java/kudu-client/src/main/java/org/kududb/client/AsyncKuduScanner.java File java/kudu-client/src/main/java/org/kududb/client/AsyncKuduScanner.java: Line 474: invalidate(); Nit: invalidate() is called in both branches, can we pull it out? http://gerrit.cloudera.org:8080/#/c/3388/2/java/kudu-client/src/main/java/org/kududb/client/AsyncKuduSession.java File java/kudu-client/src/main/java/org/kududb/client/AsyncKuduSession.java: Line 127: private final List<OperationResponse> failedLookups = new ArrayList<>(); I think you can use a CopyOnWriteArrayList here and omit the use of a lock if you're careful to add all of the failed lookups en masse. That is, add them to an intermediary list, construct a new CopyOnWriteArrayList using the intermediary (because you don't want to add entries one by one to a CopyOnWriteArrayList), then assign it to failedLookups. On second thought, I'm not sure this is possible because of the atomic "get all and clear" operation in ConvertBatchToListOfResponsesCB. I don't see how to handle that without wrapping the list in an AtomicReference which, while unintuitive, isn't necessarily a bad idea. Line 279: if (!operationsInLookup.isEmpty()) { Deferreds are weird, man. I've read the code in the two branches here over and over, and I still can't figure out how the mere semantic change of "call flushAllBatches() immediately" vs. "return a new Deferred with a chained callback to flushAllBatches() via OperationsInLookupDoneCB" is enough to handle the case of outstanding lookups. OK, I think I figured it out. lookupsDone is a class field and we do other stuff with it elsewhere. Line 303: Callback<Deferred<List<OperationResponse>>, ArrayList<BatchResponse>> { Nit: not related to your change, but why is ArrayList the second type and not List? Ahh, it's because Deferred.group() in flushAllBatches() returns a Deferred<ArrayList>. Wonderful. Line 306: batchResponses = batchResponses == null ? new ArrayList<BatchResponse>() : batchResponses; Is a null batchResponses actually possible? How? PS2, Line 308: // flushTablet() can return null when a tablet we wanted to flush was already flushed. Those : // nulls along with BatchResponses are then put in a list by Deferred.group(). We first need : // to filter the nulls out. OMG. Line 320: List<OperationResponse> responsesList = new ArrayList<>(size); If we're going to go through the trouble of precomputing the size, we ought to include failedLookups too. Line 397: if (tablet == null) { Why don't we perform this check to AUTO_FLUSH_SYNC on L377-384? Seems useful to figure out locally that we're going to write to a non-covered range than to make the round trip to the TS to learn the same thing. PS2, Line 399: new RowError(Status.NotFound("Write to a non-covered range partition"), : operation, null); Could you add a two-arg constructor for this case? PS2, Line 401: new OperationResponse(0, null, 0, operation, rowError); Likewise, perhaps add a new constructor for this? PS2, Line 416: if (lookupsDone != null && operationsInLookup.isEmpty()) { : lookupsDoneCopy = lookupsDone; : lookupsDone = null; : } Copied from addToBuffer(), right? Can it be decomposed into a helper with an intuitive name? Ideally it'd do the remove() too, but addToBuffer() wants to know if remove() returned anything, so then the helper would need to return a tuple (or a special pair class, or whatever). Line 422: operation.callback(response); Why do we do this? http://gerrit.cloudera.org:8080/#/c/3388/2/java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java File java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java: Line 57: * The table creator takes ownership of the rows. If either row is empty, then Is this copied from the C++ client? What does 'ownership' mean here? In any event, why not make copies of the rows like addSplitRow()? http://gerrit.cloudera.org:8080/#/c/3388/2/java/kudu-client/src/main/java/org/kududb/client/KuduRpc.java File java/kudu-client/src/main/java/org/kududb/client/KuduRpc.java: Line 77: return null; Why return null instead of throwing some sort of "unsupported operation" exception, here in the base class? http://gerrit.cloudera.org:8080/#/c/3388/2/java/kudu-client/src/main/java/org/kududb/client/NonCoveredRangeCache.java File java/kudu-client/src/main/java/org/kududb/client/NonCoveredRangeCache.java: Line 39: final class NonCoveredRangeCache { Nit: why 'final'? The class is package-private, so it seems like the only classes whom we're preventing from overriding are classes that are part of the client itself. Line 40: public static final Logger LOG = LoggerFactory.getLogger(NonCoveredRangeCache.class); Nit: can be private Line 59: if (COMPARATOR.compare(partitionKey, range.getKey()) >= 0 && Isn't this first comparison implied by the floorEntry() call? Line 61: return new Pair<>(range.getKey(), range.getValue()); Why can't we return the range as-is? Why do we need to make a copy? Line 77: if (nonCoveredRanges.put(startPartitionKey, endPartitionKey) == null) { May want to comment on the concurrent behavior (i.e. multiple callers in discoverNonCoveredRanges() at the same time). Seems like there'll be unnecessary put() calls here, but assuming the tablet listing is the same in each caller, in the end the map should have the right contents. Line 84: public String toString() { Looks like this will return an inconsistent string representation if nonCoveredRanges changes mid-call (due to the separation of entrySet() and size()). Can you fix that? Line 89: if (i++ < nonCoveredRanges.size()) sb.append(", "); If you used a Guava Joiner to convert this list of Strings into a single comma-separated String, you could omit the call to size() altogether. http://gerrit.cloudera.org:8080/#/c/3388/2/java/kudu-client/src/main/java/org/kududb/client/NonCoveredRangeException.java File java/kudu-client/src/main/java/org/kududb/client/NonCoveredRangeException.java: Line 26: @InterfaceStability.Unstable Why not evolving? http://gerrit.cloudera.org:8080/#/c/3388/2/java/kudu-client/src/main/java/org/kududb/client/RowResultIterator.java File java/kudu-client/src/main/java/org/kududb/client/RowResultIterator.java: Line 35: private static final RowResultIterator EMPTY = new RowResultIterator(0, null, null, null, null); If this is for tests only, perhaps make it more visible, annotate it with @VisibleForTesting, and omit the method exposing it? http://gerrit.cloudera.org:8080/#/c/3388/2/java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java File java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java: PS2, Line 243: * * [ 0, 50) : * * [ 50, 100) : * * [200, 300) Nit: what does the extra '*' mean? http://gerrit.cloudera.org:8080/#/c/3388/2/java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java File java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java: Line 104: KuduScanner s = syncClient.newScannerBuilder(table).scanRequestTimeout(9999999).build(); That's an odd choice of timeout. Why? And if you're looking for a large number, why not INT_MAX or something equivalent? http://gerrit.cloudera.org:8080/#/c/3388/2/java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java File java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java: Line 457: if (key == 0) session.flush(); Why flush after the very first operation? What's the significance of that? http://gerrit.cloudera.org:8080/#/c/3388/2/java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java File java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java: Line 35: tableName = TestKuduClient.class.getName() + "-" + System.currentTimeMillis(); FWIW, even though this is guaranteed to run with every test, if a test finishes in under one ms, the next test could end up with the same table name. Maybe use https://github.com/junit-team/junit4/wiki/Rules#testname-rule to help? Line 221: createOptions.setNumReplicas(1); Why do the new tests use only one replica, but the old ones don't? -- To view, visit http://gerrit.cloudera.org:8080/3388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id65a1f8a95bb16fc0360a17021a391afd3c9d03f Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert <d...@cloudera.com> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Dan Burkert <d...@cloudera.com> Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes