Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12323 )
Change subject: KUDU-2670: Part 1: Build ScanToken by KeyRange ...................................................................... Patch Set 20: (18 comments) http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java: http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@1942 PS20, Line 1942: * @param startPrimaryKey start splitting at the beginning of the tablet : * @param endPrimaryKey stop splitting at the end of the tablet Same. http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@1978 PS20, Line 1978: * @param startPrimaryKey start splitting at the beginning of the tablet : * @param endPrimaryKey stop splitting at the end of the tablet See my other comments about this in KuduTable. http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@1990 PS20, Line 1990: * @throws Exception MasterErrorException if the table doesn't exist Not your fault, but MasterErrorException hasn't existed since commit 0a792366e. Could you update this (and wherever you copied this from) to reflect that? http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@2015 PS20, Line 2015: * Nit: unnecessary empty line here. http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@2016 PS20, Line 2016: * @param startPrimaryKey start splitting at the beginning of the tablet : * @param endPrimaryKey stop splitting at the end of the tablet Same. http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@2080 PS20, Line 2080: }).join()); The join here means that this isn't an async call, and will likely muck with the internal machinery of the Kudu client. It should be possible to avoid it. Logically you're kicking off several independent asynchronous operations (getTableKeyRanges), converting each response into a list of KeyRanges, then combining the lists into one and returning that to the client. Deferred.group might be able to help here. You may also find it helpful to declare a new Deferred locally, capture it in one of these lambdas, and signal it only when the entire operation is done. See ConnectToCluster for an example of that pattern. http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/KeyRange.java File java/kudu-client/src/main/java/org/apache/kudu/client/KeyRange.java: http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/KeyRange.java@35 PS20, Line 35: * @param primaryKeyEnd the encoded priamry key wheere to stop in the key range (exclusive) primary http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/KeyRange.java@38 PS20, Line 38: public KeyRange(LocatedTablet tablet, Doc 'tablet'. Based on toString() it looks like it could be null; when does that happen and what does it mean? http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java File java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java: http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java@312 PS20, Line 312: It was used split tablet's It is used to split the tablet's http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java@416 PS20, Line 416: keyRanges Shouldn't this be newKeyRanges? Seems like a pretty serious bug (L419 could throw an IndexOutOfRangeException); why didn't any unit tests catch this? http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java@435 PS20, Line 435: keyRange.getPrimaryKeyStart() Could use primaryKeyStart here (local declared on L432). Same for primaryKeyEnd below. http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/KuduTable.java File java/kudu-client/src/main/java/org/apache/kudu/client/KuduTable.java: http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/KuduTable.java@226 PS20, Line 226: range ranges http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/KuduTable.java@230 PS20, Line 230: * @param primaryKeyStart start splitting at the beginning of the tablet : * @param primaryKeyEnd stop splitting at the end of the tablet This documentation is confusing: is this saying that if these are null, we will start at the beginning and/or stop at the end? If so, could you also add the meaning of the parameters when not null? http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/SplitKeyRangeRequest.java File java/kudu-client/src/main/java/org/apache/kudu/client/SplitKeyRangeRequest.java: http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/main/java/org/apache/kudu/client/SplitKeyRangeRequest.java@55 PS20, Line 55: * @param timeoutMillis Doc this one? http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/test/java/org/apache/kudu/client/TestSplitKeyRange.java File java/kudu-client/src/test/java/org/apache/kudu/client/TestSplitKeyRange.java: http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/test/java/org/apache/kudu/client/TestSplitKeyRange.java@20 PS20, Line 20: import static org.apache.kudu.test.ClientTestUtil.*; Please expand this. http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/test/java/org/apache/kudu/client/TestSplitKeyRange.java@121 PS20, Line 121: // 3.2 Coverage, but no data. RPC return a key range for tablet's MRS. If we flushed the MRS by sleeping on L59, how is this querying the MRS? http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-client/src/test/java/org/apache/kudu/client/TestSplitKeyRange.java@123 PS20, Line 123: primaryKeyEnd = KeyEncoder.encodePrimaryKey(partialRowEnd); Nit: indentation. http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala: http://gerrit.cloudera.org:8080/#/c/12323/20/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala@1043 PS20, Line 1043: // Add a SparkListener to count the number of tasks that end. : var actualNumTasks = 0 : val listener = new SparkListener { : override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = { : actualNumTasks += 1 : } : } : ss.sparkContext.addSparkListener(listener) This has turned out to be a flaky test pattern. See https://gerrit.cloudera.org/c/13263/ for details and a workaround. -- To view, visit http://gerrit.cloudera.org:8080/12323 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0502f5d64569e8b1d45e88de3cb36aa2e01234d0 Gerrit-Change-Number: 12323 Gerrit-PatchSet: 20 Gerrit-Owner: yangz <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yao Xu <[email protected]> Gerrit-Reviewer: yangz <[email protected]> Gerrit-Comment-Date: Mon, 13 May 2019 18:55:43 +0000 Gerrit-HasComments: Yes
