Yao Xu has uploaded a new patch set (#23) to the change originally created by yangz. ( http://gerrit.cloudera.org:8080/12323 )

Change subject: KUDU-2670: Part 1: Build ScanToken by KeyRange
......................................................................

KUDU-2670: Part 1: Build ScanToken by KeyRange

When reading a Kudu table from Spark, a tablet that holds a large
amount of data takes a long time to read. The reason is that KuduRDD
generates one scan token per tablet, so a single Spark task has to
process all of the data in that tablet.

This patch sends a SplitKeyRange RPC to the tablet server to split the
tablet's primary key range into multiple primary key ranges by size,
and generates one scan token per primary key range.

Change-Id: I0502f5d64569e8b1d45e88de3cb36aa2e01234d0
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
A java/kudu-client/src/main/java/org/apache/kudu/client/KeyRange.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduTable.java
A java/kudu-client/src/main/java/org/apache/kudu/client/SplitKeyRangeRequest.java
A java/kudu-client/src/main/java/org/apache/kudu/client/SplitKeyRangeResponse.java
A java/kudu-client/src/test/java/org/apache/kudu/client/TestSplitKeyRange.java
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduReadOptions.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/KuduTestSuite.scala
M java/kudu-test-utils/src/main/java/org/apache/kudu/test/ClientTestUtil.java
13 files changed, 798 insertions(+), 15 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/23/12323/23
--
To view, visit http://gerrit.cloudera.org:8080/12323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0502f5d64569e8b1d45e88de3cb36aa2e01234d0
Gerrit-Change-Number: 12323
Gerrit-PatchSet: 23
Gerrit-Owner: yangz <zhe...@gmail.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yao Xu <oclarms....@gmail.com>
Gerrit-Reviewer: yangz <zhe...@gmail.com>
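
The size-based splitting described in the commit message can be illustrated with a small standalone sketch. This is not the code from the patch and not the tablet server's actual algorithm: the class `KeyRangeSplitSketch`, its `Range` type, and the `splitBySize` method are hypothetical names made up for this illustration, and integer keys stand in for encoded primary keys. It assumes the tablet is already described as consecutive chunks with estimated sizes, and greedily merges chunks until each resulting range holds at least the requested split size, so that one scan token (and therefore one Spark task) could be created per range.

```java
import java.util.ArrayList;
import java.util.List;

public class KeyRangeSplitSketch {

    /** Hypothetical key range [startKey, stopKey) with an estimated data size. */
    public static final class Range {
        public final int startKey;
        public final int stopKey;
        public final long sizeBytes;

        public Range(int startKey, int stopKey, long sizeBytes) {
            this.startKey = startKey;
            this.stopKey = stopKey;
            this.sizeBytes = sizeBytes;
        }

        @Override
        public String toString() {
            return "[" + startKey + ", " + stopKey + ") ~" + sizeBytes + " bytes";
        }
    }

    /**
     * Merges consecutive chunks of one tablet into ranges of at least
     * splitSizeBytes each. The final range keeps whatever data is left over,
     * so it may be smaller than splitSizeBytes.
     */
    public static List<Range> splitBySize(List<Range> chunks, long splitSizeBytes) {
        if (splitSizeBytes <= 0) {
            throw new IllegalArgumentException("splitSizeBytes must be positive");
        }
        List<Range> result = new ArrayList<>();
        Range first = null;   // first chunk of the range currently being built
        Range last = null;    // most recent chunk added to that range
        long accumulated = 0; // bytes collected for that range so far
        for (Range chunk : chunks) {
            if (first == null) {
                first = chunk;
            }
            last = chunk;
            accumulated += chunk.sizeBytes;
            if (accumulated >= splitSizeBytes) {
                result.add(new Range(first.startKey, last.stopKey, accumulated));
                first = null;
                accumulated = 0;
            }
        }
        if (first != null) {
            // Leftover chunks that did not reach the split size.
            result.add(new Range(first.startKey, last.stopKey, accumulated));
        }
        return result;
    }

    public static void main(String[] args) {
        // Five chunks of 40 bytes each, covering keys [0, 500).
        List<Range> chunks = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            chunks.add(new Range(i * 100, (i + 1) * 100, 40L));
        }
        // Each resulting range would back one scan token.
        for (Range r : splitBySize(chunks, 100L)) {
            System.out.println("scan token for " + r);
        }
    }
}
```

With five 40-byte chunks and a split size of 100 bytes, the sketch yields two ranges, `[0, 300)` and `[300, 500)`, instead of a single range covering the whole tablet.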