Grant Henke has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12484 )

Change subject: KUDU-2672: [spark] Optionally repartition to match Kudu partitions
......................................................................

KUDU-2672: [spark] Optionally repartition to match Kudu partitions

Adds a write option to repartition the data to match
the target Kudu partitions. Additionally provides the
option to sort while repartitioning.

Repartitioning ensures that each task/client writes to
only a single tablet. This improves throughput through
better batching, especially for tables with a large
number of partitions.

Additionally, sorting before writing to Kudu reduces the
number of compactions needed and can improve
sustained throughput.
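
The idea described above can be illustrated with a small,
dependency-free sketch. This is a simplified model and not the
code in this change: the Row type, the tabletIndex function, and
the modulo-based mapping are all hypothetical stand-ins. A real
Kudu table derives the target tablet from its hash/range partition
schema, and the real write path works on Spark partitions.

```scala
// Simplified model of "repartition to match the target partitions,
// optionally sorting within each one". Not the kudu-spark implementation.
object RepartitionSketch {
  // Hypothetical row: an integer key and a payload.
  final case class Row(key: Int, value: String)

  // Hypothetical stand-in for the row-to-tablet mapping. Assumption:
  // a plain modulo on the key, chosen only to keep the sketch short.
  def tabletIndex(row: Row, numTablets: Int): Int =
    Math.floorMod(row.key, numTablets)

  // Group rows so that each output partition holds rows for exactly one
  // tablet. When sort is true, rows within a partition are ordered by key.
  def repartition(
      rows: Seq[Row],
      numTablets: Int,
      sort: Boolean): Map[Int, Seq[Row]] = {
    val grouped = rows.groupBy(r => tabletIndex(r, numTablets))
    if (sort) grouped.map { case (tablet, rs) => tablet -> rs.sortBy(_.key) }
    else grouped
  }
}
```

In this model, a writer that handles one entry of the resulting
map only ever sends rows to one tablet, which is the batching
property described above. With sort enabled, the rows also arrive
in key order within each partition.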

Change-Id: I8763615997bccc08901235841149fc3bacb321e7
Reviewed-on: http://gerrit.cloudera.org:8080/12484
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <[email protected]>
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduWriteOptions.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/RowConverter.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
5 files changed, 184 insertions(+), 22 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Adar Dembo: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/12484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8763615997bccc08901235841149fc3bacb321e7
Gerrit-Change-Number: 12484
Gerrit-PatchSet: 4
Gerrit-Owner: Grant Henke <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Will Berkeley <[email protected]>
