Thomas Tauber-Marshall has uploaded a new patch set (#8).

Change subject: IMPALA-3742: partitions DMLs for Kudu tables
......................................................................

IMPALA-3742: partitions DMLs for Kudu tables

Bulk DMLs (INSERT, UPSERT, UPDATE, and DELETE) for Kudu are currently
painful because we just send rows randomly, which creates a lot of
work for Kudu since it partitions and sorts data before writing,
causing writes to be slow.

We can alleviate this by sending the rows to Kudu already partitioned
and sorted. This patch partitions the rows according to Kudu's
partitioning scheme. A followup patch will deal with sorting.

It accomplishes this by inserting an exchange node into the plan
before the DML operation. The DataStreamSender then uses a new
abstraction, DataStreamPartitioner, that calls into the Kudu client
to determine the partition for each row.

Testing:
- Updated planner tests.
- Manually verified the partitioning works as expected.

Change-Id: Ic10b3295159354888efcde3df76b0edb24161515
---
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/coordinator.cc
A be/src/runtime/data-stream-partitioner.cc
A be/src/runtime/data-stream-partitioner.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/scheduling/scheduler.cc
M bin/impala-config.sh
M common/thrift/Partitions.thrift
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/DataPartition.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-delete.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-update.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
23 files changed, 586 insertions(+), 104 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/6037/8
--
To view, visit http://gerrit.cloudera.org:8080/6037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic10b3295159354888efcde3df76b0edb24161515
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
