Thomas Tauber-Marshall has uploaded a new patch set (#2).

Change subject: DRAFT - IMPALA-5498: Support for partial sorts
......................................................................

DRAFT - IMPALA-5498: Support for partial sorts

Impala currently supports total sorts (the entire set of data is
sorted) and top-n sorts (only the highest/lowest n elements are
sorted). This patch adds the ability to do partial sorts, where the
data is divided up into some number of subsets, each of which is
sorted individually.

It accomplishes this by adding a new exec node, PartialSortNode.
When PartialSortNode::GetNext() is called, it retrieves input up to
its memory limit, uses the existing Sorter class to sort it, and
outputs it. This is faster than a total sort with SortNode as it
avoids the need to spill if the input is larger than the memory
limit.

In the planner, the SortNode plan node is used, with an enum value
indicating if it is a total or partial sort.

As a first use case, partial sort is used where a total sort was
used previously for inserts into Kudu.

This patch is a work in progress, and needs to be polished and
tested.

Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
---
M be/src/exec/CMakeLists.txt
M be/src/exec/exec-node.cc
A be/src/exec/partial-sort-node.cc
A be/src/exec/partial-sort-node.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
13 files changed, 388 insertions(+), 41 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/7267/2
--
To view, visit http://gerrit.cloudera.org:8080/7267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
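
For readers unfamiliar with the idea, the partial sort described in the
commit message can be illustrated with a minimal standalone sketch. This
is not the code in the patch: PartialSortSketch and max_rows_in_memory
are hypothetical names, a row count stands in for the node's memory
limit, and std::sort stands in for Impala's Sorter class.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: sorts the input in independent runs of
// at most max_rows_in_memory rows and emits the runs in arrival order.
// Each run is sorted; the concatenated output generally is not.
std::vector<int> PartialSortSketch(const std::vector<int>& input,
                                   std::size_t max_rows_in_memory) {
  assert(max_rows_in_memory > 0);
  std::vector<int> output;
  output.reserve(input.size());
  for (std::size_t start = 0; start < input.size();
       start += max_rows_in_memory) {
    const std::size_t end =
        std::min(start + max_rows_in_memory, input.size());
    // "Retrieve input up to the memory limit": copy out one run.
    std::vector<int> run(
        input.begin() + static_cast<std::ptrdiff_t>(start),
        input.begin() + static_cast<std::ptrdiff_t>(end));
    // Sort only this run; nothing is merged across runs, so there is
    // never more than one run's worth of rows to hold at a time.
    std::sort(run.begin(), run.end());
    output.insert(output.end(), run.begin(), run.end());
  }
  return output;
}
```

With a limit of 3 rows, the input 5, 3, 1, 4, 2, 0 comes out as
1, 3, 5, 0, 2, 4: each run of three is sorted, but the output as a whole
is not, which is the trade-off the commit message describes.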