Csaba Ringhofer has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18184 )
Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
......................................................................
IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Based on a 3-way partitioning implementation by Kurt Deschler.
3-way quicksort performs much better on data with a large number of
duplicates, but has a small regression when the number of distinct
values (NDV) is large.
This adaptive implementation keeps the advantages of both 2-way
and 3-way quicksort. If duplicates are found during pivot selection
(among the 3 randomly selected candidates), the 3-way partitioning
function is called in SortHelper; otherwise partitioning goes 2-way.
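
The adaptive choice described above can be illustrated with a minimal,
standalone C++ sketch. It is not the code in sorter-ir.cc: it sorts a
plain std::vector<int>, and every name in it (MedianOfThree,
Partition2Way, Partition3Way, AdaptiveQuickSort, AdaptiveSort) is
hypothetical. It assumes the pivot is the median of three randomly
chosen elements and that any equal pair among the three counts as
"duplicates found".

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

using Iter = std::vector<int>::iterator;

// Returns whichever of the three iterators points at the median value.
// If two of the three values are equal, the median is that repeated value.
Iter MedianOfThree(Iter a, Iter b, Iter c) {
  if (*a < *b) {
    if (*b < *c) return b;
    return (*a < *c) ? c : a;
  }
  if (*a < *c) return a;
  return (*b < *c) ? c : b;
}

// 2-way partition: parks the pivot at the end, splits the rest into
// "< pivot" and ">= pivot", then moves the pivot between the two parts.
Iter Partition2Way(Iter first, Iter last, Iter pivot_it) {
  Iter back = last - 1;
  std::iter_swap(pivot_it, back);
  const int pivot = *back;
  Iter mid = std::partition(first, back, [pivot](int x) { return x < pivot; });
  std::iter_swap(mid, back);
  return mid;
}

// 3-way partition: rearranges into "< pivot", "== pivot", "> pivot" and
// returns the bounds [lt, gt) of the middle run of equal elements.
std::pair<Iter, Iter> Partition3Way(Iter first, Iter last, int pivot) {
  Iter lt = first, it = first, gt = last;
  while (it != gt) {
    if (*it < pivot) {
      std::iter_swap(lt++, it++);
    } else if (pivot < *it) {
      std::iter_swap(it, --gt);
    } else {
      ++it;
    }
  }
  return {lt, gt};
}

void AdaptiveQuickSort(Iter first, Iter last, std::mt19937& rng) {
  const std::ptrdiff_t n = last - first;
  if (n < 2) return;
  if (n == 2) {
    if (*(first + 1) < *first) std::iter_swap(first, first + 1);
    return;
  }
  // Draw three distinct positions so that equal values really are
  // duplicates in the data, not the same element sampled twice.
  std::uniform_int_distribution<std::ptrdiff_t> pick(0, n - 1);
  const std::ptrdiff_t i = pick(rng);
  std::ptrdiff_t j = pick(rng);
  while (j == i) j = pick(rng);
  std::ptrdiff_t k = pick(rng);
  while (k == i || k == j) k = pick(rng);

  Iter a = first + i, b = first + j, c = first + k;
  Iter med = MedianOfThree(a, b, c);
  const bool duplicates = (*a == *b) || (*b == *c) || (*a == *c);

  if (duplicates) {
    // Duplicates seen: the equal run is placed once and never revisited.
    const std::pair<Iter, Iter> eq = Partition3Way(first, last, *med);
    AdaptiveQuickSort(first, eq.first, rng);
    AdaptiveQuickSort(eq.second, last, rng);
  } else {
    // No duplicates seen: use the cheaper 2-way partition.
    Iter mid = Partition2Way(first, last, med);
    AdaptiveQuickSort(first, mid, rng);
    AdaptiveQuickSort(mid + 1, last, rng);
  }
}

// Convenience wrapper; the fixed default seed is an assumption that
// keeps the sketch deterministic.
void AdaptiveSort(std::vector<int>& v, unsigned seed = 42) {
  std::mt19937 rng(seed);
  AdaptiveQuickSort(v.begin(), v.end(), rng);
}
```

Calling AdaptiveSort(v) on a low-NDV vector will mostly take the 3-way
branch, while unique data always takes the 2-way branch, because three
distinct positions then never hold equal values.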
Some benchmark results:
On a view created from 4 tpch_parquet lineitem tables
Full sort, 1 node, 1 run - no spills (only in-memory sort is changed)
Sort time with the adaptive implementation during query execution,
relative to the original implementation (sort node profile; original = 1):
+----------------------------------------------+----------------+--------------------+
| Test                                         | Original 2-way | Adaptive Quicksort |
+----------------------------------------------+----------------+--------------------+
| select * order by l_linestatus, NDV=2        | 1              | 0.67               |
| select l_shipmode order by l_shipmode, NDV=7 | 1              | 0.42               |
| select * order by l_shipmode, NDV=7          | 1              | 0.57               |
| large NDV, unique data                       | 1              | 1                  | (no difference)
+----------------------------------------------+----------------+--------------------+
Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202
Reviewed-on: http://gerrit.cloudera.org:8080/18184
Reviewed-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Csaba Ringhofer <[email protected]>
Tested-by: Csaba Ringhofer <[email protected]>
---
M be/src/runtime/sorter-internal.h
M be/src/runtime/sorter-ir.cc
M be/src/runtime/sorter.cc
M be/src/util/tuple-row-compare.h
4 files changed, 189 insertions(+), 50 deletions(-)
Approvals:
Impala Public Jenkins: Looks good to me, approved
Csaba Ringhofer: Looks good to me, approved; Verified
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202
Gerrit-Change-Number: 18184
Gerrit-PatchSet: 13
Gerrit-Owner: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>