Noemi Pap-Takacs has uploaded a new patch set (#24). ( http://gerrit.cloudera.org:8080/18393 )
Change subject: IMPALA-4530: Implement in-memory merge of quicksorted runs
......................................................................

IMPALA-4530: Implement in-memory merge of quicksorted runs

This change aims to decrease back-pressure in the sorter. It offers
an alternative to the existing in-memory run formation strategy and
sorting algorithm by introducing a new in-memory merge level between
the in-memory quicksort and the external merge phase.

Instead of forming one big run, the sorter produces many smaller
in-memory runs (called miniruns), sorts each of them with quicksort,
then merges them in memory before spilling or serving GetNext().
The external merge phase remains the same.

The behavior is controlled by the MAX_SORT_RUN_SIZE development query
option, which determines the maximum number of pages in a 'minirun'.
The default value of MAX_SORT_RUN_SIZE is 0, which keeps the original
implementation of one big initial in-memory run. Other valid values
are integers of 2 and above. The recommended value is 10 or more, to
avoid high fragmentation with large workloads and variable-length
data.
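The minirun strategy described above can be illustrated with a minimal Python sketch. It is not the Impala implementation, which lives in be/src/runtime/sorter.cc and is written in C++. The function names are hypothetical, the run size is counted in rows rather than pages, Python's built-in sorted() stands in for quicksort, and rejecting values other than 0 or 2 and above is an assumption drawn from the option description.

```python
import heapq
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def validate_max_sort_run_size(value: int) -> int:
    """Accept 0 (single big run) or an integer >= 2 (assumed rule)."""
    if value == 0 or value >= 2:
        return value
    raise ValueError("MAX_SORT_RUN_SIZE must be 0 or an integer >= 2")


def sort_with_miniruns(rows: Iterable[T], max_run_size: int) -> Iterator[T]:
    """Sort rows either as one run or as merged miniruns.

    max_run_size is counted in rows here; the real option counts pages.
    """
    data: List[T] = list(rows)
    if validate_max_sort_run_size(max_run_size) == 0:
        # Original behavior: one big in-memory run, sorted in one go.
        return iter(sorted(data))

    # Form miniruns of at most max_run_size rows and sort each one.
    # sorted() is a stand-in for the in-memory quicksort.
    miniruns = [
        sorted(data[start:start + max_run_size])
        for start in range(0, len(data), max_run_size)
    ]
    # In-memory merge level: a k-way merge over the sorted miniruns.
    return heapq.merge(*miniruns)


if __name__ == "__main__":
    print(list(sort_with_miniruns([5, 3, 8, 1, 9, 2, 7], 2)))
```

Both paths yield the same ordering. The difference the change targets is that each minirun is small and can be sorted as soon as it is full, instead of waiting for one large run to be completed.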
Testing:
- added MAX_SORT_RUN_SIZE as an additional test dimension to
  test_sort.py with values [0, 2, 20]
- additional partial sort test case (inserting into partitioned
  kudu table)
- manual E2E testing

Change-Id: I58c0ae112e279b93426752895ded7b1a3791865c
---
M be/src/exec/partial-sort-node.cc
M be/src/exec/partial-sort-node.h
M be/src/runtime/sorter-internal.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/tuple-row-compare.h
M bin/perf_tools/perf-query.sh
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M tests/query_test/test_sort.py
12 files changed, 457 insertions(+), 64 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/18393/24
--
To view, visit http://gerrit.cloudera.org:8080/18393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58c0ae112e279b93426752895ded7b1a3791865c
Gerrit-Change-Number: 18393
Gerrit-PatchSet: 24
Gerrit-Owner: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tma...@apache.org>