Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14381 )


Change subject: WIP: IMPALA-9015: improve mt_dop scan scheduling
......................................................................

WIP: IMPALA-9015: improve mt_dop scan scheduling

Implement the longest-processing-time (LPT) algorithm for assigning
scan ranges to instances within a host. This is a standard algorithm
that works well in practice and fixes some specific bugs in the
current algorithm.

The previous approach tended to assign multiple ranges to the
first instance and induce skew. E.g. if the ranges were
[3, 4, 5, 6] and there were 4 instances, it would assign
[3, 4], [5], [6], []. This also had the unfortunate consequence
that not all instances actually got allocated scan ranges,
making scheduling hard to reason about.
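
For illustration only, here is a minimal sketch of longest-processing-time
assignment. It is not the code in scheduler.cc: the function name
AssignRangesLpt, the use of plain integer range sizes as the cost, and the
tie-breaking by lowest instance index are all assumptions made for this
example. It sorts the ranges largest-first and repeatedly gives the next
range to the instance with the least work assigned so far.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Impala): distributes 'range_sizes' across
// 'num_instances' instances using longest-processing-time-first.
// Returns one vector of assigned range sizes per instance.
std::vector<std::vector<int64_t>> AssignRangesLpt(
    std::vector<int64_t> range_sizes, int num_instances) {
  if (num_instances <= 0) return {};
  std::vector<std::vector<int64_t>> assignment(num_instances);

  // Consider the largest ranges first: O(n log n).
  std::sort(range_sizes.begin(), range_sizes.end(), std::greater<int64_t>());

  // Min-heap of (total size assigned so far, instance index). Ties on the
  // total are broken by the lower instance index (an assumption of this
  // sketch, chosen to make the output deterministic).
  using Entry = std::pair<int64_t, int>;
  std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
  for (int i = 0; i < num_instances; ++i) heap.push({0, i});

  // Give each range to the currently least-loaded instance.
  for (int64_t size : range_sizes) {
    Entry least = heap.top();
    heap.pop();
    assignment[least.second].push_back(size);
    heap.push({least.first + size, least.second});
  }
  return assignment;
}
```

Under these assumptions, the ranges [3, 4, 5, 6] with 4 instances come out
as [6], [5], [4], [3], so every instance receives a scan range.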

Testing:
Updated admission test to reflect this change.

Perf:
The algorithm is O(n log n) instead of O(n), where n is the
number of scan ranges allocated to a backend. This seems
worthwhile to get more even work distribution.

TODO - sanity test to make sure that performance is the
same or better

Change-Id: I45ed2dab835efeb64bb74891cb43065894892682
---
M be/src/scheduling/scheduler.cc
M testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-admission-slots.test
2 files changed, 72 insertions(+), 57 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/14381/1
--
To view, visit http://gerrit.cloudera.org:8080/14381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I45ed2dab835efeb64bb74891cb43065894892682
Gerrit-Change-Number: 14381
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong <[email protected]>
