Jinfeng Ni created DRILL-5223:
---------------------------------
Summary: Drill should ensure balanced workload assignment at node
level in order to get better query performance
Key: DRILL-5223
URL: https://issues.apache.org/jira/browse/DRILL-5223
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Reporter: Jinfeng Ni
Drill's work assignment logic currently aims to balance the workload across
different minor fragments (or slices) and to honor data affinity in order to get
as many local reads as possible.
However, when the number of work units cannot be evenly divided by the number
of minor fragments, the remaining work units tend to go to the first subset of
drill endpoints. This means the drill endpoints assigned the remaining work
units could have a larger workload than the rest. When MuxExchange is enabled
(the default), all the minor fragments on the same node have to send data to a
single Muxer per node, so unbalanced workload assignment at the node level
could impact query elapsed time, which is essentially decided by the slowest
drill endpoint.
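To make the imbalance concrete, here is a minimal sketch in Python. It is a toy
model, not Drill's actual assignment code: the function names are hypothetical,
and it assumes every node runs the same number of minor fragments and that
fragments are laid out contiguously per node, so that leftover work units land
on the first nodes. The second function shows one possible way to spread the
remainder across nodes instead.

```python
def skewed_node_loads(work_units, nodes, fragments_per_node):
    """Per-node load when leftover units go to the first minor fragments.

    Assumption (illustrative only): fragments 0..k-1 live on node 0,
    the next k on node 1, and so on.
    """
    total_fragments = nodes * fragments_per_node
    base, remainder = divmod(work_units, total_fragments)
    loads = [0] * nodes
    for frag in range(total_fragments):
        node = frag // fragments_per_node
        # The first `remainder` fragments each take one extra work unit.
        loads[node] += base + (1 if frag < remainder else 0)
    return loads


def balanced_node_loads(work_units, nodes, fragments_per_node):
    """Per-node load when leftover units are spread round-robin over nodes."""
    total_fragments = nodes * fragments_per_node
    base, remainder = divmod(work_units, total_fragments)
    loads = [base * fragments_per_node] * nodes
    for extra in range(remainder):
        loads[extra % nodes] += 1
    return loads


if __name__ == "__main__":
    # 18 work units, 4 nodes, 3 minor fragments per node.
    print(skewed_node_loads(18, 4, 3))    # [6, 6, 3, 3]
    print(balanced_node_loads(18, 4, 3))  # [5, 5, 4, 4]
```

In this toy case the busiest node drops from 6 work units to 5. Since the
per-node Muxer has to wait for all local minor fragments, the busiest node is
the one that bounds elapsed time.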
A prototype experimental run shows that, with more balanced workload
assignment, Drill shows quite significant improvement for most TPC-H
queries.