Weijie Tong created DRILL-7607:
----------------------------------
Summary: Dynamic credit based flow control
Key: DRILL-7607
URL: https://issues.apache.org/jira/browse/DRILL-7607
Project: Apache Drill
Issue Type: New Feature
Components: Server, Execution - RPC
Reporter: Weijie Tong
Assignee: Weijie Tong
Drill currently has static credit-based flow control between the batch sender
and receiver. That is, every sender sends its batches through the DataTunnel
under a static semaphore of 3 permits. On the receiver side there are two
cases: the UnlimitedRawBatchBuffer has a receiver semaphore of
6 * fragmentCount, while the SpoolingRawBatchBuffer behaves as if it had an
unlimited receiving semaphore, because it can flush data to disk.
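As a rough illustration of the static scheme described above, the sender side can be modeled as a semaphore with a fixed number of permits that are taken on send and returned on Ack. This is a minimal sketch, not Drill's actual DataTunnel code: the class and method names are hypothetical, and only the constants 3 and 6 * fragmentCount come from the description.

```java
import java.util.concurrent.Semaphore;

// Hypothetical illustration of static credit; not Drill's actual DataTunnel code.
final class StaticCreditSender {
    // The issue describes a fixed sender credit of 3.
    static final int STATIC_SENDER_CREDIT = 3;

    private final Semaphore credits = new Semaphore(STATIC_SENDER_CREDIT);

    // Returns true if a credit was available and the batch may be sent.
    // A real sender would block here rather than return false.
    boolean trySend() {
        return credits.tryAcquire();
    }

    // Called when the receiver's Ack arrives; returns one credit.
    void onAck() {
        credits.release();
    }

    int availableCredits() {
        return credits.availablePermits();
    }

    // Receiver-side credit of the UnlimitedRawBatchBuffer as described:
    // 6 * fragmentCount.
    static int unlimitedBufferCredit(int fragmentCount) {
        return 6 * fragmentCount;
    }
}
```

With this model, a fourth send is refused until one Ack has come back, regardless of how small the batches are or how much memory the receiver has.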
Static credit has the following weaknesses:
1. When the batches being sent are small (e.g. a single bigint column) and the
receiver has plenty of memory, the sender still cannot send its data out
quickly.
2. Because the static credit does not size the semaphore according to the
memory available on the corresponding receiver, the receiver still risks
running out of memory (OOM).
3. Because the sender semaphore is small, the sender cannot send batches back
to back: it has to wait for an Ack to release a permit. Meanwhile the sender's
execution pipeline stalls, and so do its leaf execution nodes.
Dynamic credit-based flow control could solve these problems. It starts from
the static credit flow control. The receiver then collects a number of batches
to calculate the average batch size. Based on the memory available on the
receiver side, the receiver computes a runtime sender credit and a
receiver-side total credit. The receiver sends the runtime sender credit to
the sender in the Ack response. The sender switches to the runtime sender
credit when it receives an Ack response that carries a runtime credit value.
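The proposal above could be sketched roughly as follows. The issue does not specify the formulas, so everything here is an assumption made for illustration: the total credit is taken as the receiver's memory budget divided by the average sampled batch size, the per-sender credit as an even split across senders, and all class, method, and parameter names are hypothetical.

```java
import java.util.concurrent.Semaphore;

// Hypothetical receiver-side calculation; the formulas are assumptions.
final class DynamicCredit {
    // Average size in bytes of the batches sampled so far (at least 1).
    static long averageBatchSize(long[] sampledBatchBytes) {
        if (sampledBatchBytes.length == 0) {
            throw new IllegalArgumentException("no batches sampled yet");
        }
        long sum = 0;
        for (long bytes : sampledBatchBytes) {
            sum += bytes;
        }
        return Math.max(1, sum / sampledBatchBytes.length);
    }

    // Assumed: how many average-sized batches fit into the memory budget.
    static int receiverTotalCredit(long memoryBudgetBytes, long avgBatchBytes) {
        long credit = memoryBudgetBytes / avgBatchBytes;
        return (int) Math.max(1, Math.min(Integer.MAX_VALUE, credit));
    }

    // Assumed: the total credit is split evenly across the senders.
    static int senderCredit(int totalCredit, int senderCount) {
        return Math.max(1, totalCredit / senderCount);
    }
}

// Hypothetical sender-side semaphore whose limit follows the runtime credit.
final class AdjustableCredits extends Semaphore {
    private int limit;

    AdjustableCredits(int initialLimit) {
        super(initialLimit);
        this.limit = initialLimit;
    }

    // Called when an Ack carries a runtime credit value.
    synchronized void onRuntimeCredit(int newLimit) {
        int delta = newLimit - limit;
        if (delta > 0) {
            release(delta);          // grow: add the extra permits
        } else if (delta < 0) {
            reducePermits(-delta);   // shrink: remove permits without blocking
        }
        limit = newLimit;
    }
}
```

Under these assumptions, a sender would start with `new AdjustableCredits(3)` (the static credit) and call `onRuntimeCredit` once the receiver has sampled enough batches. Permits held by in-flight batches are unaffected, since only the difference between the old and new limit is applied.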
--
This message was sent by Atlassian Jira
(v8.3.4#803005)