Sorabh Hamirwasia created DRILL-5508:
----------------------------------------
Summary: Flow control in Drill RPC layer
Key: DRILL-5508
URL: https://issues.apache.org/jira/browse/DRILL-5508
Project: Apache Drill
Issue Type: Improvement
Components: Execution - RPC
Reporter: Sorabh Hamirwasia
Drill uses Netty to implement its RPC layer. Netty internally has a
_ChannelOutboundBuffer_ where it stores all the data sent by the application
when the TCP send buffer is full. Netty also provides the configurable
_WRITE_BUFFER_HIGH_WATER_MARK_ and _WRITE_BUFFER_LOW_WATER_MARK_ options, which
indicate when the outbound buffer is full and when it can accept more data. The
channel's writability is turned on/off based on these parameters, which the
application can use to make smarter decisions. More information can be found
[here|https://netty.io/4.1/api/io/netty/channel/WriteBufferWaterMark.html].
Together these can be used to implement flow control in Drill. Today the only
flow control Drill has is based on the number of batches sent without an ack
(currently 3), which does not consider how much data is transferred as part of
those batches. Without proper flow control based on the water marks, Drill
simply overwhelms the pipeline.
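As an illustration, below is a minimal sketch (the water mark values and class
name are illustrative, not Drill's actual settings) of configuring the water
marks on a Netty Bootstrap and gating writes on the channel's writability:

{code:java}
import io.netty.bootstrap.Bootstrap;
import io.netty.channel.Channel;
import io.netty.channel.ChannelOption;
import io.netty.channel.WriteBufferWaterMark;

public class WaterMarkSketch {

  static void configure(Bootstrap bootstrap) {
    // Once the pending bytes in the ChannelOutboundBuffer exceed the high water
    // mark, Channel.isWritable() flips to false; once they drain below the low
    // water mark it flips back to true. Values here are illustrative only.
    bootstrap.option(ChannelOption.WRITE_BUFFER_WATER_MARK,
        new WriteBufferWaterMark(64 * 1024, 256 * 1024));  // low, high
  }

  static void writeIfPossible(Channel channel, Object msg) {
    if (channel.isWritable()) {
      channel.writeAndFlush(msg);
    }
    // else: back off; channelWritabilityChanged() fires on the pipeline's
    // handlers when the buffer drains below the low water mark.
  }
}
{code}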
With SASL encryption support in Drill 1.11, a new SaslEncryptionHandler is
inserted into the Drill channel pipeline. This handler takes a Drill ByteBuf,
encrypts it, and stores the encrypted bytes (>= the original buffer size) in
another ByteBuf. Memory consumption for that message is therefore doubled until
the next handler in the pipeline is called and the original buffer is released.
There is a risk that if multiple connections (say N) happen to encrypt large
data buffers (say of size D) at the same time, each will double its memory
consumption at that instant, for a total of Mc = N*2D. The same situation can
arise even without encryption when the connection count doubles (i.e. 2N
connections, each transferring D bytes of data). In a memory-constrained
environment this can be an issue if Mc is too large.
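To make the doubling concrete, here is a schematic outbound encoder (not
Drill's actual SaslEncryptionHandler, and with a placeholder instead of a real
SASL wrap call) showing that the original and encrypted buffers both stay
allocated until the framework releases the original after encode() returns:

{code:java}
import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.MessageToMessageEncoder;
import java.util.List;

// Schematic sketch only: copies each outbound ByteBuf into a second buffer of
// at least the same size, mimicking the memory profile of an encrypting handler.
public class EncryptingEncoderSketch extends MessageToMessageEncoder<ByteBuf> {

  @Override
  protected void encode(ChannelHandlerContext ctx, ByteBuf msg, List<Object> out) {
    byte[] plain = new byte[msg.readableBytes()];
    msg.readBytes(plain);

    byte[] cipher = wrap(plain);  // a real handler would add encryption overhead here
    ByteBuf encrypted = ctx.alloc().buffer(cipher.length);
    encrypted.writeBytes(cipher);

    // At this point memory for the message is roughly doubled: 'msg' is only
    // released by MessageToMessageEncoder after encode() returns.
    out.add(encrypted);
  }

  private byte[] wrap(byte[] plain) {
    return plain;  // placeholder for a real SASL wrap
  }
}
{code}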
To resolve both scenarios, flow control is required in the Drill RPC layer.
Basically, we can configure high/low water marks (as a percentage of the
ChannelOutboundBuffer) and the ChannelOutboundBuffer size (a multiple of the
chunk size) for Drill's channels. The application thread, which today writes
the entire message in one go, would instead need to split the message into
smaller chunks (of a possibly configurable size). Based on the channel's write
state, one or more chunks are written to the socket. If the channel's writable
state is false, the application thread blocks until it is notified of a state
change, at which point it can again send more chunks downstream (see the sketch
after the list below). In this way we achieve the following:
1) When encryption is disabled, Netty's ChannelOutboundBuffer will not be
overwhelmed; it will always have a steady stream of data to send over the
network.
2) When encryption is enabled, we will always send smaller chunks to the
pipeline for encryption rather than the entire data buffer. The memory doubling
then happens in smaller units, causing less memory pressure.
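Below is a minimal sketch of the chunking idea (class names, chunk size, and
the locking scheme are hypothetical, not existing Drill code): the sender
splits a message into chunks, blocks while the channel is not writable, and is
woken by a pipeline handler when writability is restored.

{code:java}
import io.netty.buffer.ByteBuf;
import io.netty.channel.Channel;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;

public class ChunkedWriter {
  private static final int CHUNK_SIZE = 64 * 1024;  // would be configurable
  private final Object writableLock = new Object();

  public void send(Channel channel, ByteBuf message) throws InterruptedException {
    while (message.isReadable()) {
      synchronized (writableLock) {
        // Block the application thread until Netty reports the channel writable.
        while (!channel.isWritable()) {
          writableLock.wait();
        }
      }
      int len = Math.min(CHUNK_SIZE, message.readableBytes());
      // Each retained slice is released downstream once the pipeline
      // (e.g. an encryption handler) is done with it.
      channel.writeAndFlush(message.readRetainedSlice(len));
    }
    message.release();
  }

  // Installed in the pipeline to wake blocked senders when the
  // ChannelOutboundBuffer drains below the low water mark.
  public class WritabilityMonitor extends ChannelInboundHandlerAdapter {
    @Override
    public void channelWritabilityChanged(ChannelHandlerContext ctx) {
      if (ctx.channel().isWritable()) {
        synchronized (writableLock) {
          writableLock.notifyAll();
        }
      }
      ctx.fireChannelWritabilityChanged();
    }
  }
}
{code}

The key point is that writes are gated per chunk on Channel.isWritable() rather
than handing the whole message to Netty at once, so the ChannelOutboundBuffer
stays roughly bounded by the configured high water mark plus one chunk.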
Note: This is just a high-level description of the problem and a potential
solution. More research/prototyping is needed to arrive at a proper solution.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)