Re: how to understand the flink flow control

Ufuk Celebi Fri, 07 Aug 2015 06:07:39 -0700

Hey Zhijiang Wang,

I will update the docs next week with more information. The short version is 
that flow control happens via the buffer pools that Flink uses for produced and 
consumed intermediate results.

The slightly ;) longer version:

Each task has buffer pools. The size of these buffer pools depends on multiple 
things (per task manager):
- the configured number of network buffers [default 2048]
- the number of tasks running
- the number of consumed and produced outputs

Each consumed input (if you are looking into the code: each SingleInputGate) 
has a buffer pool associated with it and each produced intermediate result (see 
IntermediateResultPartition in the code) as well.

Each produced record is serialized into a buffer of the respective buffer pool 
of the produced result partition and dispatched to the consumer (either local 
or remote via network). After the buffer has been consumed it is recycled to 
the pool and can be used for the outstanding records. If the producer is faster 
than the consumer, these buffer will take longer to be available again and the 
producer will slow down by waiting on a buffer.

For local exchange the buffer is consumed as soon as the local consumer has 
deserialized the records and for remote exchange as soon as the network layer 
has dispatched the buffer.

For the input side, there is a similar mechanism. The network layer receives a 
buffer and copies it to the buffer pool of the respective input gate and queues 
the filled buffer to the (remote) input channel. If there is no buffer 
available at the input gate, the TCP channel is not read until a buffer is 
available again. This backpressures remote receivers, because their output 
buffers are not dispatched and cannot be recycled.

I hope this helps. If you have further questions, just post them here. I will 
update the docs with some figures, so this will be easier to follow.

– Ufuk

On 07 Aug 2015, at 03:37, wangzhijiang999 <[email protected]> wrote:

> As said in apache page, Flink's streaming runtime has natural flow control: 
> Slow downstream operators backpressure faster upstream operators.
> How to understand the flink natural flow control? 
> As i know, heron has the backpressure mechanism, if some tasks process 
> slowly, it will stop reading from source and notify other tasks to stop 
> reading from source.
> In flink, if the producer task process quickly, it will emit the results to 
> consumer. So the buffer in InputChannel of consumer wil be filled up, if the 
> consumer process slowly, how to control the upstream flow?
> 
> Thank you for any suggestions in advance!
> 
> 
> Best wishes,
> 
> Zhijiang Wang

Re: how to understand the flink flow control

Reply via email to