[ 
https://issues.apache.org/jira/browse/FLINK-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16717728#comment-16717728
 ] 

Stephan Ewen commented on FLINK-11037:
--------------------------------------

I am not generally opposed to changing the heuristic. A few things to watch out 
for, though:

* If there is anything in the network stack such that the transport of streams 
is not fair between the different streams, then the system will go into a 
problematic state under backpressure. So we need to be careful to preserve 
overall fairness.

* I would be more comfortable with such a change if it had been running for a 
while in challenging setups before we make it the new default mechanism. We could 
initially have it behind a feature flag, for example, and if it runs well for a 
few months in a large setup (like Alibaba), we can make it the new default.

* Complicated algorithms that "observe and adjust" are commonly more 
error-prone and less stable than simpler, inherently stable/fair algorithms. I 
think before we go down that route, we should be sure that it is worth it! Do 
you have any indication of which situations it improves and how significant that 
improvement is?

> Introduce another greedy mechanism for distributing floating buffers
> --------------------------------------------------------------------
>
>                 Key: FLINK-11037
>                 URL: https://issues.apache.org/jira/browse/FLINK-11037
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Network
>    Affects Versions: 1.8.0
>            Reporter: zhijiang
>            Assignee: zhijiang
>            Priority: Minor
>
> The current mechanism for distributing floating buffers is fair to all the 
> listeners. In detail, each input channel can request only one floating buffer 
> at a time, even if the channel actually needs more. The channel then has to 
> loop, requesting floating buffers until its demand is satisfied or the pool 
> is exhausted.
> Generally speaking, this approach seems fair to all the concurrent channels 
> invoked by the netty nio thread. But every request from LocalBufferPool needs to 
> take a synchronized lock, and it is hard to say which way of distributing the 
> available floating buffers behaves better in real scenarios.
> Therefore we propose another, greedy mechanism that requests more floating 
> buffers each time. In the extreme case, we can even request all the required 
> buffers at once, or a partial number set via configured parameters. On the other 
> side, LocalBufferPool can also decide how many floating buffers should be 
> assigned based on factors such as the total number of channels and the total 
> number of floating buffers.
> The motivation is to make better use of floating buffer resources, and it may 
> need extra metrics for adjusting the mechanism dynamically.
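
To make the trade-off between the two request strategies concrete, here is a 
minimal, self-contained sketch. It is not Flink's LocalBufferPool code: the class 
FloatingBufferSketch, the method distribute and the parameter maxPerRequest are 
hypothetical names used only for illustration, and the simulation ignores 
locking and the netty threads entirely. It assumes channels take turns 
requesting from a shared pool and that each request is granted at most 
maxPerRequest buffers.

```java
import java.util.Arrays;

public class FloatingBufferSketch {

    /**
     * Hypothetical simulation: channels take turns requesting floating buffers
     * from a shared pool. Each request is granted at most maxPerRequest buffers.
     * maxPerRequest = 1 models the one-buffer-per-request behaviour described
     * in the issue; a large value models the greedy variant.
     */
    public static int[] distribute(int[] demand, int poolSize, int maxPerRequest) {
        int[] granted = new int[demand.length];
        int remaining = poolSize;
        boolean progress = true;
        // Keep looping over the channels until the pool is empty or nobody needs more.
        while (remaining > 0 && progress) {
            progress = false;
            for (int ch = 0; ch < demand.length && remaining > 0; ch++) {
                int missing = demand[ch] - granted[ch];
                if (missing <= 0) {
                    continue; // this channel is already satisfied
                }
                int grant = Math.min(missing, Math.min(maxPerRequest, remaining));
                granted[ch] += grant;
                remaining -= grant;
                progress = true;
            }
        }
        return granted;
    }

    public static void main(String[] args) {
        int[] demand = {6, 6, 6}; // each channel needs 6 floating buffers
        int pool = 8;             // only 8 are available in total

        // One buffer per request: prints [3, 3, 2]
        System.out.println(Arrays.toString(distribute(demand, pool, 1)));
        // Greedy, all required buffers per request: prints [6, 2, 0]
        System.out.println(Arrays.toString(distribute(demand, pool, Integer.MAX_VALUE)));
    }
}
```

In this toy model the greedy variant issues far fewer requests (and so would 
take the lock less often), but when the pool is smaller than the total demand 
the channels that ask first can starve the rest, which is the fairness concern 
raised in the comment above.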



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
