[jira] [Commented] (FLINK-10661) Initial credit should be configured in a separate parameter

zhijiang (JIRA) Mon, 24 Dec 2018 01:13:51 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-10661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16728278#comment-16728278
 ]


zhijiang commented on FLINK-10661:
----------------------------------

[~NicoK], thanks for your reply! I should have described it more directly. :)

The floating buffers are requested based on the sender's backlog, and the
receiver always tries to announce {{backlog+initial_credit}} to the sender so
that the transfer runs smoothly. Currently, {{initial_credit}} is taken from
the parameter {{taskmanager.network.memory.buffers-per-channel}}.

I think we should define a separate parameter for this extra credit. If we
tune the {{per-channel}} parameter down to 1, an overhead of only 1 extra
credit beyond the backlog might not be enough to keep the transfer smooth. In
other words, the sender may need to wait for credits before it can register a
subpartition as available for transfer.

In this case, the output queue usage is 100%, but the input queue usage may
not reach 100%. A separate parameter for the extra credits would give us more
control. For example, if {{per-channel}} is 1, we might still announce
{{backlog+3}} credits each time.
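
As a rough illustration of the arithmetic above, here is a minimal sketch. It
is not Flink's actual code: the class {{CreditSketch}}, the method
{{creditsToAnnounce}} and the {{extraCredit}} setting are hypothetical names
used only for this example.

```java
// Sketch only: hypothetical names, not part of Flink's code base.
public final class CreditSketch {

    /**
     * Credits the receiver tries to announce for one channel:
     * the sender's backlog plus some overhead on top of it.
     */
    public static int creditsToAnnounce(int senderBacklog, int overhead) {
        if (senderBacklog < 0 || overhead < 0) {
            throw new IllegalArgumentException("backlog and overhead must be >= 0");
        }
        return senderBacklog + overhead;
    }

    public static void main(String[] args) {
        int backlog = 4;

        // Today the overhead is the value of
        // taskmanager.network.memory.buffers-per-channel, tuned to 1 here.
        int buffersPerChannel = 1;
        System.out.println(creditsToAnnounce(backlog, buffersPerChannel)); // prints 5

        // With a separate (hypothetical) parameter the overhead can be
        // larger even though buffers-per-channel stays at 1.
        int extraCredit = 3;
        System.out.println(creditsToAnnounce(backlog, extraCredit)); // prints 7
    }
}
```

With the overhead decoupled, {{per-channel}} could stay at 1 to keep the
exclusive buffers small while the announced credits still run ahead of the
backlog.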

 

> Initial credit should be configured in a separate parameter
> -----------------------------------------------------------
>
>                 Key: FLINK-10661
>                 URL: https://issues.apache.org/jira/browse/FLINK-10661
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Network
>    Affects Versions: 1.5.4, 1.6.1
>            Reporter: zhijiang
>            Assignee: zhijiang
>            Priority: Minor
>
> In credit-based network flow control, the credits required on the receiver
> side are calculated as the backlog plus the initial credit, which equals the
> value of {{taskmanager.network.memory.buffers-per-channel}}. We add the
> initial credit as overhead on top of the backlog to reduce the chance of the
> sender waiting for credits. Ideally, sender and receiver work concurrently
> and do not block each other.
>  
> We found a bad case in some rebalance or rescale scenarios: the outqueue
> usage reaches 100% on the sender side, but the inqueue usage is about 50% or
> less. That means too few credits are announced to the sender although there
> are still many free credit resources on the receiver side, so resources are
> wasted.
>  
> It would be better if we could adjust the credit overhead to debug
> performance online. That requires a separate parameter for the initial
> credit, decoupled from {{taskmanager.network.memory.buffers-per-channel}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
