Would appreciate feedback/comments on this proposal.

Thanks
Anindya

> On Feb 12, 2017, at 9:03 PM, Anindya Sinha <anindya_si...@apple.com> wrote:
> 
> Reference: https://issues.apache.org/jira/browse/MESOS-7087
> 
> Currently, we have at least 3 types of backoff:
> 1) Exponential backoff with randomness, as in framework/agent registration.
> 2) Exponential backoff with no randomness, as in status updates.
> 3) Linear backoff with randomness, as in executor registration.
> 
> In framework registration, for example, the delay before the nth retry is
> drawn from the range [0 .. b*2^(n-1)], as long as the interval is less than
> 1 min.
> 
> For clusters with a large number of frameworks and/or agents, the randomness
> may not be enough: the timeout can end up being very small for a substantial
> number of clients (agents and/or frameworks) because the allowed range is
> [0 .. <n>] for all retry attempts.
> 
> The following doc looks at an enhancement to the existing proposal to ensure
> that the timeout values are not extremely small, and that every subsequent
> retry has a timeout value at least as large as that of the previous iteration.
> 
> https://docs.google.com/document/d/1nUxvh6BbB8jv5G-MvckGj9XzFYLBrUM0O5Go_Zmdftk/edit?usp=sharing
> 
> Feedback welcome.
> 
> Thanks
> Anindya
> 
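
To make the ranges in the quoted text concrete, here is a rough Python sketch. It is not Mesos code and does not reproduce the scheme in the linked doc. The function names, the `floor` parameter, and the reading of "less than 1 min" as a 60-second cap on the upper bound are all assumptions for illustration. `current_backoff` draws from [0 .. b*2^(n-1)] as described for framework registration; `monotonic_backoff` is one possible way to get the two properties asked for (no very small timeouts, and each timeout at least as large as the previous one).

```python
import random


def current_backoff(b, n, cap=60.0, rng=random):
    """Delay in seconds for the nth retry (n >= 1), drawn from [0, b * 2**(n-1)].

    Assumption: the "less than 1 min" condition is modelled as a cap
    on the upper bound of the range.
    """
    upper = min(b * 2 ** (n - 1), cap)
    return rng.uniform(0.0, upper)


def monotonic_backoff(b, n, previous=0.0, floor=0.0, cap=60.0, rng=random):
    """Hypothetical variant: the delay never drops below the previous delay.

    `previous` is the delay used for retry n-1 (0.0 for the first retry).
    `floor` is an illustrative minimum delay; it is not taken from the doc.
    """
    upper = min(b * 2 ** (n - 1), cap)
    # The lower bound is the larger of the previous delay and the floor,
    # but it can never exceed the upper bound of this attempt.
    lower = min(max(previous, floor), upper)
    # Clamp to guard against floating-point rounding past the upper bound.
    return min(rng.uniform(lower, upper), upper)


if __name__ == "__main__":
    rng = random.Random(7087)
    previous = 0.0
    for n in range(1, 9):
        old = current_backoff(2.0, n, rng=rng)
        new = monotonic_backoff(2.0, n, previous=previous, floor=1.0, rng=rng)
        print(f"retry {n}: current={old:6.2f}s  monotonic={new:6.2f}s")
        previous = new
```

With many clients, `current_backoff` gives a noticeable fraction of them a near-zero delay on every attempt, which is the concern raised above. The variant trades some of that spread for a guaranteed minimum wait.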
