Meng Zhu created MESOS-9173:
-------------------------------
Summary: Improve initial authentication/registration backoff
handling in the agent and framework.
Key: MESOS-9173
URL: https://issues.apache.org/jira/browse/MESOS-9173
Project: Mesos
Issue Type: Improvement
Reporter: Meng Zhu
Adding an initial backoff delay to framework and agent
authentication/registration could mitigate the thundering herd effect and
make the system more robust. Currently, there are several related issues:
In the agent:
- The initial delay for both authentication and registration is a random
duration between 0 and `registration_backoff_factor`. Authentication should
get its own initial delay instead of reusing the registration one.
- It is not clear whether, or why, the initial delay should be [0,
`registration_backoff_factor`]. Since failed attempts use exponential
backoff ([min, min+factor*2^N]), it seems more natural for the initial
delay to use [min, min+factor*2^0]. We need to re-evaluate this.
In the scheduler driver:
- There is currently no initial backoff delay.
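
To make the two candidate intervals for the agent concrete, here is a minimal
sketch. It is not Mesos code: the function names, the `min_delay` parameter,
and the assumption that delays are drawn uniformly from each interval are
hypothetical and only illustrate the ranges written above.

```python
import random


def initial_delay_current(factor: float) -> float:
    """Current agent behavior as described above: a draw from [0, factor].

    Hypothetical helper; the uniform distribution is an assumption.
    """
    return random.uniform(0.0, factor)


def backoff_delay(min_delay: float, factor: float, attempt: int) -> float:
    """A draw from [min_delay, min_delay + factor * 2^attempt].

    With attempt=0 this is the proposed initial interval [min, min + factor].
    Hypothetical helper; the uniform distribution is an assumption.
    """
    return random.uniform(min_delay, min_delay + factor * 2 ** attempt)


if __name__ == "__main__":
    factor, min_delay = 1.0, 0.5  # seconds; illustrative values only
    print("current initial: ", initial_delay_current(factor))
    print("proposed initial:", backoff_delay(min_delay, factor, 0))
    print("after 3 failures:", backoff_delay(min_delay, factor, 3))
```

The practical difference is that the first scheme allows a near-zero initial
delay, while the second never goes below `min_delay` and lines up with the
N = 0 case of the retry formula.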