Meng Zhu created MESOS-9173:
-------------------------------

             Summary: Improve initial authentication/registration backoff 
handling in the agent and framework.
                 Key: MESOS-9173
                 URL: https://issues.apache.org/jira/browse/MESOS-9173
             Project: Mesos
          Issue Type: Improvement
            Reporter: Meng Zhu


Adding an initial backoff delay to framework and agent authentication and 
registration could mitigate the thundering herd effect and make the system 
more robust. Currently, there are several related issues:

In the agent:
- The initial delay for both authentication and registration is a random 
duration between 0 and `registration_back_off_factor`. Authentication should 
get its own initial delay setting instead of reusing the registration one.
- It is not clear why, or whether, the initial delay should be drawn from 
[0, `registration_back_off_factor`]. Since failed attempts use exponential 
backoff over [min, min + factor * 2^N], it seems more natural for the initial 
attempt to use [min, min + factor * 2^0]. We need to re-evaluate this.
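
To make the two intervals above concrete, here is a minimal Python sketch. It 
is not Mesos code: the function names, the parameter names `min_delay` and 
`factor`, and the use of seconds are assumptions made for illustration only.

```python
import random


def current_initial_delay(factor: float) -> float:
    """Initial delay as described for the agent today: uniform in [0, factor]."""
    return random.uniform(0.0, factor)


def backoff_interval(min_delay: float, factor: float, attempt: int) -> tuple:
    """Bounds [min, min + factor * 2^attempt] of the exponential backoff window."""
    return (min_delay, min_delay + factor * 2 ** attempt)


def backoff_delay(min_delay: float, factor: float, attempt: int) -> float:
    """Random delay drawn uniformly from the backoff window for this attempt.

    attempt=0 corresponds to the proposed initial interval
    [min, min + factor * 2^0].
    """
    low, high = backoff_interval(min_delay, factor, attempt)
    return random.uniform(low, high)
```

With `min_delay=1.0` and `factor=2.0`, the current scheme draws the initial 
delay from [0, 2], while the proposed scheme draws it from [1, 3], so the 
initial attempt follows the same rule as retry N=0 and never fires at time 
zero.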

In the scheduler driver:
- There is currently no initial backoff delay.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
