Anand Mazumdar created MESOS-5359:
-------------------------------------

             Summary: The scheduler library should have a delay before initiating a connection with master.
                 Key: MESOS-5359
                 URL: https://issues.apache.org/jira/browse/MESOS-5359
             Project: Mesos
          Issue Type: Bug
    Affects Versions: 0.29.0
            Reporter: Anand Mazumdar

Currently, the scheduler library does not have an artificially induced delay when trying to initially establish a connection with the master. In the event of a master failover or ZK disconnect, a large number of frameworks can get disconnected at once and then overwhelm the master with TCP SYN requests as they all try to reconnect at the same time. On a large cluster with many agents, the master is already overwhelmed with handling connection requests from the agents, so the framework reconnections compound the issue further.
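
One way to spread out the reconnection attempts is to have each scheduler wait a random amount of time before its first connection attempt. The following is only a minimal sketch of that idea, not the actual fix: `initialConnectDelay` and its `maxDelay` parameter are hypothetical names that do not exist in Mesos, and the uniform distribution is an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <random>

// Hypothetical helper (not part of the Mesos API): picks a uniformly random
// delay in [0, maxDelay] to wait before the first connection attempt, so that
// many schedulers disconnected at the same moment do not all reconnect at once.
template <typename Rng>
std::chrono::milliseconds initialConnectDelay(
    std::chrono::milliseconds maxDelay,
    Rng& rng)
{
  // A negative upper bound would make the distribution below ill-formed.
  assert(maxDelay.count() >= 0);

  std::uniform_int_distribution<std::chrono::milliseconds::rep> dist(
      0, maxDelay.count());

  return std::chrono::milliseconds(dist(rng));
}
```

A caller would seed a generator once (for example `std::mt19937 rng(std::random_device{}())`) and wait for `initialConnectDelay(maxDelay, rng)` before opening the connection to the master. The upper bound would presumably need to be configurable, since a suitable value depends on the number of frameworks in the cluster.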