On 28 May 2014, at 12:49, Willy Tarreau <w...@1wt.eu> wrote:

> On Wed, May 28, 2014 at 12:35:17PM +0400, Dmitry Sivachenko wrote:
>>> - otherwise, we redispatch starting on the first retry as you suggest. But
>>> then we have two possibilities for the delay before reconnecting. If the
>>> server farm has more than 1 server and the balance algorithm is not a hash
>>> nor "first", then we don't apply the delay because we expect to land on a
>>> different server with a high probability. Otherwise we keep the delay
>>> because we're almost certain to land on the same server.
>>>
>>> This way it continues to silently mask occasional server restarts and is
>>> optimally efficient in stateless farms when there's a possibility to quickly
>>> pick another server. Do you see any other point that needs specific care ?
>>
>> I would export that magic "1 second" as a configuration parameter (with 0
>> meaning no delay).
>
> I'm not sure we need to add another tunable just for this.

Okay.

>
>> After all, we could fail to connect not only because of server restart, but
>> also because a switch or a router dropped a packet.
>
> No, because a dropped packet is already handled by the TCP stack. Here the
> haproxy retry is really about retrying after an explicit failure (server
> responded that the port was closed). Also, the typical TCP retransmit
> interval for dropped packets in the network stack is 3s, so we're already
> 3 times as fast as the TCP stack. I don't think it's reasonable to always
> kill this delay when retrying on the same server. We used to have that in
> the past and people were complaining that we were hammering servers for no
> reason, since there's little chance that a server which is not started will
> suddenly be ready in the next 100 microseconds.
>

I mean that with timeout connect=100ms (a good value for a local network,
IMO), we give up long before the TCP retransmit timeout, so a packet dropped
by a switch shows up as a failed connection attempt instead of being
recovered by the TCP stack. A switch drops packets randomly, so it may well
pass the next one even if we retry immediately.

If we had a tunable (with a default of 1 second), people would have more
freedom in such situations.
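
To make the proposal concrete, here is a minimal Python sketch of the delay
rule quoted above combined with the suggested tunable. It is not haproxy
code: the function name, the `delay_setting_ms` parameter, and the set of
hash-based algorithm names are assumptions made purely for illustration.

```python
# Hypothetical sketch of the retry-delay decision discussed in this thread.
# None of these names come from the haproxy source.

# Assumed subset of hash-based balance algorithms (illustrative only).
HASH_ALGORITHMS = {"source", "uri", "url_param"}


def retry_delay_ms(server_count, balance, delay_setting_ms=1000):
    """Return how long to wait (in ms) before retrying a failed connect.

    delay_setting_ms is the proposed tunable: the default of 1000 matches
    the current fixed "1 second", and 0 means never delay.
    """
    if delay_setting_ms == 0:
        return 0

    # With several servers and a non-hash, non-"first" algorithm, the
    # redispatch is expected to land on a different server, so no delay.
    likely_other_server = (
        server_count > 1
        and balance not in HASH_ALGORITHMS
        and balance != "first"
    )
    return 0 if likely_other_server else delay_setting_ms


if __name__ == "__main__":
    print(retry_delay_ms(3, "roundrobin"))     # stateless farm
    print(retry_delay_ms(1, "roundrobin"))     # single server
    print(retry_delay_ms(3, "source", 0))      # tunable set to "no delay"
```

With the tunable left at its default, the behaviour is the same as what
Willy describes; setting it to 0 only changes the case where we are likely
to land on the same server again.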