Hi Joe,

On Mon, Oct 18, 2010 at 02:02:29PM -0700, Joe Williams wrote:
> List,
> I am experiencing a gap between when the old process stops listening and the 
> new process starts where requests fail. AFAICT this is not a new issue; rather 
> we just started to notice it with increased number of requests and we found 
> we can readily reproduce it. My understanding is that this is likely the time 
> between when the SIGTTOU is sent to the old process and the new one started. 
> This is probably milliseconds but we are definitely seeing dropped 
> connections. It doesn't seem to me that having multiple haproxy processes 
> would help in this case unless the reloads to each process are staggered.  
> Does anyone else see the same issue? Are there workarounds available?

This is indeed a known behaviour. The gap is precisely between the moment the
old process unbinds its port and the moment the new one successfully binds the
port. It can vary from a few microseconds to a few milliseconds depending
on the load on the machine and on whether other processes take the CPU
between the two steps and delay the operation.

To work around this, I'm using two kernel patches (either one is enough) :
  - the first one allows multiple processes to bind to the same port, just
    as happens after a fork. The new haproxy then binds, starts and
    finally notifies the old one that it can quit. Simple, clean, with
    zero gap ;

  - the second one avoids sending resets for SYN packets sent to
    an unbound port, i.e. during the gap. That's very efficient too, because
    the client sees nothing and simply retransmits, which is a lot cleaner
    than making it fail.

I suspect the second point could be made to work with recent Linux kernels
that support the TPROXY feature, though I've not tried it. You could possibly
write an iptables rule in the INPUT chain that drops SYN packets which do
not match any socket. Something like this :

   iptables -A INPUT -p tcp --syn \! -m socket -j DROP

In my opinion, this should only catch packets during the gap. That might be
something to try.
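
A rough sketch of how the "drop SYNs that match no socket" idea could be
spelled out, equally untested. As far as I know, iptables refuses a "!"
placed in front of "-m", so the negation is expressed here with a small helper
chain: packets that match a socket return to INPUT, everything else is
dropped. The function name install_gap_rules, the chain name HAPROXY_GAP, the
port argument and the IPT override are all made up for illustration. The
--nowildcard option is assumed to exist in the installed socket match; without
it the match is documented to ignore listeners bound to 0.0.0.0, which would
make the rule drop far too much.

```shell
#!/bin/sh
# Sketch only: names below are hypothetical, not part of haproxy or iptables.
# IPT can be overridden (for instance with a printing command) for a dry run.
IPT="${IPT:-iptables}"

install_gap_rules() {
    port="$1"                    # frontend port to protect, e.g. 80
    chain="${2:-HAPROXY_GAP}"    # helper chain name (assumption)

    # Helper chain that decides what happens to new connection attempts.
    $IPT -N "$chain" || return 1
    # Only bare SYN packets for the proxied port are examined.
    $IPT -A INPUT -p tcp --dport "$port" --syn -j "$chain" || return 1
    # A socket was found for this packet: hand it back to INPUT untouched.
    # --nowildcard is assumed to be supported (see the note above).
    $IPT -A "$chain" -m socket --nowildcard -j RETURN || return 1
    # No socket found (the reload gap): drop silently so the client
    # retransmits its SYN instead of receiving a reset.
    $IPT -A "$chain" -j DROP || return 1
}
```

Running "install_gap_rules 80" as root would install the four rules, while
pointing IPT at a command that only prints its arguments lets you review them
first. Note that a SYN sent to the port while haproxy is really down would be
dropped as well, so clients would time out instead of failing fast.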

