Hello!

I am using a TCP front-end that can keep connections open for several 
hours, while also issuing frequent reloads because an id-to-server 
mapping changes constantly. This means many HAProxy processes are 
running at any given time, which generally works as expected. However, 
after a while I see some strange behavior with the processes and stats 
that doesn't appear to have any pattern to it.

Here is the setup in general:

Every two minutes, there is a process that checks if HAProxy should be 
reloaded. If that is the case, this command is run:

/usr/local/sbin/haproxy -D -f CONFIG -sf PID

The PID is that of the current HAProxy process. If there are TCP 
connections to that process, it stays running until those connections 
drop, and then it is generally killed.
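For reference, the two-minute check looks roughly like the sketch below. The 
paths, the change-detection step (comparing against a saved copy of the 
config), and the HAPROXY variable are simplifications, not my exact script:

```shell
# Sketch of the periodic reload check (run from cron every two minutes).
# HAPROXY is overridable here only to make the sketch easy to dry-run.
HAPROXY=${HAPROXY:-/usr/local/sbin/haproxy}

reload_if_changed() {
    config="$1"   # live config file (regenerated from the id-to-server mapping)
    last="$2"     # copy of the config the running process was started with
    pidfile="$3"  # pid of the current haproxy process

    # Only reload when the mapping (and thus the config) has changed.
    if ! cmp -s "$config" "$last"; then
        # -sf asks the old PID to finish its connections, then exit.
        "$HAPROXY" -D -f "$config" -sf "$(cat "$pidfile")"
        cp "$config" "$last"
    fi
}
```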

1. Sometimes a process appears to not get killed even though it has no 
connections. It will run for several hours at 99% CPU. When straced, it 
doesn't appear to be actually doing anything -- just clock and poll 
calls, very frequently. Is there some sort of timeout for the graceful 
shutdown of the old processes?

2. Is it possible for the old processes to accept new connections? Even though 
a pid has been sent the shutdown signal, I have seen requests reference old 
server mappings that would only have existed in an earlier process.

3. The stats page often gets out of whack over time: the number of 
requests per second becomes drastically different from what is actually 
occurring. Could the old stuck processes be reporting stats that are 
not getting cleared?

Are there any considerations for starting up or reloading when dealing with 
long running connections?

Thanks!

David Pean
