On Thu, Dec 6, 2012 at 10:10 AM, Bryan Berry <[email protected]> wrote:
> On Wed, Dec 5, 2012 at 10:58 PM, Willy Tarreau <[email protected]> wrote:
>>
>> Hi Bryan,
>>
> Thanks a lot for your help Willy, I really appreciate. And for haproxy. It
> is a fantastic tool.
>
>>
>> On Wed, Dec 05, 2012 at 04:22:45PM +0100, Bryan Berry wrote:
>>
>> Does this stay that way for a long time ? I mean, could it be something
>> like a health check not getting a response (eg: just a few seconds) or
>> does that seem to match your client/server timeout (500s in your case) ?
>
>
> It does stay high, here is a graph of cpu performance over the last 24
> hours, the left-hand side are % of CPU time
> https://docs.google.com/open?id=0BzPvBvLIIq7NV0QtTkliM3Yxenc
>
> The high cpu usage doesn't appear to correlate to any HTTP 500 status codes
> and I wouldn't expect it to since it seems related to the TCP mode proxying
> of our databases.
>
>
>>
>> Could you please add "level admin" on your stats socket, restart and issue
>> a "show sess all" on the stats socket when the issue happens, and capture
>> the output. It will help *a lot*. The best way to do it is to redirect it
>> to a file, for example like this :
>>
>>     echo "show sess all" | socat stdio /var/run/haproxy.sock > show-sess.out
>
>
> done
>
> https://docs.google.com/document/d/1A3qEq0RmlAtG-fzKJDbZgB0pvmYJnlUuJ0T2IrpBGGg/edit
>
> Here are the IP addresses of the database backend servers. Note they are not
> the originals but have been munged to protect the innocent.
>
> 168.100.2.181, 168.100.2.237, 168.100.2.195, 168.100.2.183
>
> just by playing w/ strace, it looks like the following function is being
> called over and over again with a value of 0 for wait_time
>
> status = epoll_wait(0, {}, 26, 0)
>
> Line 133, ev_epoll.c
>
> hope this helps! thanks again for your assistance
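
For reference on the strace output above: a timeout of 0 makes epoll_wait() return immediately even when no descriptor is ready, so a loop that keeps passing 0 never sleeps and burns CPU. Below is a minimal Python sketch of that effect on Linux. It is not haproxy code, the helper name count_polls and the idle pipe are made up for illustration, and it says nothing about why haproxy ends up computing a zero timeout in this case.

```python
import os
import select
import time


def count_polls(timeout_s, duration_s=0.1):
    """Count how many times epoll.poll() returns within duration_s
    while the only registered descriptor never becomes readable."""
    r, w = os.pipe()              # idle pipe: nothing is ever written to it
    ep = select.epoll()           # Linux-only wrapper around epoll_create()
    ep.register(r, select.EPOLLIN)
    calls = 0
    deadline = time.monotonic() + duration_s
    try:
        while time.monotonic() < deadline:
            # timeout is in seconds here; 0 means "do not block at all"
            events = ep.poll(timeout_s)
            assert events == []   # nothing is ready, as in the strace line
            calls += 1
    finally:
        ep.close()
        os.close(r)
        os.close(w)
    return calls


if __name__ == "__main__":
    print("calls with timeout 0    :", count_polls(0))
    print("calls with timeout 50 ms:", count_polls(0.05))
```

With a zero timeout the call count is bounded only by CPU speed, while with a 50 ms timeout the loop wakes up just a couple of times in the same interval.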
Hi Willy,

I got the same issue at a customer yesterday with long-term TCP
connections (Exchange 2010 load-balancing).
There were roughly 6 open connections at that time on the LB.

cheers

