On Wed, Dec 5, 2012 at 10:58 PM, Willy Tarreau <[email protected]> wrote:

> Hi Bryan,
>
Thanks a lot for your help, Willy, I really appreciate it. And thanks for
haproxy, it is a fantastic tool.


> On Wed, Dec 05, 2012 at 04:22:45PM +0100, Bryan Berry wrote:
> Does this stay that way for a long time? I mean, could it be something
> like a health check not getting a response (eg: just a few seconds) or
> does that seem to match your client/server timeout (500s in your case)?
>

It does stay high. Here is a graph of CPU usage over the last 24 hours;
the left-hand axis shows % of CPU time:
https://docs.google.com/open?id=0BzPvBvLIIq7NV0QtTkliM3Yxenc

The high CPU usage doesn't appear to correlate with any HTTP 500 status
codes, and I wouldn't expect it to, since it seems related to the TCP-mode
proxying of our databases.



> Could you please add "level admin" on your stats socket, restart and issue
> a "show sess all" on the stats socket when the issue happens, and capture
> the output. It will help *a lot*. The best way to do it is to redirect it
> to a file, for example like this :
>
>    echo "show sess all" | socat stdio /var/run/haproxy.sock > show-sess.out
>

done

https://docs.google.com/document/d/1A3qEq0RmlAtG-fzKJDbZgB0pvmYJnlUuJ0T2IrpBGGg/edit

Here are the IP addresses of the database backend servers. Note they are
not the originals but have been munged to protect the innocent.

168.100.2.181, 168.100.2.237, 168.100.2.195, 168.100.2.183

Just by playing with strace, it looks like the following call is being made
over and over again with a value of 0 for wait_time:

status = epoll_wait(0, {}, 26, 0)

Line 133, ev_epoll.c

hope this helps! thanks again for your assistance
