Hi Patrick,
On Tue, Jan 17, 2017 at 02:33:44AM +0000, Patrick Hemmer wrote:
> So on one of my local development machines haproxy started pegging the
> CPU at 100%
> `strace -T` on the process just shows:
>
> ...
> epoll_wait(0, {}, 200, 0) = 0 <0.000003>
> epoll_wait(0, {}, 200, 0) = 0 <0.000003>
> epoll_wait(0, {}, 200, 0) = 0 <0.000003>
> epoll_wait(0, {}, 200, 0) = 0 <0.000003>
> epoll_wait(0, {}, 200, 0) = 0 <0.000003>
> epoll_wait(0, {}, 200, 0) = 0 <0.000003>
> ...
Hmm, not good.
> Opening it up with gdb, the backtrace shows:
>
> (gdb) bt
> #0 0x00007f4d18ba82a3 in __epoll_wait_nocancel () from /lib64/libc.so.6
> #1 0x00007f4d1a570ebc in _do_poll (p=<optimized out>, exp=-1440976915) at src/ev_epoll.c:125
> #2 0x00007f4d1a4d3098 in run_poll_loop () at src/haproxy.c:1737
> #3 0x00007f4d1a4cf2c0 in main (argc=<optimized out>, argv=<optimized out>) at src/haproxy.c:2097
OK, so an event is not being processed correctly.
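To illustrate what that strace pattern means, here is a toy sketch in
Python. It is not haproxy's code, and count_polls is a made-up helper used
only for this illustration. epoll_wait() called with a timeout of 0 returns
immediately even when nothing is ready, so a loop that keeps asking for a
zero timeout spins on the CPU instead of sleeping:

```python
import select
import time


def count_polls(timeout_s, duration_s=0.05):
    """Poll an empty epoll set in a loop for duration_s seconds.

    Returns the number of poll() calls made. Hypothetical helper for
    illustration only; Linux-only because it relies on select.epoll.
    """
    ep = select.epoll()
    try:
        calls = 0
        deadline = time.monotonic() + duration_s
        while time.monotonic() < deadline:
            # Nothing is registered, so this always returns an empty list,
            # like the "epoll_wait(0, {}, 200, 0) = 0" lines in the trace.
            ep.poll(timeout_s, 200)
            calls += 1
        return calls
    finally:
        ep.close()


if __name__ == "__main__":
    # Zero timeout: returns at once, so the loop burns CPU.
    print("timeout=0     ->", count_polls(0), "calls")
    # 10 ms timeout: the process sleeps in the kernel between calls.
    print("timeout=10 ms ->", count_polls(0.01), "calls")
```

On an idle machine the first count should be far larger than the second,
which is the difference between a busy loop and a process that actually
sleeps while waiting for events.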
> This is haproxy 1.7.0 on CentOS/7
Ah, that could be a clue. We've had 2 or 3 very ugly bugs in 1.7.0
and 1.7.1. One of them is responsible for the few outages on haproxy.org
(the last one happened today, I left it running to get a core to confirm).
Another is an issue with the condition used to wake up an applet after it
first failed to get a buffer, and it could be what you're seeing. The
other ones could possibly cause some memory corruption, which can result
in just about anything.
So I'd strongly urge you to update this machine to 1.7.2 (which I'm going
to do on haproxy.org as well, now that I have a core). Please keep
monitoring it afterwards, but I'd feel much safer with this update.
Thanks for your report!
Willy