Hi Henry,

That sounds like a very serious bug indeed.


I suggest you take a traffic capture with a ring buffer and
capture the exact frontend traffic.


You can do this with "dumpcap", for example. Something like this
should do it (if your frontend traffic is on TCP port 80):

dumpcap -i eth0 -p -s0 -b duration:600 -b files:10 -f \
"tcp port 80" -w /root/tcp-port80-traffic.cap &


dumpcap is part of the wireshark or tshark package in most
distributions. tcpdump afaik can only form a ring buffer by file
size (-C together with -W), not by duration.
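
To see how far back such a ring buffer reaches, here is a small sketch
of the arithmetic for the options used above (-b duration:600 and
-b files:10). The helper name retention_minutes is made up for
illustration and is not part of dumpcap:

```shell
#!/bin/sh
# Hypothetical helper (not a dumpcap feature): upper bound, in minutes,
# on the traffic history a dumpcap ring buffer keeps.
#   $1 = seconds per file   (the value given to -b duration:)
#   $2 = number of files    (the value given to -b files:)
retention_minutes() {
    echo $(( $1 * $2 / 60 ))
}

# Values from the dumpcap command above: 600 s per file, 10 files.
retention_minutes 600 10
```

With these values the ring buffer holds at most 100 minutes of traffic,
so the capture should be stopped (or the files copied away) well within
that window after the hang, before the interesting files are
overwritten.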


Once you have captured the traffic from the time haproxy hit the
100% CPU condition, please send it only to
Willy Tarreau <[email protected]>
and do not send it to the mailing list.



Thanks!





________________________________
> From: [email protected]
> To: [email protected]
> Subject: haproxy hit 100% CPU
> Date: Fri, 12 Apr 2013 05:06:31 +0000
>
> Hi,
>
> We have been using haproxy for a couple of years and find it very stable.
> However, last week our primary haproxy hit 100% user CPU and then
> stopped responding to any requests, which took our web sites down
> completely. At the time we were running haproxy 1.4.10. We upgraded
> to 1.4.23 immediately, but two days later the 100% user CPU condition
> occurred again. We then upgraded to 1.5 dev 18, but today the same
> thing happened on 1.5 dev 18.
>
> The haproxy configuration had not changed for over half a year when
> this happened, so we do not think a configuration change triggered
> it. We suspect that specific traffic caused the issue.
>
> We also don't think it is a hardware-specific issue: when we
> switched the web traffic to the backup haproxy server, the hang
> occurred again there, and then on a third backup haproxy server,
> after only a couple of minutes of running.
>
> So far the troubleshooting steps we've taken are:
>
> 1) Checked all Linux logs for anything wrong with the system, but
> found nothing suspicious (CPU, memory, hard disk, ports, etc.).
>
> 2) Tried to dump session information through 'echo "show sess all" |
> socat /var/run/haproxy.stat stdio >
> /var/log/haproxy-session.log'. However, it returned a zero-byte file.
> When haproxy runs normally, the same command usually generates a log
> file of over 150K in size.
> 3) Tried to trace what the haproxy process was doing through "strace -c -p
> $(pid of haproxy)". However, it returned nothing as well.
> 4) Used GDB to step through the haproxy process and found that haproxy
> loops through the following code endlessly. For details, please see
> the attached file GDB_haproxy.txt.
>
> 444 in ebtree/ebtree.h
> 327 in src/lb_chash.c
> 330 in src/lb_chash.c
> 340 in src/lb_chash.c
> 341 in src/lb_chash.c
> 44 in src/queue.c
> 46 in src/queue.c
> 53 in src/queue.c
> 61 in src/queue.c
> 349 in src/lb_chash.c
> 325 in src/lb_chash.c
> 326 in src/lb_chash.c
> 326 in src/lb_chash.c
> 551 in ebtree/ebtree.h
> 553 in ebtree/ebtree.h
> 558 in ebtree/ebtree.h
> 559 in ebtree/ebtree.h
>
> The make command we used to build haproxy 1.4.10, 1.4.23 and 1.5 dev 18
> is "make TARGET=linux2628 CPU=native USE_PCRE=1 USE_OPENSSL=1
> USE_ZLIB=1".
>
> This issue looks like an haproxy bug. If anyone could take a look and
> provide some workaround or fix, your effort will be highly appreciated.
>
> Thanks,
> -Henry
>
>
>
>                                         
