Hi Henry,

That sounds like a very serious bug indeed. I suggest you take a traffic
capture with a ring buffer so that you record the exact frontend traffic.
You can do this with "dumpcap", for example. Something like this should do
it (if your frontend traffic is on TCP port 80):

  dumpcap -i eth0 -p -s0 -b duration:600 -b files:10 -f \
    "tcp port 80" -w /root/tcp-port80-traffic.cap &

With these -b options dumpcap switches to a new file every 600 seconds and
starts overwriting the oldest one after 10 files, so roughly the last 100
minutes of traffic stay on disk.

dumpcap is part of the wireshark or tshark package in most distributions.
tcpdump afaik doesn't support ring buffers.

Once you have captured the traffic from the moment haproxy hit the 100%
condition, please send it only to Willy Tarreau <[email protected]> and do
not send it to the mailing list.

Thanks!

________________________________
> From: [email protected]
> To: [email protected]
> Subject: haproxy hit 100% CPU
> Date: Fri, 12 Apr 2013 05:06:31 +0000
>
> Hi,
>
> We have been using haproxy for couple of years and find it very stable.
> However last week our primary haproxy hit 100% user CPU and then
> stopped responding to any requests. It led to completely down of our
> web sites. When that happened, we were using haproxy 1.4.10. Then we
> upgraded to 1.4.23 immediately, but two days later, the 100% user CPU
> occurred again. Then we upgraded to 1.5 dev 18, but today, the 100% CPU
> occurred on 1.5 dev 18.
>
> When all these happened, the haproxy configuration hasn't changed for
> over half a year. So we think this is not triggered by configuration
> change, and suspected specific traffic caused the issue.
>
> Also we don't think it's hardware specific issue, because when we
> switch the web traffic to backup haproxy server, the hang
> occurred again on the backup haproxy server and third backup haproxy
> server only after couple of minutes running.
>
> So far the troubleshooting steps we've taken are:
>
> 1) Checked all linux log to find anything wrong with the linux system.
> But we didn't find anything, CPU, Memory, harddisk, port,
> etc., suspicious.
>
> 2) Tried to dump session information though 'echo "show sess all" |
> socat /var/run/haproxy.stat stdio'>
> /var/log/haproxy-session.log. However it returns a zero byte file. When
> haproxy ran normally, the same command usually generates a log file of
> over 150K in size.
> 3) Tried to trace what haproxy process is doing though "strace -c -p
> $(pid of haproxy)". However it returns nothing as well.
> 4) Used GDB to step though the haproxy process, and find the haproxy is
> loop though the following code endlessly. For detail, please see
> attached file GDB_haproxy.txt.
>
> 444 in ebtree/ebtree.h
> 327 in src/lb_chash.c
> 330 in src/lb_chash.c
> 340 in src/lb_chash.c
> 341 in src/lb_chash.c
> 44 in src/queue.c
> 46 in src/queue.c
> 53 in src/queue.c
> 61 in src/queue.c
> 349 in src/lb_chash.c
> 325 in src/lb_chash.c
> 326 in src/lb_chash.c
> 326 in src/lb_chash.c
> 551 in ebtree/ebtree.h
> 553 in ebtree/ebtree.h
> 558 in ebtree/ebtree.h
> 559 in ebtree/ebtree.h
>
> The make command we used to build haproxy 1.4.10, 1.4.23 and 1.5 dev 18
> is "make TARGET=linux2628 CPU=native USE_PCRE=1 USE_OPENSSL=1
> USE_ZLIB=1".
>
> This issue looks like an haproxy bug. If anyone could take a look and
> provide some workaround or fix, your effort will be highly appreciated.
>
> Thanks,
> -Henry
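
Regarding your step 4: if the hang happens again, a few backtraces taken some seconds apart might help to show whether the process is always stuck in the same loop. Below is only a minimal sketch, assuming gdb is installed and haproxy was built with debug symbols. The function name sample_backtraces and the output path in the usage line are made up for illustration and are not part of haproxy or gdb.

```shell
#!/bin/sh
# Sketch only: collect several backtraces of a spinning process.
# sample_backtraces is a hypothetical helper name, not an haproxy or gdb tool.
sample_backtraces() {
    pid=$1      # process ID to attach to
    count=$2    # number of samples to take
    out=$3      # file that receives all samples

    : > "$out"  # start with an empty output file
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== sample $i ===" >> "$out"
        # -batch makes gdb exit after the -ex commands; -p attaches to the PID
        gdb -batch -p "$pid" -ex "bt full" >> "$out" 2>&1
        i=$((i + 1))
        sleep 1  # short pause so the samples are spread over time
    done
}
```

It could be called as: sample_backtraces "$(pidof haproxy)" 5 /root/haproxy-bt.log (this assumes a single haproxy process; with several processes, pass the PID of the one that is spinning). The process is stopped briefly while gdb is attached, so this only makes sense once it is already hanging.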

