Hi,

We have been using haproxy for a couple of years and have found it very stable. 
However, last week our primary haproxy hit 100% user CPU and then stopped 
responding to any requests, which took our web sites completely down. At that 
time we were running haproxy 1.4.10. We upgraded to 1.4.23 immediately, but two 
days later the 100% user CPU condition occurred again. We then upgraded to 
1.5-dev18, but today the same 100% CPU hang occurred on 1.5-dev18.

The haproxy configuration had not changed for over half a year before any of 
these incidents, so we do not think a configuration change triggered them. We 
suspect that specific traffic caused the issue.

We also do not think it is a hardware-specific issue, because when we switched 
the web traffic to the backup haproxy server, the hang occurred again there, 
and then on a third backup haproxy server, after only a couple of minutes of 
running.

So far the troubleshooting steps we've taken are:

1) Checked all Linux logs for anything wrong with the system. We found nothing 
suspicious regarding CPU, memory, hard disk, ports, etc.

2) Tried to dump session information through 'echo "show sess all" | socat 
/var/run/haproxy.stat stdio > /var/log/haproxy-session.log'. However, it 
returned a zero-byte file. When haproxy runs normally, the same command usually 
generates a file of over 150K in size.

3) Tried to trace what the haproxy process was doing through "strace -c -p 
$(pid of haproxy)". However, it returned nothing as well.

4) Used GDB to step through the haproxy process, and found that haproxy was 
looping through the following code endlessly. For details, please see the 
attached file GDB_haproxy.txt.

444    in ebtree/ebtree.h
327    in src/lb_chash.c
330    in src/lb_chash.c
340    in src/lb_chash.c
341    in src/lb_chash.c
44      in src/queue.c
46      in src/queue.c
53      in src/queue.c
61      in src/queue.c
349    in src/lb_chash.c
325    in src/lb_chash.c
326    in src/lb_chash.c
326    in src/lb_chash.c
551    in ebtree/ebtree.h
553    in ebtree/ebtree.h
558    in ebtree/ebtree.h
559    in ebtree/ebtree.h
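
For anyone who wants to confirm the same loop without stepping by hand, the sketch below samples a few backtraces non-interactively. It assumes gdb is installed and that the PID of the spinning haproxy process is known. The function name sample_backtraces and the GDB override variable are illustrative names only, not part of gdb or haproxy.

```shell
#!/bin/sh
# Sketch: take several backtraces of a running process, one per interval.
# GDB can be overridden (hypothetical knob) to point at another debugger
# binary or at a stub when trying the function out.
GDB=${GDB:-gdb}

# usage: sample_backtraces PID [COUNT] [DELAY_SECONDS]
sample_backtraces() {
    pid=$1
    count=${2:-5}
    delay=${3:-1}
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== sample $i ==="
        # -batch: exit after running the commands; -ex bt: print a backtrace;
        # -p: attach to the given PID (gdb detaches again when it exits).
        "$GDB" -batch -ex bt -p "$pid" 2>/dev/null
        i=$((i + 1))
        if [ "$i" -le "$count" ]; then
            sleep "$delay"
        fi
    done
    return 0
}
```

It could be called as sample_backtraces <pid> 5 1 with the output redirected to a file. If every sample shows chash_get_next_server on the stack, that would be consistent with the endless loop listed above.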

The make command we used to build haproxy 1.4.10, 1.4.23 and 1.5-dev18 was 
"make TARGET=linux2628 CPU=native USE_PCRE=1 USE_OPENSSL=1 USE_ZLIB=1".

This issue looks like an haproxy bug. If anyone could take a look and suggest 
a workaround or fix, it would be highly appreciated.

Thanks,
-Henry




GNU gdb (GDB) Red Hat Enterprise Linux (7.2-50.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/haproxy...done.
Attaching to program: /usr/sbin/haproxy, process 11673
Reading symbols from /lib64/libcrypt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libcrypt.so.1
Reading symbols from /lib64/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libz.so.1
Reading symbols from /usr/lib64/libssl.so.10...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libssl.so.10
Reading symbols from /usr/lib64/libcrypto.so.10...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libcrypto.so.10
Reading symbols from /usr/lib64/libpcreposix.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libpcreposix.so.0
Reading symbols from /lib64/libpcre.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib64/libpcre.so.0
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/libfreebl3.so...(no debugging symbols found)...done.
Loaded symbols for /lib64/libfreebl3.so
Reading symbols from /lib64/libgssapi_krb5.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libgssapi_krb5.so.2
Reading symbols from /lib64/libkrb5.so.3...(no debugging symbols found)...done.
Loaded symbols for /lib64/libkrb5.so.3
Reading symbols from /lib64/libcom_err.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libcom_err.so.2
Reading symbols from /lib64/libk5crypto.so.3...(no debugging symbols found)...done.
Loaded symbols for /lib64/libk5crypto.so.3
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libkrb5support.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib64/libkrb5support.so.0
Reading symbols from /lib64/libkeyutils.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libkeyutils.so.1
Reading symbols from /lib64/libresolv.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libresolv.so.2
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libselinux.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libselinux.so.1
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_files.so.2
eb_walk_down (p=0x1ed57a0, srvtoavoid=<value optimized out>)
    at ebtree/ebtree.h:444
444     ebtree/ebtree.h: No such file or directory.
        in ebtree/ebtree.h
Missing separate debuginfos, use: debuginfo-install haproxy-1.4.19-1.el6.x86_64
(gdb) bt
#0  eb_walk_down (p=0x1ed57a0, srvtoavoid=<value optimized out>)
    at ebtree/ebtree.h:444
#1  eb_next (p=0x1ed57a0, srvtoavoid=<value optimized out>)
    at ebtree/ebtree.h:561
#2  eb32_next (p=0x1ed57a0, srvtoavoid=<value optimized out>)
    at ebtree/eb32tree.h:68
#3  chash_get_next_server (p=0x1ed57a0, srvtoavoid=<value optimized out>)
    at src/lb_chash.c:326
#4  0x000000000043ffab in assign_server (s=0x5885c00) at src/backend.c:615
#5  0x0000000000440148 in assign_server_and_queue (s=0x5885c00)
    at src/backend.c:791
#6  0x0000000000440291 in srv_redispatch_connect (t=0x5885c00)
    at src/backend.c:1030
#7  0x0000000000456894 in sess_prepare_conn_req (t=0x5886310)
    at src/session.c:1180
#8  process_session (t=0x5886310) at src/session.c:2198
#9  0x000000000040dab0 in process_runnable_tasks (next=0x7fff13db2c6c)
    at src/task.c:238
#10 0x0000000000404dd0 in run_poll_loop () at src/haproxy.c:1210
#11 0x0000000000407073 in main (argc=<value optimized out>,
    argv=<value optimized out>) at src/haproxy.c:1541
(gdb)
eb_next (p=0x1ed57a0, srvtoavoid=<value optimized out>) at ebtree/ebtree.h:561
561     in ebtree/ebtree.h
(gdb)
eb_walk_down (p=0x1ed57a0, srvtoavoid=<value optimized out>)
    at ebtree/ebtree.h:445
445     in ebtree/ebtree.h
(gdb)
444     in ebtree/ebtree.h
(gdb)
chash_get_next_server (p=0x1ed57a0, srvtoavoid=<value optimized out>)
    at src/lb_chash.c:327
327     src/lb_chash.c: No such file or directory.
        in src/lb_chash.c
(gdb)
330     in src/lb_chash.c
(gdb)
340     in src/lb_chash.c
(gdb)
341     in src/lb_chash.c
(gdb)
srv_dynamic_maxconn (s=0x1ef1180) at src/queue.c:44
44      src/queue.c: No such file or directory.
        in src/queue.c
(gdb)
46      in src/queue.c
(gdb)
53      in src/queue.c
(gdb)
61      in src/queue.c
(gdb)
chash_get_next_server (p=0x1ed57a0, srvtoavoid=<value optimized out>)
    at src/lb_chash.c:349
349     src/lb_chash.c: No such file or directory.
        in src/lb_chash.c
(gdb)
325     in src/lb_chash.c
(gdb)
326     in src/lb_chash.c
(gdb)
eb32_next (p=0x1ed57a0, srvtoavoid=<value optimized out>) at src/lb_chash.c:326
326     in src/lb_chash.c
(gdb)
eb_next (p=0x1ed57a0, srvtoavoid=<value optimized out>) at ebtree/ebtree.h:551
551     ebtree/ebtree.h: No such file or directory.
        in ebtree/ebtree.h
(gdb)
553     in ebtree/ebtree.h
(gdb)
558     in ebtree/ebtree.h
(gdb)
559     in ebtree/ebtree.h
(gdb)
eb_walk_down (p=0x1ed57a0, srvtoavoid=<value optimized out>)
    at ebtree/ebtree.h:444
444     in ebtree/ebtree.h
(gdb)
chash_get_next_server (p=0x1ed57a0, srvtoavoid=<value optimized out>)
    at src/lb_chash.c:327
327     src/lb_chash.c: No such file or directory.
        in src/lb_chash.c
(gdb)
330     in src/lb_chash.c
(gdb)
340     in src/lb_chash.c
(gdb)
341     in src/lb_chash.c
(gdb)
srv_dynamic_maxconn (s=0x1ee0260) at src/queue.c:44
44      src/queue.c: No such file or directory.
        in src/queue.c
(gdb)
46      in src/queue.c
(gdb)
53      in src/queue.c
(gdb)
61      in src/queue.c
(gdb)
chash_get_next_server (p=0x1ed57a0, srvtoavoid=<value optimized out>)
    at src/lb_chash.c:349
349     src/lb_chash.c: No such file or directory.
        in src/lb_chash.c
(gdb)
325     in src/lb_chash.c
(gdb)
(gdb)
326     in src/lb_chash.c
(gdb)
eb32_next (p=0x1ed57a0, srvtoavoid=<value optimized out>) at src/lb_chash.c:326
326     in src/lb_chash.c
(gdb)
eb_next (p=0x1ed57a0, srvtoavoid=<value optimized out>) at ebtree/ebtree.h:551
551     ebtree/ebtree.h: No such file or directory.
        in ebtree/ebtree.h
(gdb)
553     in ebtree/ebtree.h
(gdb)
555     in ebtree/ebtree.h
(gdb)
553     in ebtree/ebtree.h
(gdb)
558     in ebtree/ebtree.h
(gdb)
559     in ebtree/ebtree.h
(gdb)
eb_walk_down (p=0x1ed57a0, srvtoavoid=<value optimized out>)
    at ebtree/ebtree.h:444
444     in ebtree/ebtree.h
(gdb)
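
The stepping transcript above can be condensed into a visit count per source location, which makes the repetition easier to see. This is a minimal sketch, assuming the transcript is saved as a text file in the format shown above. The function name summarize_steps is illustrative.

```shell
#!/bin/sh
# Count how often each source location appears in a saved gdb stepping
# transcript. Handles both forms gdb prints when the sources are missing:
#   "444     in ebtree/ebtree.h"
#   "444     ebtree/ebtree.h: No such file or directory."
# usage: summarize_steps TRANSCRIPT_FILE
summarize_steps() {
    awk '
        /^[0-9]+[ \t]+in [^ \t]+$/ {
            count[$3 ":" $1]++
            next
        }
        /^[0-9]+[ \t]+[^ \t]+: No such file or directory\.$/ {
            file = $2
            sub(/:$/, "", file)      # drop the trailing colon after the path
            count[file ":" $1]++
        }
        END {
            for (loc in count) print count[loc], loc
        }
    ' "$1" | sort -k1,1nr -k2,2      # most visited first, then by location
}
```

Running summarize_steps GDB_haproxy.txt prints one "count file:line" pair per line. A small set of locations with similar counts would be consistent with a single loop being walked repeatedly.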