Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)
Confirmed on my side as well. No segfault, and no spinning CPU with the latest patch. Thanks!

Steve Ruiz
Manager - Hosting Operations
Mirth
ste...@mirth.com ste...@mirthcorp.com

On Fri, Jan 17, 2014 at 10:25 AM, Cyril Bonté cyril.bo...@free.fr wrote:

On 17/01/2014 11:14, Willy Tarreau wrote:

On Fri, Jan 17, 2014 at 11:03:51AM +0100, Willy Tarreau wrote:

On Fri, Jan 17, 2014 at 10:47:01AM +0100, Willy Tarreau wrote:

So I might have broken something in the way to count the try value, ending up with zero being selected and nothing done. Unfortunately it works fine here.

OK I can reproduce it in 32-bit now. Let's see what happens...

OK here's the fix. I'm ashamed for not having noticed this mistake during the change. I ported the raw_sock changes to ssl_sock, it was pretty straightforward but I missed the condition in the while () loop. And unfortunately, the variable happened to be non-zero in the stack, resulting in something working well for me :-/

I've pushed the fix.

Great! I didn't have time to try to fix it yesterday. Everything is working well now, we can definitely close this bug, that's a good thing ;-)

--
Cyril Bonté

--
CONFIDENTIALITY NOTICE: The information contained in this electronic transmission may be confidential. If you are not an intended recipient, be aware that any disclosure, copying, distribution or use of the information contained in this transmission is prohibited and may be unlawful. If you have received this transmission in error, please notify us by email reply and then erase it from your computer system.
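The mistake Willy describes (a loop whose condition reads a variable that is only assigned inside the loop body) can be illustrated with a small sketch. This is not the haproxy code: `copy_in_chunks`, `try_len` and `max_chunk` are hypothetical names, and the function only models a receive path that copies data in several passes.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical receive helper, not haproxy code. Copies up to `count`
 * bytes from `src` to `dst`, at most `max_chunk` bytes per pass, the
 * way a socket layer might issue several reads in a row. The caller
 * must provide at least `count` bytes in both buffers.
 */
static size_t copy_in_chunks(char *dst, const char *src,
                             size_t count, size_t max_chunk)
{
    size_t done = 0;
    size_t try_len;          /* assigned only inside the loop body */

    if (max_chunk == 0)
        return 0;

    /*
     * The condition tests `count`, which the caller initialised.
     * Writing `while (try_len)` here would read an indeterminate
     * value on the first pass: the loop might run or be skipped
     * depending on whatever happens to be on the stack.
     */
    while (count > 0) {
        try_len = count < max_chunk ? count : max_chunk;
        memcpy(dst + done, src + done, try_len);
        done  += try_len;
        count -= try_len;
    }
    return done;
}
```

With the wrong condition, a build where the stack slot happens to hold a non-zero value appears to work, while another build (for example 32-bit) may skip the loop entirely and do nothing, which matches the "works fine here" symptom reported above.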
Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)
Cyril is correct - I simply waited for a segfault, but didn't actually test through the load balancer. I'm using SSL on haproxy, and yes, when I try to hit a web page behind haproxy, CPU spins at 100% for a good while.

Steve Ruiz
Manager - Hosting Operations
Mirth
ste...@mirth.com ste...@mirthcorp.com

On Thu, Jan 16, 2014 at 1:48 PM, Cyril Bonté cyril.bo...@free.fr wrote:

Hi Willy,

On 15/01/2014 01:08, Willy Tarreau wrote:

On Tue, Jan 14, 2014 at 12:25:37PM -0800, Steve Ruiz wrote:

Patched and confirmed in our environment that this is now working / seems to have fixed the issue. Thanks!

Great, many thanks to you both guys. We've got rid of another pretty old bug, these are the ones that make me the happiest once fixed! I'm currently unpacking my laptop to push the fix so that it appears in today's snapshot. Excellent work!

I fear there is some more work to do on this patch. I made some tests on ssl and it looks to be broken since this commit :-(

The shortest configuration I could find to reproduce the issue is:

    listen test
        bind 0.0.0.0:443 ssl crt cert.pem
        mode http
        timeout server 5s
        timeout client 5s

When a request is received by haproxy, the CPU rises to 100% in an epoll_wait loop (the timeouts are here to prevent an unlimited loop).

    $ curl -k https://localhost/
    ...
    epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 200, 0) = 1
    epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 200, 0) = 1
    epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 200, 0) = 1
    epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 200, 0) = 1
    epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 200, 0) = 1
    ...

The same issue occurs when a server is declared. The same also occurs when the proxy is in clear http and a server is in https:

    listen test
        bind 0.0.0.0:80
        mode http
        timeout server 5s
        timeout client 5s
        server ssl_backend 127.0.0.1:443 ssl

    $ curl http://localhost/
    ...
    epoll_wait(3, {}, 200, 0) = 0
    epoll_wait(3, {}, 200, 0) = 0
    epoll_wait(3, {}, 200, 0) = 0
    epoll_wait(3, {}, 200, 0) = 0
    epoll_wait(3, {}, 200, 0) = 0
    ...
--
Cyril Bonté
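The first trace in this message, where EPOLLIN is reported on every call with a zero timeout, is the pattern you get when readiness is level-triggered and nothing consumes the pending data. A minimal sketch of that behaviour follows. It is an illustration only: it uses POSIX poll() on a pipe instead of epoll on a socket so that it stays portable, and `count_wakeups` is a hypothetical helper, not anything from haproxy.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Poll the read end of a pipe `rounds` times with a zero timeout.
 * One byte is written to the pipe beforehand. If `drain` is non-zero
 * the byte is read on the first wake-up; otherwise it is left pending.
 * Returns how many rounds reported POLLIN, or -1 on setup failure.
 */
static int count_wakeups(int rounds, int drain)
{
    int fds[2];
    int wakeups = 0;
    char c = 'x';

    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], &c, 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    for (int r = 0; r < rounds; r++) {
        struct pollfd pfd = { .fd = fds[0], .events = POLLIN, .revents = 0 };

        /* Timeout 0: return immediately, as in the strace output. */
        if (poll(&pfd, 1, 0) == 1 && (pfd.revents & POLLIN)) {
            wakeups++;
            /* Consuming the data is what stops the next wake-up. */
            if (drain && read(fds[0], &c, 1) != 1)
                break;
        }
    }

    close(fds[0]);
    close(fds[1]);
    return wakeups;
}
```

Calling `count_wakeups(5, 0)` reports readiness on all five rounds because the byte is never read, while `count_wakeups(5, 1)` reports it once. A read loop that is skipped by mistake leaves the event loop in the first situation.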
Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)
Patched and confirmed in our environment that this is now working / seems to have fixed the issue. Thanks!

Steve Ruiz

On Tue, Jan 14, 2014 at 3:22 AM, Willy Tarreau w...@1wt.eu wrote:

OK here's a proposed fix which addresses the API issue for both raw_sock and ssl_sock. Steve, it would be nice if you could give it a try just to confirm I didn't miss anything.

Thanks,
Willy
Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)
)
(gdb) step
223         right = buf->data + buf->size;
(gdb) step
222         if (buf->data + buf->o <= buf->p)
(gdb) step
223         right = buf->data + buf->size;
(gdb) step
222         if (buf->data + buf->o <= buf->p)
(gdb) step
227         left = buffer_wrap_add(buf, buf->p + buf->i);
(gdb) step
buffer_wrap_add (b=0x6eff90, blk=0x6d82d0 "GET /cp/testcheck.php HTTP/1.0\r\n", len=32) at include/common/buffer.h:191
191         if (ptr - buf->size >= buf->data)
(gdb) step
bo_putblk (b=0x6eff90, blk=0x6d82d0 "GET /cp/testcheck.php HTTP/1.0\r\n", len=32) at include/common/buffer.h:388
388         memcpy(b->p, blk, half);
(gdb) print b->p
$1 = 0x6effa4 "play:table-column;float:none}table td[class*=\"col-\"],table th[class*=\"col-\"]{display:table-cell;float:none}.table>thead>tr>td.active,.table>tbody>tr>td.active,.table>tfoot>tr>td.active,.table>thead>tr"...
(gdb) print blk
$2 = 0x6d82d0 "GET /cp/testcheck.php HTTP/1.0\r\n"
(gdb) print half
$3 = <value optimized out>
(gdb)
$4 = <value optimized out>
(gdb) step
384         half = buffer_contig_space(b);
(gdb) step
buffer_contig_space (b=0x6eff90, blk=0x6d82d0 "GET /cp/testcheck.php HTTP/1.0\r\n", len=32) at include/common/buffer.h:227
227         left = buffer_wrap_add(buf, buf->p + buf->i);
(gdb) step
buffer_wrap_add (b=0x6eff90, blk=0x6d82d0 "GET /cp/testcheck.php HTTP/1.0\r\n", len=32) at include/common/buffer.h:191
191         if (ptr - buf->size >= buf->data)
(gdb) step
buffer_contig_space (b=0x6eff90, blk=0x6d82d0 "GET /cp/testcheck.php HTTP/1.0\r\n", len=32) at include/common/buffer.h:228
228         return right - left;
(gdb) step
bo_putblk (b=0x6eff90, blk=0x6d82d0 "GET /cp/testcheck.php HTTP/1.0\r\n", len=32) at include/common/buffer.h:388
388         memcpy(b->p, blk, half);
(gdb) print b->p
$5 = 0x6effa4 "play:table-column;float:none}table td[class*=\"col-\"],table th[class*=\"col-\"]{display:table-cell;float:none}.table>thead>tr>td.active,.table>tbody>tr>td.active,.table>tfoot>tr>td.active,.table>thead>tr"...
(gdb) print blk
$6 = 0x6d82d0 "GET /cp/testcheck.php HTTP/1.0\r\n"
(gdb) print half
$7 = -2119952626
(gdb)
$8 = -2,119,952,626
(gdb) step

Program received signal SIGSEGV, Segmentation fault.
0x76e22c64 in memcpy () from /lib64/libc.so.6

Steve Ruiz
Manager - Hosting Operations
Mirth
ste...@mirth.com ste...@mirthcorp.com
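The trace ends with a negative `half` being passed to memcpy(). Because memcpy() takes a size_t, a negative int is converted to a very large unsigned length, which explains the SIGSEGV. The sketch below shows how a contiguous-space calculation shaped like the one in the trace can go negative when the buffer counters are inconsistent, and how a guard keeps such a value away from memcpy(). It is a simplified model under stated assumptions: `struct ring`, `ring_contig_space` and `ring_put_checked` are hypothetical, use integer offsets rather than pointers, and are not haproxy's `struct buffer` API or the fix that was applied.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical ring buffer; field names echo the trace (size, p, i, o). */
struct ring {
    char data[64];  /* storage */
    int  size;      /* usable bytes in data[] */
    int  p;         /* offset where input data starts */
    int  i;         /* bytes of input data starting at p */
    int  o;         /* bytes of output data ending at p */
};

/*
 * Contiguous free bytes after the input data. Mirrors the shape of the
 * computation seen in the trace; the result is negative when the
 * counters are inconsistent (for example o larger than size).
 */
static int ring_contig_space(const struct ring *b)
{
    int left, right;

    if (b->o <= b->p)
        right = b->size;
    else
        right = b->p + b->size - b->o;

    left = b->p + b->i;
    if (left >= b->size)
        left -= b->size;        /* wrap around the end of storage */

    return right - left;
}

/*
 * Copy up to `len` bytes at offset p. This sketch only handles the
 * case with no pending input (i == 0). Returns the number of bytes
 * copied, or -1 when the computed length is not strictly positive.
 */
static int ring_put_checked(struct ring *b, const char *blk, int len)
{
    int half;

    if (b->i != 0)
        return -1;

    half = ring_contig_space(b);
    if (half > len)
        half = len;
    if (half <= 0)
        return -1;              /* never hand a negative length to memcpy */

    memcpy(b->data + b->p, blk, (size_t)half);
    return half;
}
```

A guard like this only contains the damage. The inconsistent counters are the real bug and still have to be fixed at the point where the buffer is filled.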
Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)
Made those changes, and it seems to be working properly, no segfault yet after ~2 minutes of checks. Thanks!

Steve Ruiz
Manager - Hosting Operations
Mirth
ste...@mirth.com ste...@mirthcorp.com

On Fri, Jan 10, 2014 at 3:06 PM, Baptiste bed...@gmail.com wrote:

Hi Steve,

Could you give tcp-check a try and tell us if you have the same issue? In your backend, turn your httpchk-related directives into:

    option tcp-check
    tcp-check send GET\ /cp/testcheck.html\ HTTP/1.0\r\n
    tcp-check send \r\n
    tcp-check expect string good

Baptiste

On Fri, Jan 10, 2014 at 11:16 PM, Steve Ruiz ste...@mirth.com wrote:

I'm experimenting with haproxy on a centos6 VM here. I found that when I specified a health check page (option httpchk GET /url), and that page didn't exist, we have a large 404 page returned, and that causes haproxy to quickly segfault (seems like on the second try GET'ing and parsing the page). I couldn't figure out from the website where to submit a bug, so I figure I'll try here first.

Steps to reproduce:
- Set up an http backend with option httpchk and http-check expect string x. Make option httpchk point to a non-existent page
- On the backend server, set it up to serve a large 404 response (in my case, the 404 page is 186kB, as it has an inline graphic and inline css)
- Start haproxy, and wait for it to segfault

I wasn't sure exactly what was causing this at first, so I did some work to narrow it down with GDB. The variable values from gdb led me to the cause on my side, and hopefully can help you fix the issue. I could not make this work with simply a large page for the http response - in that case, it seems to work as advertised, only inspecting the response up to tune.chksize (default 16384 as I've left it). But if I do this with a 404, it seems to kill it. Let me know what additional information you need if any. Thanks and kudos for the great bit of software!

#haproxy config:
#-
# Example configuration for a possible web application. See the
# full configuration options online.
#
# http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#-
# Help in developing config here:
# https://www.twilio.com/engineering/2013/10/16/haproxy
#-
# Global settings
#-
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events. This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #    file. A line like the following can be added to
    #    /etc/sysconfig/syslog
    #
    #    local2.* /var/log/haproxy.log
    #
    log 127.0.0.1 local2 info
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon
    #enable stats
    stats socket /tmp/haproxy.sock

listen ha_stats :8088
    balance source
    mode http
    timeout client 3ms
    stats enable
    stats auth haproxystats:foobar
    stats uri /haproxy?stats

#-
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#-
defaults
    mode http
    log global
    option httplog
    option dontlognull
    #keep persistent client connection open
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    # Limit number of retries - total time trying to connect = connect timeout * (#retries + 1)
    retries 2
    timeout http-request 10s
    timeout queue 1m
    #timeout opening a tcp connection to server - should be shorter than timeout client and server
    timeout connect 3100
    timeout client 30s
    timeout server 30s
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000

#-
# main frontend which proxys to the backends
#-
frontend https_frontend
    bind :80
    redirect scheme https if !{ ssl_fc }
    #config help: https://github.com/observing/balancerbattle/blob
Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)
Thanks for the workaround + super fast response, and glad to help :).

Steve Ruiz
Manager - Hosting Operations
Mirth
ste...@mirth.com ste...@mirthcorp.com

On Fri, Jan 10, 2014 at 3:53 PM, Baptiste bed...@gmail.com wrote:

Well, let's say this is a workaround... We'll definitely have to fix the bug ;)

Baptiste

On Sat, Jan 11, 2014 at 12:24 AM, Steve Ruiz ste...@mirth.com wrote:

Made those changes, and it seems to be working properly, no segfault yet after ~2 minutes of checks. Thanks!

Steve Ruiz
Manager - Hosting Operations
Mirth
ste...@mirth.com

On Fri, Jan 10, 2014 at 3:06 PM, Baptiste bed...@gmail.com wrote:

Hi Steve,

Could you give tcp-check a try and tell us if you have the same issue? In your backend, turn your httpchk-related directives into:

    option tcp-check
    tcp-check send GET\ /cp/testcheck.html\ HTTP/1.0\r\n
    tcp-check send \r\n
    tcp-check expect string good

Baptiste

On Fri, Jan 10, 2014 at 11:16 PM, Steve Ruiz ste...@mirth.com wrote:

I'm experimenting with haproxy on a centos6 VM here. I found that when I specified a health check page (option httpchk GET /url), and that page didn't exist, we have a large 404 page returned, and that causes haproxy to quickly segfault (seems like on the second try GET'ing and parsing the page). I couldn't figure out from the website where to submit a bug, so I figure I'll try here first.

Steps to reproduce:
- Set up an http backend with option httpchk and http-check expect string x. Make option httpchk point to a non-existent page
- On the backend server, set it up to serve a large 404 response (in my case, the 404 page is 186kB, as it has an inline graphic and inline css)
- Start haproxy, and wait for it to segfault

I wasn't sure exactly what was causing this at first, so I did some work to narrow it down with GDB. The variable values from gdb led me to the cause on my side, and hopefully can help you fix the issue.