Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)

2014-01-17 Thread Steve Ruiz
Confirmed on my side as well. No segfault, and no spinning CPU with the
latest patch.

thanks!

Steve Ruiz
Manager - Hosting Operations
Mirth
ste...@mirth.com ste...@mirthcorp.com


On Fri, Jan 17, 2014 at 10:25 AM, Cyril Bonté cyril.bo...@free.fr wrote:

 On 17/01/2014 11:14, Willy Tarreau wrote:

  On Fri, Jan 17, 2014 at 11:03:51AM +0100, Willy Tarreau wrote:

 On Fri, Jan 17, 2014 at 10:47:01AM +0100, Willy Tarreau wrote:

 So I might have broken something in the way the try value is computed,
 ending up with zero being selected and nothing done. Unfortunately, it
 works fine here.


 OK I can reproduce it in 32-bit now. Let's see what happens...


 OK here's the fix. I'm ashamed for not having noticed this mistake during
 the change. I ported the raw_sock changes to ssl_sock; it was pretty
 straightforward, but I missed the condition in the while () loop. And
 unfortunately, the variable happened to be non-zero on the stack,
 resulting in something that worked well for me :-/

 I've pushed the fix.


 Great! I didn't have time to try to fix it yesterday.
 Everything is working well now, we can definitely close this bug, that's a
 good thing ;-)

 --
 Cyril Bonté


-- 
CONFIDENTIALITY NOTICE: The information contained in this electronic 
transmission may be confidential. If you are not an intended recipient, be 
aware that any disclosure, copying, distribution or use of the information 
contained in this transmission is prohibited and may be unlawful. If you 
have received this transmission in error, please notify us by email reply 
and then erase it from your computer system.


Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)

2014-01-16 Thread Steve Ruiz
Cyril is correct - I simply waited for a segfault, but didn't actually test
through the load balancer. I'm using SSL on haproxy, and yes, when I try to
hit a web page behind haproxy, CPU spins at 100% for a good while.

Steve Ruiz
Manager - Hosting Operations
Mirth
ste...@mirth.com ste...@mirthcorp.com


On Thu, Jan 16, 2014 at 1:48 PM, Cyril Bonté cyril.bo...@free.fr wrote:

 Hi Willy,

 On 15/01/2014 01:08, Willy Tarreau wrote:

  On Tue, Jan 14, 2014 at 12:25:37PM -0800, Steve Ruiz wrote:

 Patched and confirmed in our environment that this is now working / seems
 to have fixed the issue. Thanks!


 Great, many thanks to both of you. We've got rid of another pretty
 old bug; these are the ones that make me the happiest once fixed!

 I'm currently unpacking my laptop to push the fix so that it appears
 in today's snapshot.

 Excellent work!


 I fear there is some more work to do on this patch.
 I ran some tests with SSL and it seems to be broken since this commit :-(

 The shortest configuration I could find to reproduce the issue is:
   listen test
 bind 0.0.0.0:443 ssl crt cert.pem
 mode http
 timeout server 5s
 timeout client 5s

 When a request is received by haproxy, the CPU rises to 100% in an
 epoll_wait loop (the timeouts are there to prevent an unlimited loop).

 $ curl -k https://localhost/
 ...
 epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 200, 0) = 1
 epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 200, 0) = 1
 epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 200, 0) = 1
 epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 200, 0) = 1
 epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 200, 0) = 1
 ...
 The same issue occurs when a server is declared.

 The same also occurs when the proxy is in clear HTTP and the server is in
 HTTPS:
   listen test
 bind 0.0.0.0:80
 mode http
 timeout server 5s
 timeout client 5s
 server ssl_backend 127.0.0.1:443 ssl

 $ curl http://localhost/
 ...
 epoll_wait(3, {}, 200, 0)   = 0
 epoll_wait(3, {}, 200, 0)   = 0
 epoll_wait(3, {}, 200, 0)   = 0
 epoll_wait(3, {}, 200, 0)   = 0
 epoll_wait(3, {}, 200, 0)   = 0
 ...


 --
 Cyril Bonté




Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)

2014-01-14 Thread Steve Ruiz
Patched and confirmed in our environment that this is now working / seems
to have fixed the issue. Thanks!

Steve Ruiz


On Tue, Jan 14, 2014 at 3:22 AM, Willy Tarreau w...@1wt.eu wrote:

 OK here's a proposed fix which addresses the API issue for both
 raw_sock and ssl_sock.

 Steve, it would be nice if you could give it a try just to confirm
 I didn't miss anything.

 Thanks,
 Willy





Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)

2014-01-10 Thread Steve Ruiz
(gdb) step
223 right = buf->data + buf->size;
(gdb) step
222 if (buf->data + buf->o <= buf->p)
(gdb) step
223 right = buf->data + buf->size;
(gdb) step
222 if (buf->data + buf->o <= buf->p)
(gdb) step
227 left = buffer_wrap_add(buf, buf->p + buf->i);
(gdb) step
buffer_wrap_add (b=0x6eff90, blk=0x6d82d0 "GET /cp/testcheck.php
HTTP/1.0\r\n", len=32) at include/common/buffer.h:191
191 if (ptr - buf->size >= buf->data)
(gdb) step
bo_putblk (b=0x6eff90, blk=0x6d82d0 "GET /cp/testcheck.php HTTP/1.0\r\n",
len=32) at include/common/buffer.h:388
388 memcpy(b->p, blk, half);
(gdb) print b->p
$1 = 0x6effa4 "play:table-column;float:none}table td[class*=\"col-\"],table
th[class*=\"col-\"]{display:table-cell;float:none}.table>thead>tr>td.active,.table>tbody>tr>td.active,.table>tfoot>tr>td.active,.table>thead>tr"...
(gdb) print blk
$2 = 0x6d82d0 "GET /cp/testcheck.php HTTP/1.0\r\n"
(gdb) print half
$3 = <value optimized out>
(gdb)
$4 = <value optimized out>
(gdb) step
384 half = buffer_contig_space(b);
(gdb) step
buffer_contig_space (b=0x6eff90, blk=0x6d82d0 "GET /cp/testcheck.php
HTTP/1.0\r\n", len=32) at include/common/buffer.h:227
227 left = buffer_wrap_add(buf, buf->p + buf->i);
(gdb) step
buffer_wrap_add (b=0x6eff90, blk=0x6d82d0 "GET /cp/testcheck.php
HTTP/1.0\r\n", len=32) at include/common/buffer.h:191
191 if (ptr - buf->size >= buf->data)
(gdb) step
buffer_contig_space (b=0x6eff90, blk=0x6d82d0 "GET /cp/testcheck.php
HTTP/1.0\r\n", len=32) at include/common/buffer.h:228
228 return right - left;
(gdb) step
bo_putblk (b=0x6eff90, blk=0x6d82d0 "GET /cp/testcheck.php HTTP/1.0\r\n",
len=32) at include/common/buffer.h:388
388 memcpy(b->p, blk, half);
(gdb) print b->p
$5 = 0x6effa4 "play:table-column;float:none}table td[class*=\"col-\"],table
th[class*=\"col-\"]{display:table-cell;float:none}.table>thead>tr>td.active,.table>tbody>tr>td.active,.table>tfoot>tr>td.active,.table>thead>tr"...
(gdb) print blk
$6 = 0x6d82d0 "GET /cp/testcheck.php HTTP/1.0\r\n"
(gdb) print half
$7 = -2119952626
(gdb)
$8 = -2119952626
(gdb) step

Program received signal SIGSEGV, Segmentation fault.
0x76e22c64 in memcpy () from /lib64/libc.so.6




Steve Ruiz
Manager - Hosting Operations
Mirth
ste...@mirth.com ste...@mirthcorp.com



Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)

2014-01-10 Thread Steve Ruiz
Made those changes, and it seems to be working properly, no segfault yet
after ~2 minutes of checks.  Thanks!

Steve Ruiz
Manager - Hosting Operations
Mirth
ste...@mirth.com ste...@mirthcorp.com


On Fri, Jan 10, 2014 at 3:06 PM, Baptiste bed...@gmail.com wrote:

 Hi Steve,

 Could you give the tcp-check a try and tell us if you have the same
 issue.
 In your backend, turn your httpchk related directives into:
   option tcp-check
   tcp-check send GET\ /cp/testcheck.html\ HTTP/1.0\r\n
   tcp-check send \r\n
   tcp-check expect string good

 Baptiste


 On Fri, Jan 10, 2014 at 11:16 PM, Steve Ruiz ste...@mirth.com wrote:
  I'm experimenting with haproxy on a centos6 VM here.  I found that when I
  specified a health check page (option httpchk GET /url), and that page
  didn't exist, we have a large 404 page returned, and that causes haproxy
 to
  quickly segfault (seems like on the second try GET'ing and parsing the
  page).  I couldn't figure out from the website where to submit a bug, so
 I
  figure I'll try here first.
 
  Steps to reproduce:
  - setup http backend, with option httpchk and http-check expect string x.
  Make option httpchk point to a non-existent page
  - On backend server, set it up to serve large 404 response (in my case,
 the
  404 page is 186kB, as it has an inline graphic and inline css)
  - Start haproxy, and wait for it to segfault
 
  I wasn't sure exactly what was causing this at first, so I did some work
 to
  narrow it down with GDB.  The variable values from gdb led me to the
 cause
  on my side, and hopefully can help you fix the issue.  I could not make
 this
  work with simply a large page for the http response - in that case, it
 seems
  to work as advertised, only inspecting the response up to tune.chksize
  (default 16384 as i've left it).  But if I do this with a 404, it seems
 to
  kill it.  Let me know what additional information you need if any.
  Thanks
  and kudos for the great bit of software!
 
 
  #haproxy config:
  #-
  # Example configuration for a possible web application.  See the
  # full configuration options online.
  #
  #   http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
  #
  #-
 
  # Help in developing config here:
  # https://www.twilio.com/engineering/2013/10/16/haproxy
 
 
  #-
  # Global settings
  #-
  global
  # to have these messages end up in /var/log/haproxy.log you will
  # need to:
  #
  # 1) configure syslog to accept network log events.  This is done
  #by adding the '-r' option to the SYSLOGD_OPTIONS in
  #/etc/sysconfig/syslog
  #
  # 2) configure local2 events to go to the /var/log/haproxy.log
  #   file. A line like the following can be added to
  #   /etc/sysconfig/syslog
  #
  #local2.*   /var/log/haproxy.log
  #
  log 127.0.0.1 local2 info
 
  chroot  /var/lib/haproxy
  pidfile /var/run/haproxy.pid
  maxconn 4000
  userhaproxy
  group   haproxy
  daemon
 
  #enable stats
  stats socket /tmp/haproxy.sock
 
  listen ha_stats :8088
  balance source
  mode http
  timeout client 3ms
  stats enable
  stats auth haproxystats:foobar
  stats uri /haproxy?stats
 
  #-
  # common defaults that all the 'listen' and 'backend' sections will
  # use if not designated in their block
  #-
  defaults
  modehttp
  log global
  option  httplog
  option  dontlognull
  #keep persistent client connection open
  option  http-server-close
  option forwardfor   except 127.0.0.0/8
  option  redispatch
  # Limit number of retries - total time trying to connect = connect
  #   timeout * (#retries + 1)
  retries 2
  timeout http-request10s
  timeout queue   1m
  #timeout opening a tcp connection to server - should be shorter than
  #   timeout client and server
  timeout connect 3100
  timeout client  30s
  timeout server  30s
  timeout http-keep-alive 10s
  timeout check   10s
  maxconn 3000
 
  #-
  # main frontend which proxies to the backends
  #-
  frontend https_frontend
  bind :80
  redirect scheme https if !{ ssl_fc }
 
  #config help:
  https://github.com/observing/balancerbattle/blob

Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)

2014-01-10 Thread Steve Ruiz
Thanks for the workaround + super fast response, and glad to help :).

Steve Ruiz
Manager - Hosting Operations
Mirth
ste...@mirth.com ste...@mirthcorp.com


On Fri, Jan 10, 2014 at 3:53 PM, Baptiste bed...@gmail.com wrote:

 Well, let's say this is a workaround...
 We'll definitely have to fix the bug ;)

 Baptiste
