Re: Segfaults with 1.9.6

2019-10-25 Thread Willy Tarreau
On Fri, Oct 25, 2019 at 04:54:44PM +0200, GARDAIS Ionel wrote:
> Hi Olivier, 
> 
> As far as I can remember, 1.9 had a series of segfaults involving H2 and HTX, 
> patched from release to release. 
> http://www.haproxy.org/bugs/bugs-1.9.6.html 
> 
> As a rule of thumb, I can only suggest you try with the latest 1.9 release 
> (that is 1.9.12 as of today) and see if segfaults happen again. 

Seconded! Since 1.9 we've faced very complex conditions triggering some
sleeping bugs like these. One of the issues reported in Olivier's trace
was related to the improper locking of connections that was fixed some
time ago.

When facing any bug (not only a crash), the first thing to do is to make
sure you're up to date. If you don't update, it is guaranteed that the
bug you've faced will be able to appear again under the same conditions.
If you update, you have a chance that it was fixed. And if not, developers
can immediately look at the issue without first asking to update.
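
A minimal sketch of that first check, in Python: it parses the banner line
printed by "haproxy -vv" and compares it with the newest release of the same
branch. The helper names are made up for illustration, the latest release is
passed in by hand (1.9.12, per the message quoted above) rather than looked
up anywhere, and only plain X.Y.Z stable versions are assumed.

```python
import re

def parse_version(banner):
    """Extract (major, minor, patch) from a 'HA-Proxy version X.Y.Z ...' line."""
    m = re.search(r"version\s+(\d+)\.(\d+)\.(\d+)", banner)
    if m is None:
        raise ValueError("no X.Y.Z version found in: %r" % banner)
    return tuple(int(part) for part in m.groups())

def is_outdated(banner, latest):
    """True if the running version is older than `latest` in the same branch."""
    running = parse_version(banner)
    newest = tuple(int(part) for part in latest.split("."))
    if running[:2] != newest[:2]:
        raise ValueError("running version and latest are from different branches")
    # Compare as integer tuples: as strings, "1.9.6" would sort after "1.9.12".
    return running < newest

banner = "HA-Proxy version 1.9.6 2019/03/29 - https://haproxy.org/"
print(parse_version(banner))
print(is_outdated(banner, "1.9.12"))
```

Comparing integer tuples rather than strings matters here, since a plain
string comparison would rank 1.9.6 above 1.9.12.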

Based on the recent history with such complex bugs, I predict we'll
probably face one, maybe even two other bugs of this level of severity
in very rare conditions before 1.9 reaches EOL. Overall, though, the long
code audit that started some time ago helped address causes rather than
consequences, even if that leads to more difficult backports. And
despite these few issues, the internal processing in 1.9 and 2.0 is way
cleaner, safer and more correct than in previous versions, so it's
strongly recommended to stay up to date.

Cheers,
Willy



Re: Segfaults with 1.9.6

2019-10-25 Thread GARDAIS Ionel
Hi Olivier, 

As far as I can remember, 1.9 had a series of segfaults involving H2 and HTX, 
patched from release to release. 
http://www.haproxy.org/bugs/bugs-1.9.6.html 

As a rule of thumb, I can only suggest you try with the latest 1.9 release 
(that is 1.9.12 as of today) and see if segfaults happen again. 

-- 
Ionel GARDAIS 
Tech'Advantage CIO - IT Team manager 


From: "Olivier D"  
To: "haproxy"  
Sent: Friday, October 25, 2019 14:48:20 
Subject: Segfaults with 1.9.6 


Segfaults with 1.9.6

2019-10-25 Thread Olivier D
Hello,

I know I'm reporting an issue with an old version, but I got 2 segfaults
in 48h.
As I've only had 3 segfaults with HAProxy in 10+ years, I just wanted to make
sure these bugs have been caught and are now fixed.

haproxy -vv output:

HA-Proxy version 1.9.6 2019/03/29 - https://haproxy.org/
Build options :
  TARGET  = linux2628
  CPU = generic
  CC  = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
-fwrapv -Wno-format-truncation -Wno-unused-label -Wno-sign-compare
-Wno-unused-parameter -Wno-old-style-declaration -Wno-ignored-qualifiers
-Wno-clobbered -Wno-missing-field-initializers -Wno-implicit-fallthrough
-Wno-stringop-overflow -Wtype-limits -Wshift-negative-value
-Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference
  OPTIONS = USE_ZLIB=1 USE_OPENSSL=1 USE_LUA=1 USE_STATIC_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.1.1b  26 Feb 2019
Running on OpenSSL version : OpenSSL 1.1.1b  26 Feb 2019
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.3.5
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT
IP_FREEBIND
Built with zlib version : 1.2.11
Running on zlib version : 1.2.11
Compression algorithms supported : identity("identity"),
deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with PCRE version : 8.41 2017-07-05
Running on PCRE version : 8.41 2017-07-05
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Encrypted password support via crypt(3): yes
Built with multi-threading support.

Available polling systems :
  epoll : pref=300,  test result OK
   poll : pref=200,  test result OK
 select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
              h2 : mode=HTX        side=FE|BE
              h2 : mode=HTTP       side=FE
       <default> : mode=HTX        side=FE|BE
       <default> : mode=TCP|HTTP   side=FE|BE

Available filters :
[SPOE] spoe
[COMP] compression
[CACHE] cache
[TRACE] trace


### First segfault : ###

Program terminated with signal 11, Segmentation fault.
#0  0x004cba32 in h2_process_mux (h2c=0x9b4b300) at src/mux_h2.c:2588

(gdb) bt full
#0  0x004cba32 in h2_process_mux (h2c=0x9b4b300) at src/mux_h2.c:2588
h2s = 0x98edf50
#1  h2_send (h2c=h2c@entry=0x9b4b300) at src/mux_h2.c:2716
flags = <optimized out>
conn = 0x9aef030
done = 0
sent = 0
#2  0x004d3918 in h2_io_cb (t=<optimized out>, ctx=0x9b4b300, status=<optimized out>) at src/mux_h2.c:2778
h2c = 0x9b4b300
ret = 0
#3  0x00584456 in process_runnable_tasks () at src/task.c:437
t = 0x9e15170
state = <optimized out>
ctx = <optimized out>
process = <optimized out>
t = <optimized out>
max_processed = 194
#4  0x00503fd4 in run_poll_loop () at src/haproxy.c:2642
next = <optimized out>
exp = <optimized out>
#5  run_thread_poll_loop (data=data@entry=0x19a32b0) at src/haproxy.c:2707
ptif = <optimized out>
ptdf = <optimized out>
start_lock = 0
#6  0x004648d8 in main (argc=<optimized out>, argv=0x7ffccfb0cba8) at src/haproxy.c:3343
tids = 0x19a32b0
threads = 0x19a2750
i = <optimized out>
old_sig = {__val = {68097, 0, 64, 206158430210, 532575944795, 472446402679, 0, 139791683256608, 24, 11381472, 335544638, 11392704, 26776016, 139791680031404, 0, 26699504}}
blocked_sig = {__val = {1844674406710583, 18446744073709551615 <repeats 15 times>}}
err = <optimized out>
retry = <optimized out>
limit = {rlim_cur = 801167, rlim_max = 801167}
errmsg =
"\000\000\000\000\000\000\000\000\220Ap\312#\177\000\000\000\357\200\000\000\000\000\000(\357\200\000\000\000\000\000\231\353\200\000\000\000\000\000\000\000\000\000\002",
'\000' "\350,
Dp\312#\177\000\000p\311\260\317\374\177\000\000\035\000\000\000\000\000\000\000\210\311\260\317\374\177\000\000
\326\230\001\001\000\000\000\000v\000"
pidfd = <optimized out>


### Second segfault ###
Program terminated with signal 11, Segmentation fault.
#0  0x005808b5 in __pendconn_unlink (p=p@entry=0x7fff694b0730) at src/queue.c:138

(gdb) bt full
#0  0x005808b5 in __pendconn_unlink (p=p@entry=0x7fff694b0730) at src/queue.c:138
No locals.
#1  0x00581507 in pendconn_redistribute (s=s@entry=0x6b01cd0) at src/queue.c:413
p = 0x7fff694b0730
node = 0xb781a88
#2  0x004ee2b2 in srv_update_status (s=s@entry=0x6b01cd0) at src/server.c:4805
next_admin = <optimized out>
check = 0x6b02170
xferred = <optimized out>
px = 0x6a357e0
prev_srv_count = 2
srv_was_stopping = <optimized out>
log_level = <optimized out>
tmptrash = 0x0
#3  0x004eef04 in srv_set_stopped (s=0x6b01cd0, reason=reason@entry=0x0, check=<optimized out>) at src/server.c:1016
srv = <optimized out>
#4  0x004eefc1 in srv_set_stopped (s=<optimized out>, reason=reason@entry=0x0, check=<optimized out>) at