This morning we generated a crash with a slightly different config; it follows at the end of the email. The relevant change to the configuration was to run the test on all processes. The problem appears to be the same. Here is the current config and the output of gdb:

frontend app-http
    bind public.ip:80 interface p2p1 process all
    bind public.ip:843 ssl crt haproxy/ssl/_wildcard_.pem interface p2p1 process all

hap01:/# gdb /usr/sbin/haproxy /core
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/haproxy...done.
[New LWP 47513]
Core was generated by `/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy.pid'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:36
36      ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: No such file or directory.
(gdb) bt
#0  __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:36
#1  0x0000000000498717 in fill_window ()
#2  0x0000000000498c20 in deflate_fast ()
#3  0x000000000049a2e3 in deflate ()
#4  0x0000000000483897 in deflate_flush_or_finish (comp_ctx=0xdb9fa0, out=0xb139f0, flag=<optimized out>) at src/compression.c:790
#5  0x00000000004847a3 in http_compression_buffer_end (s=s@entry=0xe3c400, in=in@entry=0xe3c458, out=out@entry=0x872d40 <tmpbuf>, end=<optimized out>) at src/compression.c:249
#6  0x0000000000452e84 in http_response_forward_body (s=s@entry=0xe3c400, res=res@entry=0xe3c450, an_bit=an_bit@entry=1048576) at src/proto_http.c:7173
#7  0x0000000000478086 in process_stream (t=<optimized out>) at src/stream.c:1939
#8  0x0000000000411855 in process_runnable_tasks () at src/task.c:238
#9  0x0000000000408310 in run_poll_loop () at src/haproxy.c:1573
#10 0x0000000000404dfa in main (argc=<optimized out>, argv=<optimized out>) at src/haproxy.c:1933
(gdb) bt full
#0  __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:36
No locals.
#1  0x0000000000498717 in fill_window ()
No symbol table info available.
#2  0x0000000000498c20 in deflate_fast ()
No symbol table info available.
#3  0x000000000049a2e3 in deflate ()
No symbol table info available.
#4  0x0000000000483897 in deflate_flush_or_finish (comp_ctx=0xdb9fa0, out=0xb139f0, flag=<optimized out>) at src/compression.c:790
        ret = <optimized out>
        out_len = 0
        strm = 0xdb9fa0
#5  0x00000000004847a3 in http_compression_buffer_end (s=s@entry=0xe3c400, in=in@entry=0xe3c458, out=out@entry=0x872d40 <tmpbuf>, end=<optimized out>) at src/compression.c:249
        to_forward = <optimized out>
        left = <optimized out>
        msg = 0xe3c710
        ib = 0xe74b30
        ob = 0xb139f0
        tail = <optimized out>
        ret = <optimized out>
#6  0x0000000000452e84 in http_response_forward_body (s=s@entry=0xe3c400, res=res@entry=0xe3c450, an_bit=an_bit@entry=1048576) at src/proto_http.c:7173
        sess = 0xcf5bf0
        txn = 0xe3c700
        msg = 0xe3c710
        tmpbuf = 0xb139f0
        compressing = 1
        ret = <optimized out>
#7  0x0000000000478086 in process_stream (t=<optimized out>) at src/stream.c:1939
        max_loops = <optimized out>
        ana_list = 1048576
        ana_back = 1048576
        flags = 2147483650
        s = 0xe3c400
        sess = <optimized out>
        rqf_last = 143065088
        rpf_last = <optimized out>
        rq_prod_last = <optimized out>
        rq_cons_last = <optimized out>
---Type <return> to continue, or q <return> to quit---
        rp_cons_last = <optimized out>
        rp_prod_last = <optimized out>
        req_ana_back = <optimized out>
        req = 0xe3c410
        res = 0xe3c450
        si_f = 0xe3c5f8
        si_b = 0xe3c618
#8  0x0000000000411855 in process_runnable_tasks () at src/task.c:238
        t = 0x7f522a2bf128
#9  0x0000000000408310 in run_poll_loop () at src/haproxy.c:1573
        next = <optimized out>
#10 0x0000000000404dfa in main (argc=<optimized out>, argv=<optimized out>) at src/haproxy.c:1933
        err = <optimized out>
        retry = <optimized out>
        limit = {rlim_cur = 206127, rlim_max = 206127}
        errmsg = "\000\000\000\000\000\000\000\000b\001", '\000' <repeats 30 times>, "\300\066\315*R\177\000\000\002\000\000\000\000\000\000\000(\000\000\000\000\000\000\000\317\030_\000\000\000\000\000\070\\M\000\000\000\000\000\001\000\000\000\374\177\000\000pW\234\000\000\000\000\000\000\000\000"
        pidfd = <optimized out>

________________________________
From: Olivier Doucet <[email protected]>
Sent: Tuesday, August 2, 2016 1:36:20 AM
To: James Hartshorn
Cc: [email protected]
Subject: Re: Haproxy 1.6.7 segmentation fault under load

Hello James,

2016-08-02 4:35 GMT+02:00 James Hartshorn <[email protected]>:

Hi,

We’re running into segmentation faults on a new haproxy system we’re developing. We’ve been building haproxy 1.6.7 on Ubuntu 14.04.5 with openssl, pcre, and zlib. The problem doesn’t manifest when running a single process. Load testing is approximately 1 Gbps of SSL traffic from four test servers on the internet; there are two backend servers handling it. It seems that only processes assigned to handle the traffic die, and only under load. We have tried with Pthreads and Mutex off, but the problem remained. In the config listed below I have omitted some other front/backends for brevity; they are unused at present and are very simple (no ssl, no process assignments).

Segmentation fault is as:

[592869.807299] haproxy[31045]: segfault at 7f02cfca88e8 ip 00007f02cf971eee sp 00007ffe1380d4a8 error 4 in libc-2.19.so[7f02cf8da000+1ba000]

Can you get the coredump when it happens? You can get it like this:

    ulimit -c unlimited
    echo '/tmp/coredump-%e.%p' > /proc/sys/kernel/core_pattern
    haproxy -f /your/config/file.cfg

Then wait for the crash to happen. Then look for the coredump file in /tmp; you can get the backtrace details like this:

    gdb /usr/bin/haproxy /tmp/coredump***
    > bt
    > bt full

This will give extra information that will be very useful.

Olivier

Kernel is: "Linux hap01 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux", CPU is a single E5-2650 v3, NIC is an Intel X710 with i40e driver version 1.5.16.

We are running nbproc. Relevant config sections:

**************
global
    daemon
    ssl-server-verify none
    log /dev/log local0 info
    # log /dev/log local1 debug
    # user haproxy
    # group haproxy
    spread-checks 50
    #maxpipes 64000
    tune.idletimer 0
    #tune.maxpollevents 1
    #tune.comp.maxlevel 9
    #tune.zlib.memlevel 9
    stats socket /run/haproxy/stats1 uid 0 gid 0 mode 0777 level user process 1
    stats socket /run/haproxy/stats2 uid 0 gid 0 mode 0777 level user process 2
    stats socket /run/haproxy/stats3 uid 0 gid 0 mode 0777 level user process 3
    stats socket /run/haproxy/stats4 uid 0 gid 0 mode 0777 level user process 4
    stats socket /run/haproxy/stats5 uid 0 gid 0 mode 0777 level user process 5
    stats socket /run/haproxy/stats6 uid 0 gid 0 mode 0777 level user process 6
    stats socket /run/haproxy/stats7 uid 0 gid 0 mode 0777 level user process 7
    stats socket /run/haproxy/stats8 uid 0 gid 0 mode 0777 level user process 8
    #stats socket /run/haproxy/stats9 uid 0 gid 0 mode 0777 level user process 9
    stats bind-process all
    nbproc 8
    cpu-map 1 1
    cpu-map 2 2
    cpu-map 3 3
    cpu-map 4 4
    cpu-map 5 5
    cpu-map 6 6
    cpu-map 7 7
    cpu-map 8 8
    #cpu-map 9 9
    maxconn 100000

defaults
    log global
    timeout server 5s
    timeout connect 5s
    timeout client 5s
    option accept-invalid-http-request
    # option http-ignore-probes
    # option dontlognull
    mode http
    option dontlognull
    option splice-request
    option splice-response
    default-server inter 100s
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    compression algo gzip

frontend app-http
    bind public.ip:80 interface p2p1 process 1-3
    bind public.ip:443 ssl crt /haproxy/ssl/_wildcard_.pem interface p2p1 process 4-7
    option httplog
    log global
    mode http
    acl white_list src someiprange/24 someip
    tcp-request content accept if white_list
    tcp-request content reject
    default_backend app-http-backend

backend app-http-backend
    bind-process 8
    mode http
    option httplog
    log global
    option httpchk
    balance static-rr
    server server1-8080 internal.ip:8082 check port 8082
    server server2-8080 internal.ip:8082 check port 8082

listen stats
    bind internal.ip:1901 process 1
    bind internal.ip:1902 process 2
    bind internal.ip:1903 process 3
    bind internal.ip:1904 process 4
    bind internal.ip:1905 process 5
    bind internal.ip:1906 process 6
    bind internal.ip:1907 process 7
    bind internal.ip:1908 process 8
    mode http
    stats enable
    stats uri /
    stats show-node
    stats show-legends
*************************

Compile Line:

make TARGET=linux2628 USE_OPENSSL=1 SSL_INC=$STATICLIBSSL/include SSL_LIB=$STATICLIBSSL/lib ADDLIB=-ldl USE_ZLIB=1 ZLIB_INC=/opt/zlib-$ZLIB_VERSION/ ZLIB_LIB=/opt/zlib-$ZLIB_VERSION/ USE_STATIC_PCRE=1 PCRE_LIB=$PCRESTUFFS/lib/ PCRE_INC=$PCRESTUFFS/include/

Output of haproxy -vv
**********************
/opt/haproxy-1.6.7# ./haproxy -vv
HA-Proxy version 1.6.7 2016/07/13
Copyright 2000-2016 Willy Tarreau <[email protected]>

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
  OPTIONS = USE_ZLIB=1 USE_OPENSSL=1 USE_STATIC_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.8
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with OpenSSL version : OpenSSL 1.0.1t 3 May 2016
Running on OpenSSL version : OpenSSL 1.0.1t 3 May 2016
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.38 2015-11-23
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built without Lua support
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.
*******************************

I did notice a newer PCRE and haven’t built with it yet, but that doesn’t seem to be the problem area. I also noticed that when using the USE_PTHREAD_PSHARED=1 or USE_FUTEX= options on the make command, the "OPTIONS =" output of haproxy -vv doesn’t change, though a close examination of the make output indicates they are being respected.

I do the static compile of pcre, zlib, and openssl once. For each compile of haproxy I do make clean first.

p.s. In the Atomic Operations section of the readme: "you willy have to either use"

