Sorry, here is my haproxy -vv output as well:

| [email protected] ~ $ sudo haproxy -vv
| HA-Proxy version 1.7.0-1 2016/11/27
| Copyright 2000-2016 Willy Tarreau <[email protected]>
|
| Build options :
|   TARGET  = linux2628
|   CPU     = generic
|   CC      = gcc
|   CFLAGS  = -g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2
|   OPTIONS = USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1 USE_NS=1
|
| Default settings :
|   maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
|
| Encrypted password support via crypt(3): yes
| Built with zlib version : 1.2.8
| Running on zlib version : 1.2.8
| Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
| Built with OpenSSL version : OpenSSL 1.0.2j 26 Sep 2016
| Running on OpenSSL version : OpenSSL 1.0.2j 26 Sep 2016
| OpenSSL library supports TLS extensions : yes
| OpenSSL library supports SNI : yes
| OpenSSL library supports prefer-server-ciphers : yes
| Built with PCRE version : 8.35 2014-04-04
| Running on PCRE version : 8.35 2014-04-04
| PCRE library supports JIT : no (USE_PCRE_JIT not set)
| Built with Lua version : Lua 5.3.1
| Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
| Built with network namespace support
|
| Available polling systems :
|       epoll : pref=300, test result OK
|        poll : pref=200, test result OK
|      select : pref=150, test result OK
| Total: 3 (3 usable), will use epoll.
|
| Available filters :
|       [COMP] compression
|       [TRACE] trace
|       [SPOE] spoe
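
One extra note on the resolver side, in case it helps: the deb.domain.net record in figure 2 of my original mail below is served by dnsmasq straight out of /etc/hosts, so the only moving part there is a one-line hosts entry along these lines (sketch only; the real file of course has other entries too):

| 10.9.0.13    deb.domain.net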
Thanks,
Joshua M. Boniface
Linux System Ærchitect - Boniface Labs
Sigmentation fault: core dumped

On 29/11/16 02:17 AM, Joshua M. Boniface wrote:
> Hello list!
>
> I believe I've found a bug in haproxy related to multiproc combined with a
> set of DNS resolvers. When these two features (multiproc and dynamic
> resolvers) are used together, the DNS resolvers (one per process, it seems)
> fail intermittently and independently for no obvious reason, which triggers
> a DOWN event for the backend; a short time later the resolution succeeds
> and the backend goes back UP for a short time, and the cycle repeats
> indefinitely. This bug also seems to have the curious effect of causing the
> active record type to switch from A to AAAA and back to A repeatedly in a
> dual-stack setup, though the test below shows that the bug occurs in an
> IPv4-only environment as well, so that particular A/AAAA symptom is not
> captured in the test output below.
>
> First, some background. I'm attempting to set up an haproxy instance with
> multiple processes for SSL termination. At the same time, I'm also trying
> to use an IPv6 backend managed by DNS, so I set up a "resolvers" section so
> I could use resolved IPv6 addresses from AAAA records. As an aside, I've
> noticed that haproxy will not start up if the only record for a host is an
> AAAA record, reporting that the address can't be resolved. However, since I
> normally run a dual-stack [A + AAAA] record setup, this is not a huge deal
> to me, though I think supporting an IPv6/AAAA-only backend should
> definitely be a future goal!
>
> First, my config (figure 1). This is the most basic config I can construct
> that triggers the bug while keeping most of my important settings. Note
> that the host resolves to an A record only (figure 2); this record is
> provided by a dnsmasq process which has read it out of /etc/hosts, so the
> resolution here should be 100% stable.
>
> (figure 1)
> | global
> |     log ::1:514 daemon debug
> |     log-send-hostname
> |     chroot /var/lib/haproxy
> |     pidfile /run/haproxy/haproxy.pid
> |     nbproc 2
> |     cpu-map 1 0
> |     cpu-map 2 1
> |     stats socket /var/lib/haproxy/admin-1.sock mode 660 level admin process 1
> |     stats socket /var/lib/haproxy/admin-2.sock mode 660 level admin process 2
> |     stats timeout 30s
> |     user haproxy
> |     group haproxy
> |     daemon
> |     maxconn 10000
> | resolvers dns
> |     nameserver dnsmasq 127.0.0.1:53
> |     resolve_retries 1
> |     hold valid 1s
> |     hold timeout 1s
> |     timeout retry 1s
> | defaults
> |     log global
> |     option http-keep-alive
> |     option forwardfor except 127.0.0.0/8
> |     option redispatch
> |     option dontlognull
> |     option forwardfor
> |     timeout connect 5s
> |     timeout client 24h
> |     timeout server 60m
> | listen back_deb-http
> |     bind 0.0.0.0:80
> |     mode http
> |     option httpchk GET /debian/haproxy
> |     server deb deb.domain.net:80/debian check inter 1000 resolvers dns
>
> (figure 2)
> | [email protected] ~ $ dig @127.0.0.1 +short deb.domain.net
> | 10.9.0.13
>
> When this config starts [using the command in (figure 3)], within seconds I
> receive errors that the backend host has gone down due to a DNS resolution
> failure. This continues indefinitely; however, I stopped this test after
> one failure/return cycle for each process (figure 4).
>
> (figure 3)
> | [email protected] ~ $ sudo haproxy -f /etc/haproxy/haproxy.cfg; sudo strace -tp $(pgrep haproxy | head -1) &>~/haproxy-1.strace & sudo strace -tp $(pgrep haproxy | tail -1) &>~/haproxy-2.strace &
>
> (figure 4)
> | Nov 29 01:46:54 elb2.domain.net haproxy[29887]: Proxy back_deb-http started.
> | Nov 29 01:46:58 elb2.domain.net haproxy[29888]: Server back_deb-http/deb is going DOWN for maintenance (DNS timeout status). 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
> | Nov 29 01:46:58 elb2.domain.net haproxy[29888]: proxy back_deb-http has no server available!
> | Nov 29 01:46:59 elb2.domain.net haproxy[29888]: Server back_deb-http/deb ('deb.domain.net') is UP/READY (resolves again).
> | Nov 29 01:46:59 elb2.domain.net haproxy[29888]: Server back_deb-http/deb administratively READY thanks to valid DNS answer.
> | Nov 29 01:47:00 elb2.domain.net haproxy[29889]: Server back_deb-http/deb is going DOWN for maintenance (DNS timeout status). 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
> | Nov 29 01:47:00 elb2.domain.net haproxy[29889]: proxy back_deb-http has no server available!
> | Nov 29 01:47:02 elb2.domain.net haproxy[29889]: Server back_deb-http/deb ('deb.domain.net') is UP/READY (resolves again).
> | Nov 29 01:47:02 elb2.domain.net haproxy[29889]: Server back_deb-http/deb administratively READY thanks to valid DNS answer.
>
> I've attached straces of both processes, taken with the command shown in
> (figure 3), as well as a pcap of the DNS resolution requests to dnsmasq
> captured on my loopback interface. The most curious thing I see in the DNS
> pcap is requests for AAAA records that don't exist; I don't know whether
> that's the cause, but it would explain the swapping between A and AAAA
> records that I mentioned earlier.
>
> I've seen this bug on haproxy 1.6.9 (from the Debian jessie-backports repo)
> as well as on my own self-built 1.7.0 package; all of the testing above was
> done on 1.7.0. Please let me know if I can provide any further information!
>
> Thanks,
> Joshua M. Boniface
> Linux System Ærchitect - Boniface Labs
> Sigmentation fault: core dumped
>
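
P.S. Since the pcap shows AAAA queries for a name that only has an A record, the same lookups can be repeated against dnsmasq by hand for comparison, along these lines (illustrative commands only; I haven't pasted fresh output here):

| [email protected] ~ $ dig @127.0.0.1 -t A deb.domain.net +short
| [email protected] ~ $ dig @127.0.0.1 -t AAAA deb.domain.net +short

I would expect the A query to return 10.9.0.13 as in figure 2, and the AAAA query to come back with an empty answer, since the host only has an IPv4 entry in /etc/hosts.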

