On Tue, May 9, 2017 at 10:17 PM, Jim Pingle <li...@pingle.org> wrote:
> With HAProxy 1.7.3 and later on FreeBSD, recent DNS-related code changes
> in HAProxy appear to have broken the UNIX socket in daemon mode when
> resolvers are present in the configuration.
>
> How to reproduce:
>
> * Install HAProxy 1.7.x (where x > 2) on FreeBSD 10.3 or FreeBSD 11,
> even HAProxy 1.7.5
>
> * Configure HAProxy to provide a UNIX socket for stats:
>
>   stats socket /tmp/haproxy.socket level admin
>
> * Configure HAProxy with resolvers:
>
>   resolvers globalresolvers
>     nameserver localdns 10.6.0.1:53
>     resolve_retries 3
>     timeout retry 1s
>     hold valid 10s
>
> * Start haproxy
>
> * Attempt to grab some stats from the UNIX socket (ANY stats, not just
> resolver stats!):
>
>   echo "show stat server" | nc -U /tmp/haproxy.socket
>
> * The request never completes, it hangs indefinitely. The above command
> is a shorthand way, using it interactively also fails.
>
> If I revert 91a964aae7a405f2752f8be22d669745caa0c16f
> eaf96d7a0849b2883e98459f52489d555b6b013c from the HAProxy source and
> rebuild HAProxy, it works as expected. The stats command succeeds and it
> yields proper output.
>
> If the resolvers section is removed, it works with and without those
> commits applied. The resolvers do not even have to be added to any
> backend, only defined in the configuration. Adding multiple nameserver
> entries does not change the behavior. Removing the resolver parameters
> other than nameserver also does not change the behavior.
>
> If the daemon is started in the foreground (without -D, or with -V -db
> and so on), it also works. It appears to only be a problem when using
> daemon mode (-D).
>
> I posted this over on Discourse before I noticed the message saying to
> post bugs on the list instead. The full test configuration is on my post
> there[1].
>
> Note that everything I run is FreeBSD so I have not tested this against
> a Linux system. It may be a more general problem or it may be isolated
> to FreeBSD. Since I only reproduced it on FreeBSD, that's how I stated
> the problem.
>
> Thanks in advance for any assistance in getting this solved.
>
> Jim P.
>
> 1:
> http://discourse.haproxy.org/t/dns-changes-in-1-7-3-break-unix-socket-stats-when-resolvers-are-configured/1222
>

Hi Jim,

Could you confirm whether HAProxy can still process traffic when the
stats socket hangs?
Can you run strace on the process while running the command on the
stats socket?

I don't know if that's related, but while working on making DNS
resolution autonomous (resolutions are currently triggered by health
checks), I discovered a "task leak" in the way we open / close the
connection in 91a964aae7a405f2752f8be22d669745caa0c16f in src/dns.c.
Because dns_init_resolvers() is called twice, it creates 2 tasks...
Please check the patch in attachment, which fixes this issue.

That said, I don't think it is related to your problem... but it is
worth a try.

Baptiste

From c8c9133f0a3c6f8f08a56e7c9bf0b5d83355d6e8 Mon Sep 17 00:00:00 2001
From: Baptiste Assmann <bassm...@brocade.com>
Date: Wed, 10 May 2017 10:15:49 +0200
Subject: [PATCH] BUG/MINOR: dns: fix a task leak in dns_init_resolvers()

Since eaf96d7a0849b2883e98459f52489d555b6b013c, dns_init_resolvers may be
called twice during startup phase of HAProxy.
Each call to this function creates a task, which means one task is
created and won't be used. This may confuse HAProxy's scheduler and the
task consumers.
---
 src/dns.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/dns.c b/src/dns.c
index dc2ed05..0682348 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -960,7 +960,13 @@ int dns_init_resolvers(int close_socket)
 		t->context = curr_resolvers;
 		t->expire = TICK_ETERNITY;
 
-		curr_resolvers->t = t;
+		/* dns_init_resolvers() may be called multiple times by haproxy,
+		 * that said we don't want to create multiple tasks for the same purpose,
+		 * this will confuse the scheduler and the consumers of this task. */
+		if (!curr_resolvers->t)
+			curr_resolvers->t = t;
+		else
+			task_free(t);
 
 		list_for_each_entry(curnameserver, &curr_resolvers->nameserver_list, list) {
 			dgram = NULL;
-- 
2.7.4
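
For readers who want to see the effect of the guard in isolation, here is a minimal standalone sketch of the same pattern. It is not HAProxy code: `struct task`, `struct resolvers`, `task_new_sketch()`, `task_free_sketch()`, `init_resolvers_sketch()` and the `live_tasks` counter are all hypothetical stand-ins invented for illustration, and only the guarded assignment mirrors the patch above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for HAProxy's task and resolvers structures. */
struct task { void *context; };
struct resolvers { struct task *t; };

/* Counts tasks that are allocated and not yet freed (illustration only). */
int live_tasks = 0;

struct task *task_new_sketch(void)
{
	struct task *t = calloc(1, sizeof(*t));

	if (t)
		live_tasks++;
	return t;
}

void task_free_sketch(struct task *t)
{
	if (t) {
		free(t);
		live_tasks--;
	}
}

/* Allocates a task on every call, as the init function in the patch does,
 * but keeps it only if the resolvers section has no task yet; otherwise
 * the fresh task is released so repeated calls do not leak.
 * Returns 1 on success, 0 on allocation failure. */
int init_resolvers_sketch(struct resolvers *r)
{
	struct task *t;

	assert(r != NULL);

	t = task_new_sketch();
	if (!t)
		return 0;
	t->context = r;

	if (!r->t)
		r->t = t;
	else
		task_free_sketch(t);
	return 1;
}
```

Calling `init_resolvers_sketch()` twice on the same zero-initialized `struct resolvers` leaves `r.t` pointing at the first task and `live_tasks` at 1. With an unconditional `r->t = t;` the second call would overwrite the pointer and leave the first task allocated but unreferenced.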