On Tue, Jun 21, 2016 at 10:15:06AM +0200, Eric Webster wrote:
> Since upgrading from 1.6.4 to 1.6.5, server state is not loaded on
> start. I do not get any errors or warnings, it just doesn't seem to
> fire. Is this a bug perhaps?
>
> What I'm talking about is this kind of setup:
> http://fossies.org/linux/haproxy/examples/seamless_reload.txt
>
> Where I have in my config:
> server-state-base /var/lib/haproxy
> load-server-state-from-file local
>
> and the socat loop to dump the state in init. State is dumped, but on
> reload it just doesn't load any of them.

A fix introduced a regression on server state in 1.6.5 (wrong ID sometimes
dumped), which could possibly be responsible for what you're observing.
Since then the attached patch was merged. If you could check with latest
snapshot or by applying the attached patch on top of 1.6.5 and confirm that
the issue is gone, that would be great.

I'd like to issue 1.6.6 this week with all pending fixes. Better ensure we
don't leave such a pending bug open!

Thanks,
Willy

From 4cb6ccc835ce0c2c874e9868a62a981278b510f7 Mon Sep 17 00:00:00 2001
From: =?latin1?q?Cyril=20Bont=E9?= <cyril.bo...@free.fr>
Date: Fri, 27 May 2016 00:06:45 +0200
Subject: BUG/MEDIUM: stats: show servers state may show an servers from another backend

Olivier Doucet reported that "show servers state" was producing an invalid
output with some configurations where nbproc > 1.

Indeed, commit 76a99784f4 fixed some issues but unfortunately introduced a
regression when a backend bound to the same process as the stats socket and a
previous backend is bound to another one.

For example :
  global
    daemon
    nbproc 2
    stats socket /var/run/haproxy-1.sock process 1
    stats socket /var/run/haproxy-2.sock process 2

  listen proc1
    bind 127.0.0.1:9001
    bind-process 1
    server WRONG 127.0.0.1:80

  listen proc2
    bind 127.0.0.1:9002
    bind-process 2
    server RIGHT 127.0.0.1:80

Requesting "show servers state" on /var/run/haproxy-2.sock was producing a
line like :
  3 proc2 1 WRONG 127.0.0.1 2 0 1 1 4 1 0 2 0 0 0 0

whereas the line below was awaited :
  3 proc2 1 RIGHT 127.0.0.1 2 0 1 1 5 1 0 2 0 0 0 0

This was caused by the initialization of the server loop too early, before the
bind_proc filtering whereas it should be done after.

This fix should be backported to 1.6, where the regression has unfortunately
been backported.

(cherry picked from commit d55bd7a6a934387cdc5df7ad3fbc2718dc3a724e)
---
 src/dumpstats.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/dumpstats.c b/src/dumpstats.c
index b9f5719..4614cf2 100644
--- a/src/dumpstats.c
+++ b/src/dumpstats.c
@@ -2755,6 +2755,9 @@ static int dump_servers_state(struct stream_interface *si, struct chunk *buf)
 	if (appctx->ctx.server_state.px->bind_proc && !(appctx->ctx.server_state.px->bind_proc & (1UL << (relative_pid - 1))))
 		return 1;
 
+	if (!appctx->ctx.server_state.sv)
+		appctx->ctx.server_state.sv = appctx->ctx.server_state.px->srv;
+
 	for (; appctx->ctx.server_state.sv != NULL; appctx->ctx.server_state.sv = srv->next) {
 		srv = appctx->ctx.server_state.sv;
 		srv_addr[0] = '\0';
@@ -2857,8 +2860,6 @@ static int stats_dump_servers_state_to_buffer(struct stream_interface *si)
 
 	for (; appctx->ctx.server_state.px != NULL; appctx->ctx.server_state.px = curproxy->next) {
 		curproxy = appctx->ctx.server_state.px;
-		if (!appctx->ctx.server_state.sv)
-			appctx->ctx.server_state.sv = appctx->ctx.server_state.px->srv;
 		/* servers are only in backends */
 		if (curproxy->cap & PR_CAP_BE) {
 			if (!dump_servers_state(si, &trash))
-- 
1.7.12.1
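
For anyone who wants to see why moving those two lines matters, here is a
rough standalone model of the resumable dump loop. It is only a sketch, not
haproxy code: struct dump_ctx, dump_servers(), dump_all() and the "fixed"
flag are made-up names for illustration, and the two proxies simply mirror
the proc1/proc2 example from the commit message. With fixed=0 the first
server is picked before the bind_proc filter (the 1.6.5 behaviour), with
fixed=1 it is picked after it (the patched behaviour).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-ins for haproxy's server and proxy lists (hypothetical). */
struct server {
	const char *name;
	struct server *next;
};

struct proxy {
	const char *name;
	unsigned long bind_proc;   /* bitmask of processes, 0 means all */
	struct server *srv;
	struct proxy *next;
};

/* Resumable dump position, modelled on appctx->ctx.server_state.{px,sv}. */
struct dump_ctx {
	struct proxy *px;
	struct server *sv;
};

/* The two "listen" sections from the commit message. */
struct server srv_wrong = { "WRONG", NULL };
struct server srv_right = { "RIGHT", NULL };
struct proxy proc2 = { "proc2", 1UL << 1, &srv_right, NULL };
struct proxy proc1 = { "proc1", 1UL << 0, &srv_wrong, &proc2 };

static void dump_servers(struct dump_ctx *ctx, int relative_pid, int fixed,
                         char *out, size_t size)
{
	struct proxy *px = ctx->px;
	size_t len;

	/* proxy not bound to this process: skip it, ctx->sv is left untouched */
	if (px->bind_proc && !(px->bind_proc & (1UL << (relative_pid - 1))))
		return;

	/* patched behaviour: pick the first server after the filter */
	if (fixed && !ctx->sv)
		ctx->sv = px->srv;

	for (; ctx->sv != NULL; ctx->sv = ctx->sv->next) {
		len = strlen(out);
		snprintf(out + len, size - len, "%s %s\n", px->name, ctx->sv->name);
	}
}

/* Writes one "<proxy> <server>" line per dumped server into out. */
void dump_all(struct proxy *list, int relative_pid, int fixed,
              char *out, size_t size)
{
	struct dump_ctx ctx = { list, NULL };

	assert(relative_pid >= 1 && size > 0);
	out[0] = '\0';
	for (; ctx.px != NULL; ctx.px = ctx.px->next) {
		/* 1.6.5 behaviour: server picked before the bind_proc filter */
		if (!fixed && !ctx.sv)
			ctx.sv = ctx.px->srv;
		dump_servers(&ctx, relative_pid, fixed, out, size);
	}
}
```

Calling dump_all(&proc1, 2, 0, ...) leaves the stale WRONG pointer in the
context when proc1 is skipped, so proc2 is dumped with proc1's server.
Calling it with fixed=1 dumps proc2 with RIGHT, since the server pointer is
only initialised once the proxy has passed the filter.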