On Thu, Apr 13, 2017 at 12:59:38PM +0200, Conrad Hoffmann wrote:
> On 04/13/2017 11:31 AM, Olivier Houchard wrote:
> > On Thu, Apr 13, 2017 at 11:17:45AM +0200, Conrad Hoffmann wrote:
> >> Hi Olivier,
> >>
> >> On 04/12/2017 06:09 PM, Olivier Houchard wrote:
> >>> On Wed, Apr 12, 2017 at 05:50:54PM +0200, Olivier Houchard wrote:
> >>>> On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
> >>>>> Hi again,
> >>>>>
> >>>>> So I tried to get this to work, but didn't manage yet, and I also
> >>>>> don't quite understand how this is supposed to work. The first
> >>>>> haproxy process is started _without_ the -x option, is that
> >>>>> correct? Where does that instance ever create the socket for
> >>>>> transfer to later instances?
> >>>>>
> >>>>> I have it working now insofar as, on reload, subsequent instances
> >>>>> are spawned with the -x option, but they just complain that they
> >>>>> can't get anything from the unix socket (because, for all I can
> >>>>> tell, it's not there?). I also can't see the relevant code path
> >>>>> where this socket gets created, but I didn't have time to read
> >>>>> all of it yet.
> >>>>>
> >>>>> Am I doing something wrong? Did anyone get this to work with the
> >>>>> systemd-wrapper so far?
> >>>>>
> >>>>> Also, though this might be a coincidence, my test setup takes a
> >>>>> huge performance penalty just from applying your patches (without
> >>>>> any reloading whatsoever). Did this happen to anybody else? I'll
> >>>>> send some numbers and more details tomorrow.
> >>>>
> >>>> OK, I can confirm the performance issues, I'm investigating.
> >>>
> >>> Found it, I was messing with SO_LINGER when I shouldn't have been.
> >>
> >> <removed code for brevity>
> >>
> >> Thanks a lot, I can confirm that the performance regression seems to
> >> be gone!
> >>
> >> I am still having the other (conceptual) problem, though. Sorry if
> >> this is just me holding it wrong or something; it's been a while
> >> since I dug through the internals of haproxy.
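For context, the reload flow under discussion works roughly as follows
when haproxy is driven by hand, without the systemd-wrapper. This is a
sketch with illustrative paths; the socket given to -x must be one of
the stats sockets declared in the configuration:

  # first start: no -x; the process creates its stats socket(s) from
  # the "stats socket" lines in the configuration
  haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid

  # reload: the new process fetches the listening sockets from the old
  # one over that stats socket, then asks the old process to finish
  # up (-sf)
  haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid \
          -x /var/run/haproxy.sock -sf $(cat /var/run/haproxy.pid)
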
> >> So, as I mentioned before, we use nbproc (12) and the
> >> systemd-wrapper, which in turn starts haproxy in daemon mode, giving
> >> us a process tree like this (path and file names shortened for
> >> brevity):
> >>
> >> \_ /u/s/haproxy-systemd-wrapper -f ./hap.cfg -p /v/r/hap.pid
> >>    \_ /u/s/haproxy-master
> >>       \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>       \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>       \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>       \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>       \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>       \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>       \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>       \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>       \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>       \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>       \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>       \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>
> >> Now, in our config file, we have something like this:
> >>
> >> # expose admin socket for each process
> >> stats socket ${STATS_ADDR} level admin process 1
> >> stats socket ${STATS_ADDR}-2 level admin process 2
> >> stats socket ${STATS_ADDR}-3 level admin process 3
> >> stats socket ${STATS_ADDR}-4 level admin process 4
> >> stats socket ${STATS_ADDR}-5 level admin process 5
> >> stats socket ${STATS_ADDR}-6 level admin process 6
> >> stats socket ${STATS_ADDR}-7 level admin process 7
> >> stats socket ${STATS_ADDR}-8 level admin process 8
> >> stats socket ${STATS_ADDR}-9 level admin process 9
> >> stats socket ${STATS_ADDR}-10 level admin process 10
> >> stats socket ${STATS_ADDR}-11 level admin process 11
> >> stats socket ${STATS_ADDR}-12 level admin process 12
> >>
> >> Basically, we have a dedicated admin socket for each ("real")
> >> process, as we need to be able to talk to each process individually.
> >> So I was wondering: which admin socket should I pass as
> >> HAPROXY_STATS_SOCKET? I initially thought it would have to be a
> >> special stats socket in the haproxy-master process (which we
> >> currently don't have), but as far as I can tell from the output of
> >> `lsof`, the haproxy-master process doesn't even hold any FDs anymore.
> >> Will this setup currently work with your patches at all? Do I need to
> >> add a stats socket to the master process? Or would this require a
> >> list of stats sockets to be passed, similar to the list of PIDs that
> >> gets passed to new haproxy instances, so that each process can talk
> >> to the one from which it is taking over the socket(s)? In case I need
> >> a stats socket for the master process, what would be the directive to
> >> create it?
> >
> > Hi Conrad,
> >
> > Any of those sockets will do. Each process is made to keep all the
> > listening sockets open, even if the proxy is not bound to that
> > specific process, precisely so that they can be transferred via the
> > unix socket.
> >
> > Regards,
> >
> > Olivier
>
> Thanks, I am finally starting to understand, but I think there still
> might be a problem. I didn't see it initially, but when I use one of
> the processes' existing admin sockets, it still fails with the
> following messages:
>
> 2017-04-13_10:27:46.95005 [WARNING] 102/102746 (14101) : We didn't get
> the expected number of sockets (expecting 48 got 37)
> 2017-04-13_10:27:46.95007 [ALERT] 102/102746 (14101) : Failed to get
> the sockets from the old process!
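For reference, the HAPROXY_STATS_SOCKET variable discussed above is
picked up by the systemd-wrapper, which then passes its value to new
instances via -x on reload. A minimal sketch of how it could be set,
assuming a systemd drop-in file; the file location and socket path are
illustrative:

  # /etc/systemd/system/haproxy.service.d/socket.conf (illustrative)
  [Service]
  Environment=HAPROXY_STATS_SOCKET=/var/run/hap-stats.sock

Per the answer above, any one of the twelve per-process admin sockets
from the configuration should do.
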
> I have a suspicion about the possible reason. We have a two-tier
> setup, as is often recommended here on the mailing list: 11 processes
> do (almost) only SSL termination, then pass to a single process that
> does most of the heavy lifting. These processes use different sockets,
> of course (we use `bind-process 1` and `bind-process 2-X` in
> frontends). The message above is from the first process, which is the
> non-SSL one. When using an admin socket from any of the other
> processes, the message changes to "(expecting 48 got 17)".
>
> I assume the patches are incompatible with such a setup at the moment?
>
> Thanks once more :)
> Conrad
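To make the two-tier layout described above concrete, here is a minimal
configuration sketch. All names and addresses, and the tier-to-tier
wiring (PROXY protocol over a loopback socket), are assumptions, since
the original post does not spell them out:

  global
      nbproc 12

  # tier two: SSL termination, bound to processes 2-12
  frontend fe_ssl
      bind-process 2-12
      bind :443 ssl crt /etc/haproxy/site.pem
      default_backend be_to_tier1

  backend be_to_tier1
      bind-process 2-12
      server tier1 127.0.0.1:8080 send-proxy

  # tier one: the heavy lifting, on process 1 only
  frontend fe_main
      bind-process 1
      bind 127.0.0.1:8080 accept-proxy
      default_backend be_servers

  backend be_servers
      bind-process 1
      server app1 10.0.0.1:80
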
Hmm, that should not happen, and I can't seem to reproduce it. Can you
share the haproxy config file you're using? Is the number of sockets
received always the same? How are you generating your load? Is it
happening on each reload?

Thanks a lot for going through this, it is really appreciated :)

Olivier