On 04/13/2017 03:50 PM, Olivier Houchard wrote:
> On Thu, Apr 13, 2017 at 03:06:47PM +0200, Conrad Hoffmann wrote:
>>
>>
>> On 04/13/2017 02:28 PM, Olivier Houchard wrote:
>>> On Thu, Apr 13, 2017 at 12:59:38PM +0200, Conrad Hoffmann wrote:
>>>> On 04/13/2017 11:31 AM, Olivier Houchard wrote:
>>>>> On Thu, Apr 13, 2017 at 11:17:45AM +0200, Conrad Hoffmann wrote:
>>>>>> Hi Olivier,
>>>>>>
>>>>>> On 04/12/2017 06:09 PM, Olivier Houchard wrote:
>>>>>>> On Wed, Apr 12, 2017 at 05:50:54PM +0200, Olivier Houchard wrote:
>>>>>>>> On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
>>>>>>>>> Hi again,
>>>>>>>>>
>>>>>>>>> so I tried to get this to work, but didn't manage to yet. I also
>>>>>>>>> don't quite understand how this is supposed to work. The first
>>>>>>>>> haproxy process is started _without_ the -x option, is that
>>>>>>>>> correct? Where does that instance ever create the socket for
>>>>>>>>> transfer to later instances?
>>>>>>>>>
>>>>>>>>> I have it working now insofar as, on reload, subsequent instances
>>>>>>>>> are spawned with the -x option, but they just complain that they
>>>>>>>>> can't get anything from the unix socket (because, as far as I can
>>>>>>>>> tell, it's not there?). I also can't see the relevant code path
>>>>>>>>> where this socket gets created, but I haven't had time to read
>>>>>>>>> all of it yet.
>>>>>>>>>
>>>>>>>>> Am I doing something wrong? Did anyone get this to work with the
>>>>>>>>> systemd-wrapper so far?
>>>>>>>>>
>>>>>>>>> Also, though this might be a coincidence, my test setup takes a
>>>>>>>>> huge performance penalty just from applying your patches (without
>>>>>>>>> any reloading whatsoever). Did this happen to anybody else? I'll
>>>>>>>>> send some numbers and more details tomorrow.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Ok, I can confirm the performance issues; I'm investigating.
>>>>>>>>
>>>>>>>
>>>>>>> Found it, I was messing with SO_LINGER when I shouldn't have been.
>>>>>>
>>>>>> <removed code for brevity>
>>>>>>
>>>>>> Thanks a lot, I can confirm that the performance regression seems
>>>>>> to be gone!
>>>>>>
>>>>>> I am still having the other (conceptual) problem, though. Sorry if
>>>>>> this is just me holding it wrong or something; it's been a while
>>>>>> since I dug through the internals of haproxy.
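
For reference, the flow as I understand it now is roughly this (the admin
socket path below is just a placeholder for one of ours):

    # first start: there is no old process yet, so no -x
    haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds

    # reload: the new process fetches the listening FDs from the old one
    # over its admin socket before binding
    haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds \
        -x /var/run/hap-admin.sock -sf $(cat /v/r/hap.pid)

With the systemd-wrapper it is of course the wrapper that re-executes
haproxy, which is where HAPROXY_STATS_SOCKET comes in (see below).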
>>>>>>
>>>>>> So, as I mentioned before, we use nbproc (12) and the
>>>>>> systemd-wrapper, which in turn starts haproxy in daemon mode,
>>>>>> giving us a process tree like this (path and file names shortened
>>>>>> for brevity):
>>>>>>
>>>>>> \_ /u/s/haproxy-systemd-wrapper -f ./hap.cfg -p /v/r/hap.pid
>>>>>>     \_ /u/s/haproxy-master
>>>>>>         \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>>         \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>>         \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>>         \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>>         \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>>         \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>>         \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>>         \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>>         \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>>         \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>>         \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>>         \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>>
>>>>>> Now, in our config file, we have something like this:
>>>>>>
>>>>>> # expose admin socket for each process
>>>>>> stats socket ${STATS_ADDR} level admin process 1
>>>>>> stats socket ${STATS_ADDR}-2 level admin process 2
>>>>>> stats socket ${STATS_ADDR}-3 level admin process 3
>>>>>> stats socket ${STATS_ADDR}-4 level admin process 4
>>>>>> stats socket ${STATS_ADDR}-5 level admin process 5
>>>>>> stats socket ${STATS_ADDR}-6 level admin process 6
>>>>>> stats socket ${STATS_ADDR}-7 level admin process 7
>>>>>> stats socket ${STATS_ADDR}-8 level admin process 8
>>>>>> stats socket ${STATS_ADDR}-9 level admin process 9
>>>>>> stats socket ${STATS_ADDR}-10 level admin process 10
>>>>>> stats socket ${STATS_ADDR}-11 level admin process 11
>>>>>> stats socket ${STATS_ADDR}-12 level admin process 12
>>>>>>
>>>>>> Basically, we have a dedicated admin socket for each ("real")
>>>>>> process, as we need to be able to talk to each process
>>>>>> individually. So I was wondering: which admin socket should I pass
>>>>>> as HAPROXY_STATS_SOCKET? I initially thought it would have to be a
>>>>>> special stats socket in the haproxy-master process (which we
>>>>>> currently don't have), but as far as I can tell from the output of
>>>>>> `lsof`, the haproxy-master process doesn't even hold any FDs
>>>>>> anymore. Will this setup currently work with your patches at all?
>>>>>> Do I need to add a stats socket to the master process? Or would
>>>>>> this require a list of stats sockets to be passed, similar to the
>>>>>> list of PIDs that gets passed to new haproxy instances, so that
>>>>>> each process can talk to the one from which it is taking over the
>>>>>> socket(s)? In case I need a stats socket for the master process,
>>>>>> what would be the directive to create it?
>>>>>>
>>>>>
>>>>> Hi Conrad,
>>>>>
>>>>> Any of those sockets will do. Each process is made to keep all the
>>>>> listening sockets open, even if the proxy is not bound to that
>>>>> specific process, precisely so that they can be transferred via the
>>>>> unix socket.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Olivier
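
If I understand the mechanism correctly, this is standard SCM_RIGHTS
file-descriptor passing over the unix socket; the kernel duplicates each
listening fd into the receiving process. Something like this minimal
sketch of the sending side (illustrative only, certainly not the actual
haproxy code):

    /* Pass one fd over a connected unix socket via SCM_RIGHTS. */
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    int send_fd(int unix_sock, int fd_to_send)
    {
        char dummy = 'F';  /* at least one byte of real data must go along */
        struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
        union {            /* properly aligned control-message buffer */
            struct cmsghdr hdr;
            char buf[CMSG_SPACE(sizeof(int))];
        } u;
        struct msghdr msg;
        struct cmsghdr *cmsg;

        memset(&msg, 0, sizeof(msg));
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = u.buf;
        msg.msg_controllen = sizeof(u.buf);

        cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;  /* kernel dups the fd for the peer */
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

        return sendmsg(unix_sock, &msg, 0) < 0 ? -1 : 0;
    }

That would also explain why any of the admin sockets works: every process
holds the full set of listening fds, so any one of them can hand them all
over.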
>>>>
>>>>
>>>> Thanks, I am finally starting to understand, but I think there still
>>>> might be a problem. I didn't see that initially, but when I use one
>>>> of the processes' existing admin sockets, it still fails, with the
>>>> following messages:
>>>>
>>>> 2017-04-13_10:27:46.95005 [WARNING] 102/102746 (14101) : We didn't
>>>> get the expected number of sockets (expecting 48 got 37)
>>>> 2017-04-13_10:27:46.95007 [ALERT] 102/102746 (14101) : Failed to get
>>>> the sockets from the old process!
>>>>
>>>> I have a suspicion about the possible reason. We have a two-tier
>>>> setup, as is often recommended here on the mailing list: 11 processes
>>>> do (almost) only SSL termination, then pass to a single process that
>>>> does most of the heavy lifting. These processes use different
>>>> sockets, of course (we use `bind-process 1` and `bind-process 2-X` in
>>>> frontends). The message above is from the first process, which is the
>>>> non-SSL one. When using an admin socket from any of the other
>>>> processes, the message changes to "(expecting 48 got 17)".
>>>>
>>>> I assume the patches are incompatible with such a setup at the
>>>> moment?
>>>>
>>>> Thanks once more :)
>>>> Conrad
>>>
>>> Hmm, that should not happen, and I can't seem to reproduce it.
>>> Can you share the haproxy config file you're using? Is the number of
>>> sockets received always the same? How are you generating your load?
>>> Is it happening on each reload?
>>>
>>> Thanks a lot for going through this, this is really appreciated :)
>>
>> I am grateful myself that you're helping me through this :)
>>
>> So I removed all the logic and backends from our config file; it's
>> still quite big, and it still works in our environment, which is
>> unfortunately quite complex. I can also still reliably reproduce the
>> error with this config. The numbers seem to be consistently the same
>> (except for the difference between the first process and the others).
>>
>> I am not sure if it makes sense for you to recreate the environment we
>> have this running in; the variables used in the config file are set to
>> the following values:
>>
>
>
> Ah! Thanks to your help, I think I got it (well, really Willy got it,
> but let's just pretend it's me).
> The attached patch should hopefully fix that, so that you can uncover
> yet another issue :).
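
For completeness, a stripped-down approximation of our two-tier layout
looks like this (ports, certificate path and the inter-tier socket are
made up for illustration, this is not our actual config):

    global
        nbproc 12

    defaults
        mode http
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    # tier 1: SSL termination, spread over processes 2-12
    frontend fe_tls
        bind :443 ssl crt /etc/haproxy/site.pem
        bind-process 2-12
        default_backend be_tier2

    backend be_tier2
        server tier2 unix@/var/run/hap-tier2.sock send-proxy-v2

    # tier 2: the heavy lifting, on process 1 only
    frontend fe_plain
        bind unix@/var/run/hap-tier2.sock accept-proxy
        bind-process 1
        default_backend be_app

    backend be_app
        server app1 127.0.0.1:8000

With bind-process splitting the listeners like this, each process is
bound to a different subset of the sockets, which would fit the differing
"got 37" / "got 17" counts above.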
Sure, here it is ;P

I now get a segfault (on reload):

*** Error in `/usr/sbin/haproxy': corrupted double-linked list: 0x0000000005b511e0 ***

Here is the backtrace, retrieved from the core file:

(gdb) bt
#0  0x00007f4c92801067 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f4c92802448 in __GI_abort () at abort.c:89
#2  0x00007f4c9283f1b4 in __libc_message (do_abort=do_abort@entry=1,
    fmt=fmt@entry=0x7f4c92934210 "*** Error in `%s': %s: 0x%s ***\n")
    at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007f4c9284498e in malloc_printerr (action=1,
    str=0x7f4c929302ec "corrupted double-linked list", ptr=<optimized out>)
    at malloc.c:4996
#4  0x00007f4c92845923 in _int_free (av=0x7f4c92b71620 <main_arena>,
    p=<optimized out>, have_lock=0) at malloc.c:3996
#5  0x0000000000485850 in tcp_find_compatible_fd (l=0xaaed20)
    at src/proto_tcp.c:812
#6  tcp_bind_listener (listener=0xaaed20, errmsg=0x7ffccc774e10 "",
    errlen=100) at src/proto_tcp.c:878
#7  0x0000000000493ce1 in start_proxies (verbose=0) at src/proxy.c:793
#8  0x00000000004091ec in main (argc=21, argv=0x7ffccc775168)
    at src/haproxy.c:1942

I can send you the entire core file if that makes any sense? Should I
send the executable along, so that the symbols match? The source revision
is c28bb55cdc554549a59f92997ebe7abf8d4612fe with all your patches applied
(latest ones where fixups were sent).

In case it's relevant, here is the output of `haproxy -vv`:

HA-Proxy version 1.8-dev1-c28bb5-5 2017/04/05
Copyright 2000-2017 Willy Tarreau <wi...@haproxy.org>

Build options :
  TARGET  = linux26
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
  OPTIONS = USE_ZLIB=1 USE_OPENSSL=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.2k  26 Jan 2017
Running on OpenSSL version : OpenSSL 1.0.2k  26 Jan 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with network namespace support.
Built with zlib version : 1.2.8
Running on zlib version : 1.2.8
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Encrypted password support via crypt(3): yes
Built with PCRE version : 8.35 2014-04-04
Running on PCRE version : 8.35 2014-04-04
PCRE library supports JIT : no (USE_PCRE_JIT not set)

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
        [SPOE] spoe
        [COMP] compression
        [TRACE] trace

Thanks again and keep it up, I feel we are almost there :)
Conrad

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
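
P.S.: For reference, this is how the backtrace above was retrieved; the
exact binary that produced the core is needed for the symbols to resolve,
which is why I'd send it along (core path is a placeholder):

    $ gdb /usr/sbin/haproxy /path/to/core
    (gdb) bt full    # backtrace including local variables
    (gdb) frame 5    # the tcp_find_compatible_fd frame from the trace
    (gdb) info locals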