On 24/10/2016 09:13 PM, Willy Tarreau wrote:
> Hi again,
> 
> On Mon, Oct 24, 2016 at 07:41:06PM +0200, Willy Tarreau wrote:
>> I don't know if this is something you're interested in experimenting
>> with. This is achieved using fcntl(F_SETLKW). It should be done in the
>> wrapper as well.
> 
> Finally I did it and it doesn't help at all. The signal-based asynchronous
> reload is fundamentally flawed. It's amazing to see how systemd managed to
> break something simple and robust for the sake of reliability, by introducing
> asynchronous signal delivery...
> 
> The problem is not even with overlapping writes (well, it very likely
> happens) but it is related to the fact that you never know whom you're
> sending your signals to, and that the children may not even be started
> yet, or may not have had the time to process the whole config file, etc.
> 
> So now I'm wondering what to do with all this mess. Declaring systemd
> misdesigned and born with some serious trauma will not help us progress
> on this, so we need to work around this pile of crap which tries to prevent
> us from dealing with a simple service.
> 
> Either we find a way to completely redesign the wrapper, even possibly the
> relation between the wrapper and the sub-processes, or we'll simply have
> to get rid of the reload action under systemd and reroute it to a restart.
> 
> I've thought about something which could possibly work though I'm far from
> being sure for now.
> 
> Let's say that the wrapper tries to take an exclusive lock on the pidfile
> upon receipt of SIGUSR2. It then keeps the file open and passes this FD to
> all the haproxy sub-processes. Ideally the FD num is passed as an argument
> to the child.
> 
> Once it has done fork()+exec(), it can simply close its fd. The exclusive lock is still
> maintained by the children so it's not lost. The benefit is that at this
> point, until the sub-processes have closed the pid file, there's no way for
> the wrapper to pick the same lock again. Thus it can *know* the processes
> have not finished booting. This will cause further SIGUSR2 processing to
> wait for the children processes to either start or die. Sort of a way to
> "pass" the lock to the sub-processes.
> 
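
A rough sketch of the lock hand-off described above, in Python rather than the wrapper's C. All names (try_lock, lock_handoff_demo, the temporary lock file, the "gate" pipe) are hypothetical and exist only for illustration. One caveat: POSIX fcntl() record locks belong to the process and are not inherited across fork(), so this sketch assumes flock()-style locks, which stay attached to the open file description shared by parent and child. It also uses a non-blocking attempt to show that the lock is busy; a real wrapper would presumably block instead.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Open path and try to take an exclusive flock() without blocking.

    Returns the locked fd, or None if another open file description holds it.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def lock_handoff_demo():
    tmp_fd, path = tempfile.mkstemp()   # stand-in for the pidfile / lock file
    os.close(tmp_fd)
    lock_fd = try_lock(path)            # wrapper takes the lock (e.g. on SIGUSR2)
    gate_r, gate_w = os.pipe()          # scaffolding only: keeps the child "booting"
    pid = os.fork()
    if pid == 0:
        # Child: stands in for a haproxy sub-process that inherited lock_fd.
        os.close(gate_w)
        os.read(gate_r, 1)              # pretend to parse the config until released
        os.close(lock_fd)               # boot finished: last reference, lock dropped
        os._exit(0)
    os.close(gate_r)
    os.close(lock_fd)                   # wrapper closes its copy; child keeps the lock
    busy = try_lock(path) is None       # a second reload would have to wait here
    os.close(gate_w)                    # let the child finish booting
    os.waitpid(pid, 0)
    fd = try_lock(path)                 # child is gone, so the lock is free again
    free = fd is not None
    if fd is not None:
        os.close(fd)
    os.unlink(path)
    return busy, free
```

Calling lock_handoff_demo() returns a pair of booleans: whether the lock was still held while the child was "booting", and whether it could be taken again once the child had closed its descriptor.
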
> Here we don't even care if signals are sent in storm because only one of
> them will be used and will have to wait for the previous one to be dealt
> with.
> 
> The model is not perfect and ideally a lock file would be better than using
> the pidfile since the pidfile currently is opened late in haproxy and requires
> an unlinking in case of successful startup. But I suspect that using extra
> files will just make things worse. And I don't know if it's possible to flock
> something else (eg: a pipe).
> 
> BTW, that just makes me realize that we also have another possibility for this
> precisely using a pipe (which is more portable than mandatory locks). Let's
> see if that would work. The wrapper creates a pipe then forks. The child
> closes the read side, the parent the write side. Then the parent performs a
> read() on this fd and waits until it returns zero. The child execve() and
> calls the haproxy sub-processes. The FD is closed after the pidfile is updated
> (and in children). After the last close, the wrapper receives a zero on this
> pipe. If haproxy dies, the pipe is closed as well. We could even (ab)use it
> to let the wrapper know whether the process properly started or not, or pass
> the pids there (though that just needlessly complicates operations).
> 
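
The pipe variant above can be sketched roughly as follows, again in Python rather than C. The function name wait_for_startup and the startup callback are assumptions made for illustration; the callback stands in for everything haproxy does up to the point where the pidfile is updated. The sketch forks without exec'ing: with a real execve(), the write end would need to be made inheritable (os.set_inheritable in Python) and its number passed to the child.

```python
import os


def wait_for_startup(startup):
    """Fork a child, run startup(write_fd) in it, and block until the child
    side of the pipe is fully closed. Returns whatever bytes the child wrote."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: closes the read side, then "boots".
        status = 1
        try:
            os.close(read_fd)
            startup(write_fd)           # may optionally report status or pids
            status = 0
        finally:
            os._exit(status)            # exiting closes write_fd, even on failure
    # Parent (the wrapper): closes the write side and waits for end-of-file.
    os.close(write_fd)
    received = b""
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:                   # zero-length read: every writer has closed
            break
        received += chunk
    os.close(read_fd)
    os.waitpid(pid, 0)
    return received
```

For example, wait_for_startup(lambda fd: os.write(fd, b"ok")) returns b"ok" once the child has exited, while a child that writes nothing, or dies early, simply yields an empty result when the pipe closes.
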
> Any opinion on this ?
> 
> Willy
> 

IMHO: Ask the users not to perform reloads every 2 milliseconds. It is insane.
You may spend X hours on this, which will make the code a bit more complex and
may cause breakage somewhere else.

I am pretty sure 90% of the cases which require such frequent reloads are the
ones which try to integrate HAProxy with docker stuff, where servers in the
pools are treated as ephemeral nodes that appear and disappear very often and
at high volume.

@Pieper, what is your use case for so many reloads?


My 0.02 cents,
Pavlos

