Hi,

On Sat, Feb 09, 2013 at 10:44:04AM +0100, Marc-Antoine Perennou wrote:
> I just made a simple test, running a webserver serving a big file locally,
> using haproxy,
> my wrapper and systemd service. I started a download and during this
> download,
> reloaded haproxy. I using nbproc = 2.
> What happened ?
> When I started haproxy, I ended up with a wrapper launching a child
> launching itself
> two children, we'll call them ha1, ha11 and ha12. Then when I reloaded, the
> wrapper
> launched a new child which launched two new children, ha2, ha21 and ha22.
> ha11 and thus ha1 were still there until the download had finished, ha12
> got to zombie state.

If ha1 was still there, I don't understand how it did not prevent the new
process from binding to the port. Could you please check whether ha1 is
still bound to the port once the new process is running ?

> ha2, ha21 and ha22 successfully have shown up and take all new connections.
> Once the download has finished, ha11 exited, ha12 too (waitpid making it
> leave the zombie state)
> and then ha11, leaving us with only ha2, ha21 and ha22.
> I think this is the expected behaviour, so there don't seem to be any bug
> here.

Yes it's the expected behaviour, but I don't understand *why* it works, so
it is very possible that we're having a bug somewhere else making this work
as a side effect.

> For the EINTR stuff, I'm not sure at all, not really familiar with it, so I
> will give it a look

Typically I would replace :

     waitpid(pid, NULL, 0);

with

     while (waitpid(pid, NULL, 0) == -1 && errno == EINTR);

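For instance, when your process receives SIGTTOU/SIGTTIN upon a failed
attempt of a new process to start, each signal can interrupt waitpid(),
which then returns -1 with errno set to EINTR. Without the retry loop, the
old one very likely skips a few children (one per signal).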
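
If you prefer to keep the retry in one place, the loop above can be wrapped
in a small helper. This is only a sketch: wait_child() is a name I made up
for illustration, it is not taken from your wrapper, and it assumes you want
the exit status back as well.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper (not from the wrapper under discussion):
 * wait for one child, retrying whenever a signal interrupts the call. */
static pid_t wait_child(pid_t pid, int *status)
{
	pid_t ret;

	do {
		ret = waitpid(pid, status, 0);
	} while (ret == -1 && errno == EINTR);

	/* pid on success, -1 with errno set for any error other than EINTR */
	return ret;
}
```

Calling wait_child(pid, NULL) behaves like the one-line loop; passing a
status pointer additionally lets the caller see how the child terminated.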

If it can help you, here's how to test for the worst case :

  - have a running process with a simple configuration bound to one port :

    listen foo
         bind :8000

  - then have a second configuration which cannot start because it binds
    :8001 twice, and which also binds the port used by the first one :

    listen foo
         bind :8000

    listen fail1
         bind :8001

    listen fail2
         bind :8001

  - when the first one is running, try to start the second one. It will
    fail to bind to :8000, will send a SIGTTOU to process 1 and try again.
    Then it will fail to bind :8001 without knowing it's not because of #1,
    so it will wait a bit, believing it's #1 which has not yet released it,
    and then it will abort, sending SIGTTIN to #1 to inform it that it gives
    up and that #1 must continue its job as if nothing happened.

  - process 1 should then just remain unaffected. And restarting #1 with
    the fail2 listener commented out should work as expected.
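
To illustrate why the second bind on :8001 can never succeed, here is a
minimal sketch that binds a port and then tries to bind it again from the
same process. It is not haproxy's code: bind_tcp() and double_bind_errno()
are hypothetical names, the socket uses no special options, and it picks a
free loopback port instead of :8001 so that it can run anywhere. The point
is that the second bind() only reports EADDRINUSE, which says nothing about
who holds the port.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Bind a fresh TCP socket to 127.0.0.1:port (host byte order, 0 lets the
 * kernel choose). Returns the fd, or -1 with errno set by the failed call. */
static int bind_tcp(unsigned short port)
{
	struct sockaddr_in addr;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = htons(port);
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		int err = errno;

		close(fd);
		errno = err; /* keep the error from bind(), not from close() */
		return -1;
	}
	return fd;
}

/* Bind and listen on one socket, then try the same port a second time.
 * Returns the errno of the second bind, 0 if it unexpectedly succeeded,
 * or -1 if the setup itself failed. */
static int double_bind_errno(void)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int first, second, err = 0;

	first = bind_tcp(0);
	if (first < 0)
		return -1;
	if (listen(first, 1) < 0 ||
	    getsockname(first, (struct sockaddr *)&addr, &len) < 0) {
		close(first);
		return -1;
	}

	second = bind_tcp(ntohs(addr.sin_port));
	if (second < 0)
		err = errno;
	else
		close(second);
	close(first);
	return err;
}
```

On a typical system double_bind_errno() returns EADDRINUSE, the same error
a new process would get if an old process still held the port.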

Best regards,
Willy

