Hi Nicolas,

On Fri, Nov 15, 2013 at 01:28:39PM +0100, Nicolas Grilly wrote:
> I read the following blog post, published one year ago, about haproxy
> dropping connections waiting in the kernel listen queue during a restart:
(...)
> With the current restart mechanism, where the old and the new process are
> independent and do not share socket file descriptors, the connections
> waiting in the listen queue of the old process are dropped when the process
> exits. This is the expected behavior.

Indeed.

> But I don't understand the following paragraph:
> 
> *"At one point I thought that the principle of passing the listening FD
> between one process and another one would solve the issue, but
> unfortunately it does not for the same reason as above : at one point the
> old process stops listening so the requests pending in its queue get
> dropped."*
> 
> Willy suggests that even when passing the FD from the old process to the
> new one, connections waiting in the listen queue are still dropped. I don't
> understand why.
> 
> Under Linux, I can see two ways to pass and share the listening socket file
> descriptors:
> 
> - The usual one is fork/exec. For example, the haproxy process could use
> execve to upgrade to a new binary and inherit listening socket file
> descriptors. uWSGI does this:
> http://uwsgi-docs.readthedocs.org/en/latest/Management.html#reloading-the-server
> 
> - A less usual way is to launch the new process and pass the socket file
> descriptors through a Unix domain socket using SCM_RIGHTS. More clumsy in
> my opinion :) Example here:
> http://www.mail-archive.com/[email protected]/msg00002.html

I agree these are the two methods.
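
As an illustration of the second method, here is a minimal Python sketch
(it assumes Python 3.9 or later for socket.send_fds() and
socket.recv_fds()). The name pass_listener and the socketpair used inside
a single process are illustrative assumptions only: in a real restart the
two ends of the Unix socket would belong to the old and the new process,
and this is not meant to describe what haproxy does.

```python
import socket


def pass_listener():
    # Listening socket owned by the "old" side; port 0 lets the kernel pick one.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    addr = listener.getsockname()

    # Unix socket pair standing in for the channel between old and new process.
    old_end, new_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    # send_fds() attaches the descriptor as SCM_RIGHTS ancillary data.
    socket.send_fds(old_end, [b"L"], [listener.fileno()])

    # recv_fds() returns (data, fds, msg_flags, address).
    _data, fds, _flags, _addr = socket.recv_fds(new_end, 1, 1)
    received = socket.socket(fileno=fds[0])

    # The "old" side closes its own descriptor; the received descriptor
    # still refers to the same listening socket in the kernel.
    listener.close()

    # A client connects and is accepted through the received descriptor.
    client = socket.create_connection(addr)
    conn, _peer = received.accept()
    client.sendall(b"ping")
    data = conn.recv(4)

    for s in (client, conn, received, old_end, new_end):
        s.close()
    return data
```

Calling pass_listener() returns the bytes read on the connection that was
accepted through the received descriptor.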

> In both cases, the file descriptor (which is an integer) is duplicated in
> the old and new processes (in the process file descriptor table), but the
> file description is shared (in the kernel system-wide data structures).
> Because of this, I expect to have just one listen backlog, shared by both
> processes. If the file description and the listen backlog is shared, I
> don't understand why terminating the old process would drop connections
> waiting in the backlog. The new process is still supposed to accept them
> later?

I totally agree with your analysis; however, I clearly remember seeing
something like a per-process queue in the kernel code when I tracked
this down. BTW, I also remember observing the same behaviour when
killing one process while running with nbproc > 1.
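
One way to check this on a given kernel would be a small test like the
following Python sketch. It only approximates the restart case, under
stated assumptions: the descriptor is shared through fork() rather than
passed between unrelated processes, the child plays the role of the
exiting process, and the name backlog_survives_exit is made up for the
example.

```python
import os
import socket


def backlog_survives_exit():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    addr = listener.getsockname()

    # Connect before anyone calls accept(): the connection waits in the
    # listen queue of the socket.
    client = socket.create_connection(addr)
    client.sendall(b"ping")

    pid = os.fork()
    if pid == 0:
        # Child: inherits the listening descriptor and exits without
        # ever accepting anything.
        os._exit(0)
    os.waitpid(pid, 0)

    # Parent: try to pick up the connection queued before the child exited.
    listener.settimeout(2.0)
    try:
        conn, _peer = listener.accept()
        data = conn.recv(4)
        conn.close()
    except OSError:
        # Covers a timeout as well as a reset of the queued connection.
        data = b""

    client.close()
    listener.close()
    return data == b"ping"
```

The function returns True if the connection queued before the child
exited could still be accepted, and False otherwise, so it can be run on
each kernel of interest.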

> Am I missing something?

Maybe or maybe not. Which kernel version did you check? I think my tests
were run on kernels somewhere between 2.6.32 and 3.5.

Regards,
Willy

