Hi Nicolas,

On Fri, Nov 15, 2013 at 01:28:39PM +0100, Nicolas Grilly wrote:
> I read the following blog post, published one year ago, about haproxy
> dropping connections waiting in the kernel listen queue during a restart:
(...)
> With the current restart mechanism, where the old and the new process are
> independent and do not share socket file descriptors, the connections
> waiting in the listen queue of the old process are dropped when the process
> exits. This is the expected behavior.

Indeed.

> But I don't understand the following paragraph:
>
> *"At one point I thought that the principle of passing the listening FD
> between one process and another one would solve the issue, but
> unfortunately it does not for the same reason as above : at one point the
> old process stops listening so the requests pending in its queue get
> dropped."*
>
> Willy suggests that even when passing the FD from the old process to the
> new one, connections waiting in the listen queue are still dropped. I don't
> understand why.
>
> Under Linux, I can see two ways to pass and share the listening socket file
> descriptors:
>
> - The usual one is fork/exec. For example, the haproxy process could use
> execve to upgrade to a new binary and inherit listening socket file
> descriptors. uWSGI does this:
> http://uwsgi-docs.readthedocs.org/en/latest/Management.html#reloading-the-server
> - A less usual way is to launch the new process and pass the socket file
> descriptors through a Unix domain socket using SCM_RIGHTS. More clumsy in
> my opinion :) Example here:
> http://www.mail-archive.com/[email protected]/msg00002.html

I agree these are the two methods.

> In both cases, the file descriptor (which is an integer) is duplicated in
> the old and new processes (in the process file descriptor table), but the
> file description is shared (in the kernel system-wide data structures).
> Because of this, I expect to have just one listen backlog, shared by both
> processes. If the file description and the listen backlog is shared, I
> don't understand why terminating the old process would drop connections
> waiting in the backlog. The new process is still supposed to accept them
> later?

I totally agree with your analysis. However, I clearly remember seeing
something like a per-process queue in the kernel code when I tracked this
down. BTW, I also remember observing the same behaviour when killing one
process while running with nbproc > 1.

> Am I missing something?

Maybe or maybe not. What kernel version did you check? I think my tests
were somewhere between 2.6.32 and 3.5.

Regards,
Willy
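
PS: one way to settle this on a given kernel is to test it directly. Below
is a minimal sketch, not haproxy code, and it makes several assumptions: it
uses plain fork() instead of fork/exec or SCM_RIGHTS (the listening fd is
shared the same way), the child stands in for the "old" process, the parent
for the "new" one, and the function name count_accepted_after_exit is made
up for illustration. It queues a few connections while both processes hold
the listening socket, lets the "old" process exit without accepting, then
counts how many connections the "new" process can still accept.

```python
import os
import socket


def count_accepted_after_exit(n_clients=3):
    # The listening socket is created before fork(), so both processes
    # end up with descriptors referring to the same open file description.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick one
    listener.listen(16)
    addr = listener.getsockname()

    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child plays the "old" process: it holds the fd, never calls
        # accept(), and exits once the parent says clients are queued.
        os.close(ready_w)
        os.read(ready_r, 1)
        listener.close()
        os._exit(0)

    os.close(ready_r)
    # Queue connections while both processes hold the listening fd.
    clients = [socket.create_connection(addr) for _ in range(n_clients)]
    os.write(ready_w, b"x")
    os.close(ready_w)
    os.waitpid(pid, 0)  # the "old" process is now gone

    # Parent plays the "new" process: accept whatever is still queued.
    listener.settimeout(1.0)
    accepted = 0
    try:
        for _ in range(n_clients):
            conn, _peer = listener.accept()
            conn.close()
            accepted += 1
    except socket.timeout:
        pass
    finally:
        for c in clients:
            c.close()
        listener.close()
    return accepted


if __name__ == "__main__":
    print("accepted after old process exit:", count_accepted_after_exit())
```

If the backlog really belongs to the shared file description, the count
should equal the number of queued clients. A lower count on some kernel
would point to the per-process behaviour I remember. Note that this only
covers the fork case with both processes alive at connect time; it says
nothing about independently bound sockets, which is what the current
restart mechanism uses.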

