Hi Nicolas,

On Wed, Nov 20, 2013 at 02:15:09PM +0100, Nicolas Grilly wrote:
> Hi Willy and list,
>
> On Mon, Nov 18, 2013 at 11:32 AM, Willy Tarreau <[email protected]> wrote:
>
> > > In both cases, the file descriptor (which is an integer) is duplicated in
> > > the old and new processes (in the process file descriptor table), but the
> > > file description is shared (in the kernel system-wide data structures).
> > > Because of this, I expect to have just one listen backlog, shared by both
> > > processes. If the file description and the listen backlog are shared, I
> > > don't understand why terminating the old process would drop connections
> > > waiting in the backlog. The new process is still supposed to accept them
> > > later?
> >
> > I totally agree with your analysis, however I clearly have memories of
> > seeing something like a per-process queue in the kernel code when I
> > tracked this down. BTW, I also remember observing the same behaviour
> > when killing one process when running with nbproc > 1.
> >
>
> I tried to reproduce the issue with a small Python script that does the
> following:
> 1/ It starts a server process that listens on a TCP socket with a backlog
> of 128 (but does not accept connections).
> 2/ It starts a client process that opens 20 connections (using one thread
> per connection).
> 3/ Then the server process forks a new server process that inherits the
> listening socket, and exits.

You should have tried to fork *before* sending the connections. That's the
scenario where I saw the issue and where the socket had multiple listen
queues.

> 4/ Then the new server process starts to accept connections waiting in the
> accept queue.
>
> Everything works perfectly and no connection is dropped.
>
> I spent some time reading the TCP implementation of the kernel mainline
> (3.12) and found no trace of a per process accept queue. But I am an
> absolute newbie with the kernel networking code and it's very likely that I
> missed something.

Not necessarily. The accept code changed in 3.9 to support round-robin
between listening sockets bound to the same ip:port so that when you
have multiple processes, they get an equal share. That's for a different
case, but it could still have had an impact on the implementation. I also
found additional changes in 3.10 in tcp_v4_conn_request().

> Regarding machines with a number of CPU/cores greater than 1, I have read
> many papers that propose to optimize the Linux kernel TCP code by
> partitioning the accept queue, with one queue per core, in order to avoid
> lock contention. But I don't know if it has been implemented in the kernel
> mainline.

I think this has been superseded by the ability to attach multiple processes
to the same ip:port using SO_REUSEPORT.

> I'm really curious to know how you got those dropped connections with a
> shared listening socket. Because if it is true, it means the graceful
> reload advertised by a lot of open source projects (nginx being one of them
> [1]) is not really graceful and can still drop some established connections
> waiting in the accept queue! It would be really interesting to know :)
>
> [1] http://nginx.org/en/docs/control.html

I will have to dig in my old sent e-mails to find this info. It took me
a while to diagnose this and it will certainly take as long to do it again!

> > Am I missing something?
> >
> > Maybe or maybe not. What kernel version did you check? I think my tests
> > were somewhere between 2.6.32 and 3.5.
> >
>
> I tested using kernel version 3.2.0 and Ubuntu 12.04.

OK thanks for the info!

Regards,
Willy
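
For anyone who wants to retry the experiment with both orderings, here is a
minimal sketch in Python, assuming Linux and loopback TCP. It is not Nicolas's
script: the function name count_accepted_after_handover and its parameters are
hypothetical, the clients connect sequentially instead of one thread per
connection, and the "old" process closes its copy of the listening socket
instead of exiting. fork_first=True forks before the connections are sent,
fork_first=False queues them before the fork.

```python
import os
import socket


def count_accepted_after_handover(n_clients=20, backlog=128, fork_first=True):
    """Return how many queued connections the forked process could accept.

    Hypothetical helper: the parent plays the "old" server, the child
    plays the "new" server that inherited the listening socket.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
    listener.listen(backlog)          # listen, but do not accept yet
    addr = listener.getsockname()

    go_r, go_w = os.pipe()            # old -> new: "start accepting now"
    res_r, res_w = os.pipe()          # new -> old: number of accepted connections
    clients = []

    def connect_clients():
        # connect() completes in the kernel even though nobody calls accept()
        for _ in range(n_clients):
            clients.append(socket.create_connection(addr, timeout=5))

    if not fork_first:
        connect_clients()             # connections queued before the fork

    pid = os.fork()
    if pid == 0:                      # "new" server process
        try:
            os.close(go_w)
            os.close(res_r)
            os.read(go_r, 1)          # wait until the old process let go
            listener.settimeout(2.0)
            accepted = 0
            try:
                while accepted < n_clients:
                    conn, _ = listener.accept()
                    conn.close()
                    accepted += 1
            except OSError:           # timeout: nothing left in the queue
                pass
            os.write(res_w, str(accepted).encode())
        finally:
            os._exit(0)               # never return into the caller's code

    # "old" server process
    os.close(go_r)
    os.close(res_w)
    if fork_first:
        connect_clients()             # connections sent after the fork
    listener.close()                  # stands in for the old process exiting
    os.write(go_w, b"x")
    os.close(go_w)
    data = os.read(res_r, 16)
    os.close(res_r)
    os.waitpid(pid, 0)
    for c in clients:
        c.close()
    return int(data.decode() or "-1") # -1 if the child reported nothing
```

Comparing the returned count with n_clients for both values of fork_first on
the kernel under test shows whether any queued connection was lost during the
handover.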

