Hi Nicolas,

On Wed, Nov 20, 2013 at 02:15:09PM +0100, Nicolas Grilly wrote:
> Hi Willy and list,
> 
> On Mon, Nov 18, 2013 at 11:32 AM, Willy Tarreau <[email protected]> wrote:
> 
> > > In both cases, the file descriptor (which is an integer) is duplicated in
> > > the old and new processes (in the process file descriptor table), but the
> > > file description is shared (in the kernel system-wide data structures).
> > > Because of this, I expect to have just one listen backlog, shared by both
> > > processes. If the file description and the listen backlog are shared, I
> > > don't understand why terminating the old process would drop connections
> > > waiting in the backlog. The new process is still supposed to accept them
> > > later?
> >
> > I totally agree with your analysis; however, I clearly have memories of
> > seeing something like a per-process queue in the kernel code when I
> > tracked this down.  BTW, I also remember observing the same behaviour
> > when killing one process when running with nbproc > 1.
> >
> 
> I tried to reproduce the issue with a small Python script that does the
> following:
> 1/ It starts a server process that listens on a TCP socket with a backlog
> of 128 (but does not accept connections).
> 2/ It starts a client process that opens 20 connections (using one thread
> per connection).
> 3/ Then the server process forks a new server process that inherits the
> listening socket, and exits.

You should have tried to fork *before* sending the connections. That's
the scenario where I saw the issue and where the socket had multiple
listen queues.

> 4/ Then the new server process starts to accept connections waiting in the
> accept queue.
> 
> Everything works perfectly and no connection is dropped.
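
For reference, here is a minimal sketch of both orderings (fork before or
after the clients connect), assuming Linux and Python 3. The function name
and its parameters are invented for this illustration and are not taken
from the original script. To keep it short, the forked child plays the
process that exits without accepting and the parent plays the survivor;
that is the reverse of the roles described above, which should not matter
as long as both hold the same file description.

```python
import os
import socket


def count_accepted(num_clients=20, backlog=128, fork_first=True):
    """Queue connections on a shared listener, let one holder exit,
    and return how many connections the surviving process accepts."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
    listener.listen(backlog)
    addr = listener.getsockname()

    gate_r, gate_w = os.pipe()           # the parent uses this to release the child

    def fork_idle_holder():
        pid = os.fork()
        if pid == 0:
            # Child: holds the inherited listener but never calls accept().
            os.read(gate_r, 1)           # block until the parent says "exit"
            os._exit(0)
        return pid

    pid = fork_idle_holder() if fork_first else None

    # The kernel completes the handshake, so connect() succeeds even
    # though nobody has called accept() yet.
    clients = [socket.create_connection(addr, timeout=5)
               for _ in range(num_clients)]

    if pid is None:
        pid = fork_idle_holder()         # fork only after the queue is filled

    os.write(gate_w, b"x")               # let the idle holder exit...
    os.waitpid(pid, 0)                   # ...and wait until it is gone

    accepted = 0
    listener.settimeout(1.0)
    try:
        while accepted < num_clients:
            conn, _ = listener.accept()
            conn.close()
            accepted += 1
    except socket.timeout:
        pass                             # queue ran dry: some connections were lost

    for c in clients:
        c.close()
    listener.close()
    os.close(gate_r)
    os.close(gate_w)
    return accepted
```

Comparing count_accepted(fork_first=True) with count_accepted(fork_first=False)
on a given kernel shows whether the ordering makes any difference there.
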
> 
> I spent some time reading the TCP implementation of the kernel mainline
> (3.12) and found no trace of a per process accept queue. But I am an
> absolute newbie with the kernel networking code and it's very likely that I
> missed something.

Not necessarily. The accept code changed in 3.9 to support round-robin
between listening sockets bound to the same ip:port so that when you have
multiple processes, they get an equal share. That's for a different
case, but it could still have had an impact on the implementation. I also
found additional changes in 3.10 in tcp_v4_conn_request().

> Regarding machines with a number of CPU/cores greater than 1, I have read
> many papers that propose to optimize the Linux kernel TCP code by
> partitioning the accept queue, with one queue per core, in order to avoid
> lock contention. But I don't know if it has been implemented in the kernel
> mainline.

I think this has been superseded by the ability to bind multiple listening
sockets, one per process, to the same ip:port using SO_REUSEPORT.
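
As a rough illustration of that model, assuming Linux 3.9 or later and
Python 3 (the helper name is hypothetical): each process opens its own
listening socket on the same ip:port, and each of those sockets has its
own accept queue, unlike a single inherited socket.

```python
import socket


def reuseport_listener(port, host="127.0.0.1", backlog=128):
    """Open a listening socket that other sockets may share the port with."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind(), and on every socket sharing the port.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind((host, port))
    s.listen(backlog)
    return s


# Two independent listeners on one port; normally each would live in
# its own process.
first = reuseport_listener(0)            # port 0: let the kernel pick a free port
port = first.getsockname()[1]
second = reuseport_listener(port)        # succeeds because both set SO_REUSEPORT
```
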

> I'm really curious to know how you got those dropped connections with a
> shared listening socket. Because if it is true, it means the graceful
> reload advertised by a lot of open source projects (nginx being one of them
> [1]) is not really graceful and can still drop some established connections
> waiting in the accept queue! It would be really interesting to know :)
> 
> [1] http://nginx.org/en/docs/control.html

I will have to dig in my old sent e-mails to find this info. It took me
a while to diagnose this and will certainly take as much time to do it again!

> > Am I missing something?
> >
> > Maybe or maybe not. What kernel version did you check? I think my tests
> > were somewhere between 2.6.32 and 3.5.
> >
> 
> I tested using kernel version 3.2.0 and Ubuntu 12.04.

OK thanks for the info!

Regards,
Willy

