On Wed, Apr 25, 2018 at 04:24:42PM +0300, Slawa Olhovchenkov wrote:
> > > TCP load raises CPU usage on all cores (0-15); I expected it to rise
> > > only on cores 8-15. What am I missing?
> > 
> > It's unrelated to the frontend's bindings but to the way the socket's fd
> > is registered with pollers, which is why you still have the problem here.
> > We're still surprised it doesn't happen with other pollers, which is why
> > we need to dig deeper as it could cover another issue.
> 
> Are pollers distinct from the frontend?
> Can I bind pollers to CPUs?

Each thread has its own poller. Since you map threads to CPUs you indeed
have one poller per CPU.

> > We'll keep you updated.
> 
> Thanks so much

Please try this patch. It works for me. I finally managed to reproduce
the issue even with epoll(). It is just much harder to see there, but
after trying multiple times you eventually see it as well. Under poll(),
however, the issue occasionally happens and disappears by itself.

Olivier found that we have a race condition in the way we update FDs that
are polled by multiple pollers. This is a side effect of the per-thread
poller change that we had to do recently to fix other issues. The good
news is that in 1.8, only listeners should be present on multiple threads
so it is not dramatic. The DNS socket has no reason to be present anywhere
but on the thread that creates the outgoing connection, and that is what
this patch does: it registers the FD with tid_bit instead of the
all-threads mask. However, we now have the painful job of addressing the
listener case, as I think there are definitely races there that are quite
hard to trigger, but we don't want to leave them.
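
To illustrate the difference between the two masks passed to fd_insert()
in the patch, here is a minimal standalone sketch. It is not HAProxy code:
thread_bit() and pollers_watching() are hypothetical helpers, and the only
assumption is that each bit of the mask stands for one thread, so
(unsigned long)-1 selects every thread while a single bit selects only
the calling one.

```c
#include <assert.h>
#include <limits.h>

/* Number of threads one unsigned long mask can describe. */
#define MAX_THREADS ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical stand-in for tid_bit: a mask with only thread <tid>'s bit set. */
static unsigned long thread_bit(int tid)
{
    assert(tid >= 0 && tid < MAX_THREADS);
    return 1UL << tid;
}

/* Count how many of the first <nbthread> threads have their bit set in
 * <mask>, i.e. how many per-thread pollers would watch the FD. */
static int pollers_watching(unsigned long mask, int nbthread)
{
    int count = 0;

    assert(nbthread >= 0 && nbthread <= MAX_THREADS);
    for (int t = 0; t < nbthread; t++)
        if (mask & (1UL << t))
            count++;
    return count;
}
```

With 16 threads, pollers_watching((unsigned long)-1, 16) counts all 16
pollers, while pollers_watching(thread_bit(9), 16) counts exactly one.
This matches the symptom of load showing up on every core before the patch.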

Willy

diff --git a/src/dns.c b/src/dns.c
index b1fac10..d3b2c10 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -195,7 +195,7 @@ static int dns_connect_namesaver(struct dns_nameserver *ns)
        dgram->t.sock.fd = fd;
        fdtab[fd].owner  = dgram;
        fdtab[fd].iocb   = dgram_fd_handler;
-       fd_insert(fd, (unsigned long)-1);
+       fd_insert(fd, tid_bit);
        fd_want_recv(fd);
        return 0;
 }
