Excellent! Please let me know if there's any other output you'd like from this machine.
Strace on that new process shows thousands of these types of syscalls, which vary slightly, epoll_wait(3, {{EPOLLIN, {u32=206, u64=206}}}, 200, 239) = 1 and these: epoll_wait(3, {}, 200, 0) = 0 There is also something of a cascade (each repeats about 10-20x before the next): epoll_wait(3, {{EPOLLIN, {u32=47, u64=47}}}, 200, 71) = 1 epoll_wait(3, {{EPOLLIN, {u32=93, u64=93}}, {EPOLLIN, {u32=656, u64=656}}}, 200, 65) = 2 epoll_wait(3, {{EPOLLIN, {u32=93, u64=93}}, {EPOLLIN, {u32=656, u64=656}}, {EPOLLIN, {u32=227, u64=227}}}, 200, 0) = 3 epoll_wait(3, {{EPOLLIN, {u32=93, u64=93}}, {EPOLLIN, {u32=656, u64=656}}, {EPOLLIN, {u32=227, u64=227}}, {EPOLLIN, {u32=785, u64=785}}}, 200, 65) = 4 epoll_wait(3, {{EPOLLIN, {u32=93, u64=93}}, {EPOLLIN, {u32=656, u64=656}}, {EPOLLIN, {u32=227, u64=227}}, {EPOLLIN, {u32=785, u64=785}}, {EPOLLIN, {u32=639, u64=639}}}, 200, 64) = 5 I've seen it go as deep as 15. The trace is absolutely dominated by these. On 1/12/18 10:01 AM, Willy Tarreau wrote: > On Fri, Jan 12, 2018 at 09:50:58AM -0600, Samuel Reed wrote: >> To accelerate the process, I've increased the number of threads from 4 >> to 8 on a 16-core machine. Ran strace for about 5s on each. >> >> Single process (8 threads): >> >> $ strace -cp 16807 >> % time seconds usecs/call calls errors syscall >> ------ ----------- ----------- --------- --------- ---------------- >> 71.36 0.330172 21 15479 epoll_wait >> 13.59 0.062861 4 14477 1 write >> 10.58 0.048955 4 11518 10 recvfrom >> 4.44 0.020544 38 537 244 read > This one is OK and shows that quite some time is in fact spent waiting > for I/O events. > >> Two processes (2x8 threads): >> >> ## Draining process >> >> % time seconds usecs/call calls errors syscall >> ------ ----------- ----------- --------- --------- ---------------- >> 48.65 0.544758 30 18359 epoll_wait >> 28.69 0.321283 14 23540 write >> 22.60 0.253049 19 13338 recvfrom >> 0.04 0.000474 1 786 374 read >> 0.03 0.000287 2 187 sendto > This one as well. > >> ## "New" process >> >> % time seconds usecs/call calls errors syscall >> ------ ----------- ----------- --------- --------- ---------------- >> 93.87 1.588239 11 149253 epoll_wait >> 3.84 0.064985 30 2140 31 recvfrom >> 1.77 0.029905 13 2388 write >> 0.34 0.005737 10 589 130 read >> 0.12 0.002018 38 53 close >> 0.06 0.000960 8 114 2 sendto > This one is very interesting! So the epoll_wait to other syscalls ratio > went from roughly 1/2 to 30/1. I'm pretty sure that a regular strace would > show you a large number of epoll_wait(0)=0 indicating we're missing some > events. I seem to remember that sometimes there are situations where a > thread may be notified by epoll() about an fd it cannot take care of but > I don't remember in which case, I'll have to discuss with Christopher. > > But at least now we have an explanation and it's not directly related to > thread contention but more likely with the mapping of FDs to threads, so > we may have opportunities to improve the situation here. > > Thanks! > Willy