On 1.8-git, similar results on the new process:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 93.75    0.265450          15     17805           epoll_wait
  4.85    0.013730          49       283           write
  1.40    0.003960          15       266        12 recvfrom
  0.01    0.000018           0        42        12 read
  0.00    0.000000           0        28           close
  0.00    0.000000           0        12           socket
  0.00    0.000000           0        12        12 connect
  0.00    0.000000           0        19         1 sendto
  0.00    0.000000           0        12           sendmsg
  0.00    0.000000           0         6           shutdown
  0.00    0.000000           0        35           setsockopt
  0.00    0.000000           0         7           getsockopt
  0.00    0.000000           0        12           fcntl
  0.00    0.000000           0        13           epoll_ctl
  0.00    0.000000           0         2         2 accept4
------ ----------- ----------- --------- --------- ----------------
100.00    0.283158                 18554        39 total

A cursory look through the strace output shows the same picture, with
the same three types as in the last email, including the cascade.

On 1/12/18 10:23 AM, Willy Tarreau wrote:
> On Fri, Jan 12, 2018 at 10:13:55AM -0600, Samuel Reed wrote:
>> Excellent! Please let me know if there's any other output you'd like
>> from this machine.
>>
>> Strace on that new process shows thousands of these types of syscalls,
>> which vary slightly,
>>
>> epoll_wait(3, {{EPOLLIN, {u32=206, u64=206}}}, 200, 239) = 1
> If the u32 value almost doesn't vary, that's an uncaught event. We've
> got a report for this that we've just fixed yesterday which started to
> appear after the system was upgraded with Meltdown fixes. That seems
> unrelated but reverting made the problem disappear.
>
>> and these:
>>
>> epoll_wait(3, {}, 200, 0) = 0
> This one used to appear in yesterday's report though it could be caused
> by other bugs as well. That's the one I predicted.
>
>> There is also something of a cascade (each repeats about 10-20x before
>> the next):
>>
>> epoll_wait(3, {{EPOLLIN, {u32=47, u64=47}}}, 200, 71) = 1
>> epoll_wait(3, {{EPOLLIN, {u32=93, u64=93}}, {EPOLLIN, {u32=656,
>> u64=656}}}, 200, 65) = 2
>> epoll_wait(3, {{EPOLLIN, {u32=93, u64=93}}, {EPOLLIN, {u32=656,
>> u64=656}}, {EPOLLIN, {u32=227, u64=227}}}, 200, 0) = 3
>> epoll_wait(3, {{EPOLLIN, {u32=93, u64=93}}, {EPOLLIN, {u32=656,
>> u64=656}}, {EPOLLIN, {u32=227, u64=227}}, {EPOLLIN, {u32=785,
>> u64=785}}}, 200, 65) = 4
>> epoll_wait(3, {{EPOLLIN, {u32=93, u64=93}}, {EPOLLIN, {u32=656,
>> u64=656}}, {EPOLLIN, {u32=227, u64=227}}, {EPOLLIN, {u32=785, u64=785}},
>> {EPOLLIN, {u32=639, u64=639}}}, 200, 64) = 5
>>
>> I've seen it go as deep as 15. The trace is absolutely dominated by these.
> OK that's very interesting. Just in doubt, please update to latest
> 1.8-git to see if it makes this issue disappear.
>
> Thanks,
> Willy
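
For anyone wanting to quantify the three patterns discussed above rather than eyeball them, here is a rough sketch of how the epoll_wait lines could be bucketed. It is not part of HAProxy or strace: the names classify and summarize are made up for illustration, and it assumes each epoll_wait call sits on a single, unwrapped line in the format strace printed above.

```python
import re
from collections import Counter

# Matches strace lines such as:
#   epoll_wait(3, {{EPOLLIN, {u32=206, u64=206}}}, 200, 239) = 1
# Assumption: one complete call per line (no mail-client wrapping).
EPOLL_RE = re.compile(
    r"epoll_wait\(\d+, \{(?P<events>.*)\}, \d+, (?P<timeout>-?\d+)\)"
    r"\s*=\s*(?P<ret>-?\d+)"
)
U32_RE = re.compile(r"u32=(\d+)")


def classify(line):
    """Return (kind, u32_ids) for an epoll_wait line, or None otherwise."""
    m = EPOLL_RE.search(line)
    if m is None:
        return None
    ids = tuple(int(x) for x in U32_RE.findall(m.group("events")))
    timeout = int(m.group("timeout"))
    ret = int(m.group("ret"))
    if ret == 0 and timeout == 0:
        kind = "empty_zero_timeout"   # epoll_wait(3, {}, 200, 0) = 0
    elif ret == 1:
        kind = "single_event"         # one fd reported, possibly over and over
    elif ret > 1:
        kind = "multi_event"          # the "cascade" lines
    else:
        kind = "other"
    return kind, ids


def summarize(lines):
    """Count each kind and collapse consecutive identical results into runs."""
    kinds = Counter()
    runs = []  # each entry is [(kind, u32_ids), repeat_count]
    for line in lines:
        key = classify(line)
        if key is None:
            continue
        kinds[key[0]] += 1
        if runs and runs[-1][0] == key:
            runs[-1][1] += 1
        else:
            runs.append([key, 1])
    return kinds, runs
```

Feeding it a saved trace (for example one written with strace's -o option) gives the per-pattern counts plus the run lengths. Long runs where the same u32 values keep coming back would correspond to the repeating events described in the thread.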