On 1.8-git, the new process shows similar results:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 93.75    0.265450          15     17805           epoll_wait
  4.85    0.013730          49       283           write
  1.40    0.003960          15       266        12 recvfrom
  0.01    0.000018           0        42        12 read
  0.00    0.000000           0        28           close
  0.00    0.000000           0        12           socket
  0.00    0.000000           0        12        12 connect
  0.00    0.000000           0        19         1 sendto
  0.00    0.000000           0        12           sendmsg
  0.00    0.000000           0         6           shutdown
  0.00    0.000000           0        35           setsockopt
  0.00    0.000000           0         7           getsockopt
  0.00    0.000000           0        12           fcntl
  0.00    0.000000           0        13           epoll_ctl
  0.00    0.000000           0         2         2 accept4
------ ----------- ----------- --------- --------- ----------------
100.00    0.283158                 18554        39 total

A cursory look through the strace output shows the same pattern, with the
same three types of epoll_wait calls as in the last email, including the
cascade.
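
In case it helps anyone count these rather than eyeball them, here is a
rough sketch of how the three types could be tallied from a raw strace
log. The category names (single_event, empty_zero_timeout, cascade) and
the helper functions are my own labels for illustration, not anything from
haproxy or strace. It assumes each syscall sits on one unwrapped line, and
it does not check whether the same u32 value repeats.

```python
import re
from collections import Counter

# Matches one strace line such as:
#   epoll_wait(3, {{EPOLLIN, {u32=206, u64=206}}}, 200, 239) = 1
# Captures the timeout argument and the return value (number of events).
EPOLL_RE = re.compile(
    r"epoll_wait\(\d+, .*, \d+, (?P<timeout>-?\d+)\)\s*=\s*(?P<ret>-?\d+)"
)


def classify(line):
    """Return a hypothetical category label for an epoll_wait line, or None."""
    m = EPOLL_RE.search(line)
    if m is None:
        return None  # not an epoll_wait call
    ret = int(m.group("ret"))
    timeout = int(m.group("timeout"))
    if ret == 0 and timeout == 0:
        return "empty_zero_timeout"  # returned immediately with no events
    if ret == 1:
        return "single_event"        # one fd reported ready
    if ret > 1:
        return "cascade"             # several fds reported ready at once
    return "other"                   # e.g. errors or a timed-out wait


def summarize(lines):
    """Count how many epoll_wait lines fall into each category."""
    counts = Counter()
    for line in lines:
        kind = classify(line)
        if kind is not None:
            counts[kind] += 1
    return counts
```

Feeding it the lines of a saved trace, for example
summarize(open("trace.txt")) where trace.txt is a placeholder file name,
gives a per-type count that can be compared between runs.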


On 1/12/18 10:23 AM, Willy Tarreau wrote:
> On Fri, Jan 12, 2018 at 10:13:55AM -0600, Samuel Reed wrote:
>> Excellent! Please let me know if there's any other output you'd like
>> from this machine.
>>
>> Strace on that new process shows thousands of these types of syscalls,
>> which vary slightly,
>>
>> epoll_wait(3, {{EPOLLIN, {u32=206, u64=206}}}, 200, 239) = 1
> If the u32 value almost doesn't vary, that's an uncaught event. We've
> got a report for this that we've just fixed yesterday which started to
> appear after the system was upgraded with Meltdown fixes. That seems
> unrelated but reverting made the problem disappear.
>
>> and these:
>>
>> epoll_wait(3, {}, 200, 0)               = 0
> This one used to appear in yesterday's report though it could be caused
> by other bugs as well. That's the one I predicted.
>
>> There is also something of a cascade (each repeats about 10-20x before
>> the next):
>>
>> epoll_wait(3, {{EPOLLIN, {u32=47, u64=47}}}, 200, 71) = 1
>> epoll_wait(3, {{EPOLLIN, {u32=93, u64=93}}, {EPOLLIN, {u32=656,
>> u64=656}}}, 200, 65) = 2
>> epoll_wait(3, {{EPOLLIN, {u32=93, u64=93}}, {EPOLLIN, {u32=656,
>> u64=656}}, {EPOLLIN, {u32=227, u64=227}}}, 200, 0) = 3
>> epoll_wait(3, {{EPOLLIN, {u32=93, u64=93}}, {EPOLLIN, {u32=656,
>> u64=656}}, {EPOLLIN, {u32=227, u64=227}}, {EPOLLIN, {u32=785,
>> u64=785}}}, 200, 65) = 4
>> epoll_wait(3, {{EPOLLIN, {u32=93, u64=93}}, {EPOLLIN, {u32=656,
>> u64=656}}, {EPOLLIN, {u32=227, u64=227}}, {EPOLLIN, {u32=785, u64=785}},
>> {EPOLLIN, {u32=639, u64=639}}}, 200, 64) = 5
>>
>> I've seen it go as deep as 15. The trace is absolutely dominated by these.
> OK that's very interesting. Just in doubt, please update to latest
> 1.8-git to see if it makes this issue disappear.
>
> Thanks,
> Willy

