On Mon, Jan 15, 2018 at 08:14:40AM -0600, Samuel Reed wrote:
> Thank you for the patch and your quick attention to this issue. Results
> after a few reloads, 8 threads on 16 core machine, both draining and new
> process have patches.
> 
> New process:
> 
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  96.24    0.432348          16     26917           epoll_wait
>   3.60    0.016158          16      1023         7 recvfrom
>   0.12    0.000524          31        17           sendto
>   0.04    0.000190           0      1126         3 write
>   0.01    0.000036           1        70        23 read
>   0.00    0.000000           0        21           close
>   0.00    0.000000           0         7           socket
>   0.00    0.000000           0         7         7 connect
>   0.00    0.000000           0        13           sendmsg
>   0.00    0.000000           0        17           setsockopt
>   0.00    0.000000           0         7           fcntl
>   0.00    0.000000           0         9           epoll_ctl
>   0.00    0.000000           0         5         5 accept4
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.449256                 29239        45 total
> 
> 
> Draining process:
> 
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  78.26    0.379045          16     23424           epoll_wait
>  13.94    0.067539           7      9877         4 recvfrom
>   7.80    0.037764           4     10471         6 write
>   0.00    0.000007           0        29        10 read
>   0.00    0.000000           0         9           close
>   0.00    0.000000           0         5           sendto
>   0.00    0.000000           0         3           shutdown
>   0.00    0.000000           0        20           epoll_ctl
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.484355                 43838        20 total
> 
> 
> I ran this a few times while both processes were live and the numbers
> weren't significantly different. The new process still has a remarkably
> high proportion of epoll_wait.

Thank you for the test, Samuel. It's sad, but the result may point to
something completely different. Christopher, I'm at least willing to
integrate your fix, to rule out this corner case in the future.

Very few things can differ between an old and a new process. One example
is the peers, which work differently for new and old processes. Do you use
peers in your config? It is also possible that we pass an fd corresponding
to a more or less closed listener, or something like this. Do you reload
with -x to pass FDs across processes? Do you use master-worker? I'm just
trying to rule out a number of hypotheses. I'm afraid an anonymized version
of your config would definitely help here.

Thanks!
Willy
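
For readers comparing the two strace summaries quoted above, here is a
minimal sketch that computes what fraction of all syscalls each process
spends in epoll_wait. It is not part of the original exchange. The function
names (parse_strace_summary, call_share) are hypothetical helpers, and the
parser assumes the column layout of the "strace -c" tables shown in this
mail. By call count the gap is wider than the "% time" column suggests:
about 92% of calls in the new process versus about 53% in the draining one,
while usecs/call for epoll_wait is 16 in both.

```python
# Rows copied from the two "strace -c" summaries quoted in this mail.
NEW = """\
 96.24    0.432348          16     26917           epoll_wait
  3.60    0.016158          16      1023         7 recvfrom
  0.12    0.000524          31        17           sendto
  0.04    0.000190           0      1126         3 write
  0.01    0.000036           1        70        23 read
  0.00    0.000000           0        21           close
  0.00    0.000000           0         7           socket
  0.00    0.000000           0         7         7 connect
  0.00    0.000000           0        13           sendmsg
  0.00    0.000000           0        17           setsockopt
  0.00    0.000000           0         7           fcntl
  0.00    0.000000           0         9           epoll_ctl
  0.00    0.000000           0         5         5 accept4
100.00    0.449256                 29239        45 total
"""

DRAINING = """\
 78.26    0.379045          16     23424           epoll_wait
 13.94    0.067539           7      9877         4 recvfrom
  7.80    0.037764           4     10471         6 write
  0.00    0.000007           0        29        10 read
  0.00    0.000000           0         9           close
  0.00    0.000000           0         5           sendto
  0.00    0.000000           0         3           shutdown
  0.00    0.000000           0        20           epoll_ctl
100.00    0.484355                 43838        20 total
"""


def parse_strace_summary(text):
    """Return {syscall: (seconds, calls)} from `strace -c` summary rows."""
    rows = {}
    for line in text.splitlines():
        # Tolerate mail quoting ("> ") in front of each row.
        fields = line.lstrip("> ").split()
        # Data rows have 5 fields, or 6 when the errors column is filled.
        if len(fields) not in (5, 6):
            continue
        name = fields[-1]
        if name in ("total", "syscall"):
            continue
        try:
            seconds = float(fields[1])
            calls = int(fields[3])
        except ValueError:
            continue  # separator line made of dashes
        rows[name] = (seconds, calls)
    return rows


def call_share(rows, name):
    """Fraction of all recorded calls that went to syscall `name`."""
    total_calls = sum(calls for _, calls in rows.values())
    return rows[name][1] / total_calls


new_rows = parse_strace_summary(NEW)
draining_rows = parse_strace_summary(DRAINING)

for label, rows in (("new", new_rows), ("draining", draining_rows)):
    share = call_share(rows, "epoll_wait")
    print(f"{label}: epoll_wait is {share:.1%} of all calls")
```

The ratio alone does not say why the new process wakes up so often. It only
makes the difference between the two processes easier to compare across
runs.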
