Hi Andy,

On Tue, May 27, 2014 at 06:00:37PM +0100, Andrew Phillips wrote:
> Something I overlooked replying to on this thread;
> 
> > BTW, I remember you said that you fixed the busy loop by disabling the
> > FD in the speculative event cache, but do you remember how you re-enable
> > it ? Eg, if all other processes have accepted some connections, your
> > first process will have to accept new connections again, so that means
> > that its state depends on others'.
> 
>   We initially just returned from listener_accept(). This caused us to
> go into a busy spin as there were always pending speculative reads, so
> fd_nbspec was non zero in ev_epoll.c which triggered setting
> wait_time=0. 
> 
>   Looking at the flow in listener_accept(), what we observed happening
> before was that without any of our patches, several processes would wake
> up on a new socket event. The fastest would win and accept() and the
> slower ones would hit the error check in listener.c at line 353. 
> 353:   if (unlikely(cfd == -1))  
>            switch (errno) {
>                  case EAGAIN:
>                  case EINTR:
>                  case ECONNABORTED:
>                       fd_poll_recv(fd);
>                       return;   /* nothing more to accept */
>              :
>   In this case, chasing fd_poll_recv(fd) through the files indicated it
> cleared the speculative events off the queue, meaning fd_nbspec would
> not be set, and wait_time would not get set to 0. 
> 
>   So we just added the same call to the shm patch refusal path. Which
> solved our problem. 
> 
>   Not sure how that relates to your point about each process's state
> depending on the others', which does not seem to be the case. 

Got it, thanks for the explanation! I thought you had completely disabled
events on this FD, which would be an issue right now. Here, by only
disabling speculative events, you "only" lose the readiness information.
That works for level-triggered pollers, but will no longer work with
an edge-triggered poller if/when we switch to EPOLLET. But at least
I get the picture now.
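
To illustrate the level-triggered vs edge-triggered difference, here is a
small standalone sketch. It is not haproxy code and only assumes Linux epoll.
The helper name count_reports() is made up for this example. It leaves one
byte unread in a pipe and counts how many of two consecutive epoll_wait()
calls still report the FD as readable:

```c
#include <unistd.h>
#include <sys/epoll.h>

/* Hypothetical helper, not part of haproxy.
 * Registers the read end of a pipe with EPOLLIN | extra_flags, writes one
 * byte that is never consumed, then polls twice with a zero timeout.
 * Returns the number of polls that reported the FD, or -1 on error. */
static int count_reports(unsigned int extra_flags)
{
    int p[2], ep, i, hits = 0;
    struct epoll_event ev = { 0 }, out;

    if (pipe(p) < 0)
        return -1;

    ep = epoll_create1(0);
    if (ep < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = p[0];

    if (epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev) < 0 ||
        write(p[1], "x", 1) != 1) {
        hits = -1;
    } else {
        /* The byte stays in the pipe: we never read() it. */
        for (i = 0; i < 2; i++)
            if (epoll_wait(ep, &out, 1, 0) == 1)
                hits++;
    }

    close(ep);
    close(p[0]);
    close(p[1]);
    return hits;
}
```

Calling count_reports(0) should return 2, since a level-triggered poller
keeps reporting the FD as long as data is pending. count_reports(EPOLLET)
should return 1: the edge is delivered once, and if that readiness
information is dropped without draining the FD, nothing reports it again
until new activity arrives.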

Thanks!
Willy

