On Fri, Jul 18, 2025 at 10:59:48AM +0200, Nam Cao wrote:
> On Fri, Jul 18, 2025 at 09:38:27AM +0100, Soheil Hassas Yeganeh wrote:
> > On Fri, Jul 18, 2025 at 8:52 AM Nam Cao <[email protected]> wrote:
> > >
> > > ep_events_available() checks for available events by looking at
> > > ep->rdllist and ep->ovflist. However, this is done without a lock, so
> > > the returned value is not reliable: both checks on ep->rdllist and
> > > ep->ovflist can be false while ep_start_scan() or ep_done_scan() is
> > > executing on another CPU, even though events are available.
> > >
> > > This bug can be observed by:
> > >
> > >   1. Create an eventpoll with at least one ready level-triggered event
> > >
> > >   2. Create multiple threads that call epoll_wait() with zero timeout.
> > >      The threads do not consume the events, so every epoll_wait() call
> > >      should return at least one event.
> > >
> > > If one thread is executing ep_events_available() while another thread is
> > > executing ep_start_scan() or ep_done_scan(), epoll_wait() may wrongly
> > > return no event for the former thread.
> > 
> > That is the whole point of epoll_wait with a zero timeout. We would want to
> > opportunistically poll without much overhead, which will have more
> > false negatives.
> > A caller that calls with a zero timeout should retry later, and will
> > at some point observe the event.
> 
> Is this a documented behavior that users expect? I do not see this in the
> man page.
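
For reference, the reproduction steps quoted above (one ready
level-triggered event, several threads calling epoll_wait() with a zero
timeout) could be sketched in userspace roughly as follows. This is only
an illustration: using an eventfd as the event source is an assumption
(the report does not say which fd type was used), and the names
poll_stats, poller, count_empty_polls and MAX_THREADS are made up here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define MAX_THREADS 64	/* arbitrary cap for this sketch */

/* Hypothetical bookkeeping shared by all polling threads. */
struct poll_stats {
	int epfd;
	int iters;
	atomic_long calls;	/* epoll_wait() calls made */
	atomic_long empty;	/* calls that reported no event */
};

static void *poller(void *arg)
{
	struct poll_stats *st = arg;
	struct epoll_event ev;

	assert(st != NULL);
	for (int i = 0; i < st->iters; i++) {
		/* Zero timeout: report what is ready right now, never block. */
		int n = epoll_wait(st->epfd, &ev, 1, 0);

		atomic_fetch_add(&st->calls, 1);
		if (n == 0)
			atomic_fetch_add(&st->empty, 1);
	}
	return NULL;
}

/*
 * Run nthreads pollers for iters iterations each. Returns how many
 * epoll_wait() calls reported no event, or -1 on setup failure.
 */
long count_empty_polls(int nthreads, int iters, long *calls_out)
{
	struct poll_stats st = { .epfd = -1, .iters = iters };
	struct epoll_event ev = { .events = EPOLLIN };
	pthread_t tids[MAX_THREADS];
	int started = 0;
	long ret = -1;
	int efd;

	if (nthreads < 1 || nthreads > MAX_THREADS)
		return -1;

	/* Initial counter of 1 keeps the eventfd readable; nobody reads it. */
	efd = eventfd(1, 0);
	if (efd < 0)
		return -1;
	st.epfd = epoll_create1(0);
	if (st.epfd < 0)
		goto out_efd;
	/* EPOLLET is not set, so the event is level-triggered. */
	if (epoll_ctl(st.epfd, EPOLL_CTL_ADD, efd, &ev) < 0)
		goto out_epfd;

	for (; started < nthreads; started++)
		if (pthread_create(&tids[started], NULL, poller, &st) != 0)
			break;
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	if (started == nthreads) {
		if (calls_out)
			*calls_out = atomic_load(&st.calls);
		ret = atomic_load(&st.empty);
	}
out_epfd:
	close(st.epfd);
out_efd:
	close(efd);
	return ret;
}
```

Calling count_empty_polls(8, 1000000, &calls) and comparing the result
with calls gives a rough miss rate. Whether any empty returns show up at
all depends on the kernel version, CPU count and timing, so a zero result
does not prove the race is absent.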

The selftests rely on the behavior that timeout=0 sees events from a
concurrently running producer. They would fail at a much higher rate
after this change. Believe me, I had a similar patch that changed
something in this area. I would explore the seqcount that Mateusz
suggested, tbh.
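
To make the seqcount idea concrete, here is a minimal userspace sketch of
the general read-and-retry pattern, assuming writers are already
serialized by some outer lock. It is not the kernel's seqcount_t API and
not the patch Mateusz proposed; struct seq_ready, its two flags (stand-ins
for "rdllist non-empty" and "ovflist non-empty") and all function names
are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state a lockless reader inspects. */
struct seq_ready {
	atomic_uint seq;		/* odd while a writer is mid-update */
	atomic_int rdl_nonempty;	/* stand-in for the rdllist check */
	atomic_int ovf_nonempty;	/* stand-in for the ovflist check */
};

static inline void seq_ready_init(struct seq_ready *s, int rdl_nonempty)
{
	atomic_init(&s->seq, 0);
	atomic_init(&s->rdl_nonempty, rdl_nonempty);
	atomic_init(&s->ovf_nonempty, 0);
}

/* Writers are assumed to be serialized by an outer lock. */
static inline void writer_begin(struct seq_ready *s)
{
	atomic_fetch_add(&s->seq, 1);	/* even -> odd */
}

static inline void writer_end(struct seq_ready *s)
{
	atomic_fetch_add(&s->seq, 1);	/* odd -> even */
}

/*
 * One lockless attempt. Returns true and fills *avail if the snapshot
 * was taken with no writer active; returns false if the caller must
 * retry (or fall back to taking the lock).
 */
static inline bool try_events_available(struct seq_ready *s, bool *avail)
{
	unsigned int start = atomic_load(&s->seq);
	bool snap;

	if (start & 1)
		return false;	/* writer in progress */
	snap = atomic_load(&s->rdl_nonempty) || atomic_load(&s->ovf_nonempty);
	if (atomic_load(&s->seq) != start)
		return false;	/* a writer ran while we were reading */
	*avail = snap;
	return true;
}

/* Spinning variant: retry until a consistent snapshot is obtained. */
static inline bool events_available(struct seq_ready *s)
{
	bool avail;

	while (!try_events_available(s, &avail))
		;
	return avail;
}
```

The point of the pattern is that a reader that catches both flags
transiently clear in the middle of an update sees an odd or changed
sequence number and retries, rather than reporting "no events". The sketch
uses sequentially consistent atomics for simplicity; a real implementation
would use weaker ordering and likely bound the spinning.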
