On 4/24/24 2:28 PM, Christian König wrote:

> I don't fully understand how that happens either, it could be that there is
> some bug in the EPOLL_FD code. Maybe it's a race when the EPOLL file descriptor
> is closed or something like that.

IIUC the race condition looks like the following:

Thread 0                        Thread 1
-> do_epoll_ctl()
   f_count++, now 2
   ...
   ...                          -> vfs_poll(), f_count == 2
   ...                          ...
<- do_epoll_ctl()               ...
   f_count--, now 1             ...
-> filp_close(), f_count == 1   ...
   ...                            -> dma_buf_poll(), f_count == 1
   -> fput()                      ... [*** race window ***]
      f_count--, now 0              -> maybe get_file(), now ???
      -> __fput() (delayed)

E.g. dma_buf_poll() may be entered in thread 1 with f_count == 1
and call get_file() only some time later (or even skip it if there
is nothing to do for EPOLLIN or EPOLLOUT). During this time window,
thread 0 may call fput() (on behalf of close() in this example) and,
since it sees f_count == 1, drop the count to 0 and schedule the file
for delayed_fput(). A later get_file() in thread 1 would then take a
reference on a file that is already on its way to being released.
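
To make the window concrete, here is a minimal user-space sketch that
replays the interleaving above sequentially with C11 atomics. It is
not kernel code: struct fake_file and all fake_*() helpers are
hypothetical stand-ins, and the "increment only if non-zero" variant
is shown just as one common way to guard such a window (the kernel
has atomic_long_inc_not_zero() for this pattern), not as the fix
proposed in this thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct file; only the refcount is modeled. */
struct fake_file {
	atomic_long f_count;
	bool release_scheduled;	/* set once the count has dropped to zero */
};

/* Unconditional increment, in the spirit of get_file(): only valid if
 * the caller already holds a reference. */
static void fake_get_file(struct fake_file *f)
{
	atomic_fetch_add(&f->f_count, 1);
}

/* Take a reference only if the count is still non-zero. */
static bool fake_get_file_not_zero(struct fake_file *f)
{
	long old = atomic_load(&f->f_count);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&f->f_count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference, in the spirit of fput(): the last put schedules
 * the release. */
static void fake_fput(struct fake_file *f)
{
	if (atomic_fetch_sub(&f->f_count, 1) == 1)
		f->release_scheduled = true;
}

/* Replay: thread 0 closes the file first, then thread 1 (already inside
 * its poll callback) tries to take a reference. Returns the final count
 * and reports whether thread 1 believes it got a reference. */
static long replay_race(bool use_not_zero, bool *got_ref)
{
	struct fake_file f;

	atomic_init(&f.f_count, 1);
	f.release_scheduled = false;

	fake_fput(&f);			/* thread 0: 1 -> 0, release scheduled */

	if (use_not_zero) {
		*got_ref = fake_get_file_not_zero(&f);	/* thread 1, guarded */
	} else {
		fake_get_file(&f);	/* thread 1: 0 -> 1 on a dying file */
		*got_ref = true;
	}
	return atomic_load(&f.f_count);
}
```

With the unconditional increment, replay_race(false, &got) ends with
a count of 1 and got == true even though the release was already
scheduled. With the guarded variant, replay_race(true, &got) leaves
the count at 0 and got == false, so the caller can back off.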

Dmitry
