RE: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-08 Thread David Laight
From: Linus Torvalds > Sent: 06 May 2024 19:18 ... > Which is why I applied the minimal patch for just "refcount over > vfs_poll()", and am just mentioning my suggestion in the hope that > some eager beaver would like to see how painful it would do to make > the bigger surgery... I wonder if I

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-07 Thread Stefan Metzmacher
Am 03.05.24 um 23:24 schrieb Linus Torvalds: On Fri, 3 May 2024 at 14:11, Al Viro wrote: What we need is * promise that ep_item_poll() won't happen after eventpoll_release_file(). AFAICS, we do have that. * ->poll() not playing silly buggers. No. That is not enough at

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-06 Thread Linus Torvalds
On Mon, 6 May 2024 at 10:46, Stefan Metzmacher wrote: > > I think it's a very important detail that epoll does not take > real references. Otherwise an application level 'close()' on a socket > would not trigger a tcp disconnect, when an fd is still registered with > epoll. Yes, exactly.

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-06 Thread Daniel Vetter
On Fri, May 03, 2024 at 10:53:03PM +0100, Al Viro wrote: > On Fri, May 03, 2024 at 02:42:22PM -0700, Linus Torvalds wrote: > > On Fri, 3 May 2024 at 14:36, Al Viro wrote: > > > > > > ... the last part is no-go - poll_wait() must be able to grab a reference > > > (well, the callback in it must) >

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-04 Thread Christian Brauner
On Fri, May 03, 2024 at 12:59:52PM -0700, Kees Cook wrote: > On Fri, May 03, 2024 at 01:35:09PM -0600, Jens Axboe wrote: > > On 5/3/24 1:22 PM, Kees Cook wrote: > > > On Fri, May 03, 2024 at 12:49:11PM -0600, Jens Axboe wrote: > > >> On 5/3/24 12:26 PM, Kees Cook wrote: > > >>> Thanks for doing

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-04 Thread Christian Brauner
On Fri, May 03, 2024 at 11:26:32AM -0700, Kees Cook wrote: > On Fri, May 03, 2024 at 06:54:22PM +0700, Bui Quang Minh wrote: > > [...] > > Root cause: > > AFAIK, eventpoll (epoll) does not increase the registered file's reference. > > To ensure the safety, when the registered file is deallocated

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-03 Thread Al Viro
On Fri, May 03, 2024 at 02:42:22PM -0700, Linus Torvalds wrote: > On Fri, 3 May 2024 at 14:36, Al Viro wrote: > > > > ... the last part is no-go - poll_wait() must be able to grab a reference > > (well, the callback in it must) > > Yeah. I really think that *poll* itself is doing everything

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-03 Thread Linus Torvalds
On Fri, 3 May 2024 at 14:36, Al Viro wrote: > > ... the last part is no-go - poll_wait() must be able to grab a reference > (well, the callback in it must) Yeah. I really think that *poll* itself is doing everything right. It knows that it's called with a file pointer with a reference, and it

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-03 Thread Al Viro
On Fri, May 03, 2024 at 10:11:09PM +0100, Al Viro wrote: > On Fri, May 03, 2024 at 01:28:37PM -0700, Kees Cook wrote: > > > > Is this the right approach? It still feels to me like get_file() needs > > to happen much earlier... > > I don't believe it needs to happen at all. The problem is not

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-03 Thread Al Viro
On Fri, May 03, 2024 at 02:24:45PM -0700, Linus Torvalds wrote: > Because even with perfectly normal "->poll()", and even with the > ep_item_poll() happening *before* eventpoll_release_file(), you have > this trivial race: > > ep_item_poll() > ->poll() > > and *between* those two

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-03 Thread Linus Torvalds
On Fri, 3 May 2024 at 14:11, Al Viro wrote: > > What we need is > * promise that ep_item_poll() won't happen after > eventpoll_release_file(). > AFAICS, we do have that. > * ->poll() not playing silly buggers. No. That is not enough at all. Because even with perfectly normal

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-03 Thread Al Viro
On Fri, May 03, 2024 at 01:28:37PM -0700, Kees Cook wrote: > > Is this the right approach? It still feels to me like get_file() needs > to happen much earlier... I don't believe it needs to happen at all. The problem is not that ->release() can be called during ->poll() - it can't and it

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-03 Thread Kees Cook
On Fri, May 03, 2024 at 12:59:52PM -0700, Kees Cook wrote: > So, yeah, I can't figure out how eventpoll_release() and epoll_wait() > are expected to behave safely for .poll handlers. > > Regardless, for the simple case: it seems like it's just totally illegal > to use get_file() in a poll

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-03 Thread Kees Cook
On Fri, May 03, 2024 at 01:35:09PM -0600, Jens Axboe wrote: > On 5/3/24 1:22 PM, Kees Cook wrote: > > On Fri, May 03, 2024 at 12:49:11PM -0600, Jens Axboe wrote: > >> On 5/3/24 12:26 PM, Kees Cook wrote: > >>> Thanks for doing this analysis! I suspect at least a start of a fix > >>> would be this:

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-03 Thread Jens Axboe
On 5/3/24 1:22 PM, Kees Cook wrote: > On Fri, May 03, 2024 at 12:49:11PM -0600, Jens Axboe wrote: >> On 5/3/24 12:26 PM, Kees Cook wrote: >>> Thanks for doing this analysis! I suspect at least a start of a fix >>> would be this: >>> >>> diff --git a/drivers/dma-buf/dma-buf.c

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-03 Thread Kees Cook
On Fri, May 03, 2024 at 12:49:11PM -0600, Jens Axboe wrote: > On 5/3/24 12:26 PM, Kees Cook wrote: > > Thanks for doing this analysis! I suspect at least a start of a fix > > would be this: > > > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c > > index

Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-03 Thread Jens Axboe
On 5/3/24 12:26 PM, Kees Cook wrote: > Thanks for doing this analysis! I suspect at least a start of a fix > would be this: > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c > index 8fe5aa67b167..15e8f74ee0f2 100644 > --- a/drivers/dma-buf/dma-buf.c > +++

get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-03 Thread Kees Cook
On Fri, May 03, 2024 at 06:54:22PM +0700, Bui Quang Minh wrote: > [...] > Root cause: > AFAIK, eventpoll (epoll) does not increase the registered file's reference. > To ensure the safety, when the registered file is deallocated in __fput, > eventpoll_release is called to unregister the file from