On Mon, Oct 21, 2019 at 03:26:03PM +0200, Alexandr Nedvedicky wrote: > On Mon, Oct 21, 2019 at 02:59:01PM +0200, Martin Pieuchot wrote: > > On 21/10/19(Mon) 13:28, Alexandr Nedvedicky wrote: > > > Hello, > > > > > > </snip> > > > > > > > > > The vnode is not locked in this > > > > > path > > > > > either so it won't end up waiting on the ongoing vclean(). > > > > > > > > That leads to an interesting question: should we serialize device access > > > > at the vnode level or should we generalize the existing "solution" and > > > > assume that the descriptor can be NULL/invalid at the "pseudo-driver" > > > > level? > > > > > > > my preference is to fix it at pseudo-driver level. Dealing with it at > > > driver > > > level keeps scope of locks small. I'm just afraid the `big lock` on > > > vnode may > > > bite us later. > > > > Well the vnode lock is already taken for read/write/close. That's why > > the race is visible here with ioctl, which doesn't grab the lock. It > > might be also possible to trigger a race with open. > > I'm trying to craft some unit test case based on input from Anton. > I'm just curious to see if I'll be able to trigger similar panic. > > > > That said I'm fine with fixing this issue at the driver level. Does > > that mean all bpfilter_lookup() should be audited since they can return > > NULL? At least ioctl/poll/kqfilter aren't serialized with close. > > > > Or am I missing something? > > I don't know at the moment. There are like 6 calls to bpfilter_lookup(), > which need to be inspected. I'll do it hopefully later today. > > I've got already OK from visa@ for diff, which will put reference counting on > bpf desciptors back. But I'd like to hold on and give a try the 'check for > NULL' approach. Stay tuned.
ok anton@ on putting back refcount. Note that without it, the `bpfilter_lookup() != NULL' condition has to be checked after each potential sleeping point when the vnode is not locked.
