Am Thu, May 20, 2021 at 11:28:19AM +0200 schrieb Claudio Jeker:
> On Thu, May 20, 2021 at 09:37:38AM +0200, Martin Pieuchot wrote:
> > On 20/05/21(Thu) 03:23, Alexandr Nedvedicky wrote:
> > > Hrvoje gave a try to experimental diff, which trades rw-locks in pf(4)
> > > for mutexes [1]. Hrvoje soon discovered machine panics, when doing 'pfctl 
> > > -ss'
> > > The callstack looks as follows:
> > >
> > > [...]
> > > specific to experimental diff [1]. However this made me thinking, that
> > > it's not a good idea to do copyout() while holding NET_LOCK() and 
> > > state_lock.
> > 
> > malloc(9) and copyout(9) are kind of ok while using the NET_LOCK() but
> > if a deadlock occurs while a global rwlock is held, debugging becomes
> > harder.
> > 
> > As long as the `state_lock' and PF_LOCK() are mutexes all allocations
> > and copyin/copyout(9) must be done without holding them.
> 
> One way to reduce the problems with copyout(9) is to use uvm_vslock() to
> lock the pages. This is what sysctl does but it comes with its own set of
> issues.
> 
> In general exporting large collections from the kernel needs to be
> rethought. The system should not grab a lock for a long time to
> serve a userland process.
>  
> > > Diff below moves copyout() at line 1784 outside of protection of both 
> > > locks.
> > > The approach I took is relatively straightforward:
> > > 
> > >     let DIOCGETSTATES to allocate hold_states array, which will keep
> > >     references to states.
> > > 
> > >     grab locks and take references, keep those references in hold
> > >     array.
> > > 
> > >     drop locks, export states and do copyout, while walking
> > >     array of references.
> > > 
> > >     drop references, release hold_states array.
> > > 
> > > does it make sense? If we agree that this approach makes sense
> > 
> > In my opinion it does.  The other approach would be to (ab)use the
> > NET_LOCK() to serialize updates, like bluhm@'s diff does.  Both
> > approaches have pros and cons.
> > 
> 
> I really think adding more to the NET_LOCK() is a step in the wrong
> direction. It will just creap into everything and grow the size of the
> kernel lock. For me the cons outweight the pros.

While what we do at genua probably isn't particularly relevant to the
OpenBSD approach to unlocking the network subsystem, but what we do is
that there is a rwlock that protects the network configuration, and for
ioctls like DIOCGETSTATES we do indeed call uvm_vslock(), then take the
lock either read or write, depending on which ioctl it is, and unlock it
prior to return.

Since this is only the network conf lock, taking it with a 'read' for
something like DIOCGETSTATES does not really influence the network
receive/transmit path, I think.

Reply via email to