On Thu, May 20, 2021 at 09:37:38AM +0200, Martin Pieuchot wrote:
> On 20/05/21(Thu) 03:23, Alexandr Nedvedicky wrote:
> > Hrvoje gave a try to an experimental diff, which trades rw-locks in
> > pf(4) for mutexes [1]. Hrvoje soon discovered machine panics when
> > doing 'pfctl -ss'. The call stack looks as follows:
> >
> > [...]
> > specific to the experimental diff [1]. However, this made me think
> > that it's not a good idea to do copyout() while holding the NET_LOCK()
> > and `state_lock'.
> 
> malloc(9) and copyout(9) are kind of ok while using the NET_LOCK() but
> if a deadlock occurs while a global rwlock is held, debugging becomes
> harder.
> 
> As long as the `state_lock' and PF_LOCK() are mutexes, all allocations
> and copyin/copyout(9) must be done without holding them.

One way to reduce the problems with copyout(9) is to use uvm_vslock() to
lock the pages. This is what sysctl does, but it comes with its own set
of issues.
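
Roughly, that pattern looks like the sketch below. The
uvm_vslock()/uvm_vsunlock() calls mirror what kern_sysctl.c does; the
function name, the placeholder lock and the elided export step are only
illustrative, not actual pf(4) code.

#include <sys/param.h>
#include <sys/proc.h>
#include <sys/rwlock.h>
#include <sys/mman.h>
#include <uvm/uvm_extern.h>

/* Placeholder lock, standing in for whatever serializes the export. */
struct rwlock export_lock = RWLOCK_INITIALIZER("exportlk");

int
export_wired(struct proc *p, void *uaddr, size_t len)
{
	int error;

	/* Wire the destination pages; this may sleep, so no lock held yet. */
	error = uvm_vslock(p, uaddr, len, PROT_READ | PROT_WRITE);
	if (error)
		return (error);

	/*
	 * With the pages wired, a copyout() done under the lock cannot
	 * fault and sleep on the pager.
	 */
	rw_enter_write(&export_lock);
	/* ... build the export data and copyout() it to uaddr ... */
	rw_exit_write(&export_lock);

	uvm_vsunlock(p, uaddr, len);
	return (error);
}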

In general, exporting large collections from the kernel needs to be
rethought. The system should not grab a lock for a long time to serve a
userland process.
 
> > The diff below moves the copyout() at line 1784 outside the protection
> > of both locks. The approach I took is relatively straightforward:
> > 
> >     let DIOCGETSTATES allocate a hold_states array, which will keep
> >     references to states.
> > 
> >     grab the locks and take references, keeping those references in
> >     the hold array.
> > 
> >     drop the locks, then export the states and do the copyout while
> >     walking the array of references.
> > 
> >     drop the references and release the hold_states array.
> > 
> > Does it make sense? If we agree that this approach makes sense
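
For reference, the pattern being proposed looks roughly like the fragment
below (illustrative only, not the actual diff: the
PF_STATE_ENTER_READ()/PF_STATE_EXIT_READ() macros and the
pf_state_ref()/pf_state_unref() helpers stand in for whatever locking and
reference-counting primitives the diff really uses):

	struct pfioc_states	*ps = (struct pfioc_states *)addr;
	struct pf_state		*st, **hold;
	struct pfsync_state	 sp;
	size_t			 nstates, n = 0, i;
	int			 error = 0;

	/* Allocate the hold array before taking any lock. */
	nstates = ps->ps_len / sizeof(sp);
	hold = mallocarray(nstates, sizeof(*hold), M_TEMP, M_WAITOK | M_ZERO);

	/* Grab the locks only to take references; nothing here may sleep. */
	NET_LOCK();
	PF_STATE_ENTER_READ();
	TAILQ_FOREACH(st, &state_list, entry_list) {
		if (n >= nstates)
			break;
		pf_state_ref(st);		/* keep st alive after unlock */
		hold[n++] = st;
	}
	PF_STATE_EXIT_READ();
	NET_UNLOCK();

	/* Export and copyout with no lock held, walking the references. */
	for (i = 0; i < n && error == 0; i++) {
		pf_state_export(&sp, hold[i]);
		error = copyout(&sp, &ps->ps_states[i], sizeof(sp));
	}

	/* Drop the references and release the hold array. */
	for (i = 0; i < n; i++)
		pf_state_unref(hold[i]);
	free(hold, M_TEMP, nstates * sizeof(*hold));
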
> 
> In my opinion it does.  The other approach would be to (ab)use the
> NET_LOCK() to serialize updates, like bluhm@'s diff does.  Both
> approaches have pros and cons.
> 

I really think adding more to the NET_LOCK() is a step in the wrong
direction. It will just creep into everything and grow to the size of
the kernel lock. For me the cons outweigh the pros.


-- 
:wq Claudio
