On Mon, Nov 21, 2022 at 08:53:52PM +0100, Mark Kettenis wrote:
> > Date: Mon, 21 Nov 2022 20:28:35 +0100
> > From: Alexander Bluhm <[email protected]>
> > 
> > Hi,
> > 
> > Some of my test machines hang while booting userland.
> > 
> > starting network
> > -> here it hangs
> > load: 0.02  cmd: ifconfig 81303 [sbar] 0.00u 0.15s 0% 78k
> > 
> > ddb shows these two processes.
> > 
> >  81303  375320  89140      0  3         0x3  sbar          ifconfig
> >  48135  157353      0      0  3     0x14200  netlock       systqmp
> > 
> > ddb{0}> trace /t 0t375320
> > sleep_finish(ffff800022d31318,1) at sleep_finish+0xfe
> > cond_wait(ffff800022d313b0,ffffffff81f15e9d) at cond_wait+0x54
> > sched_barrier(ffff800022512ff0) at sched_barrier+0x73
> > ixgbe_stop(ffff800000118000) at ixgbe_stop+0x1f7
> > ixgbe_init(ffff800000118000) at ixgbe_init+0x32
> > ixgbe_ioctl(ffff800000118048,8020690c,ffff80000022ec00) at ixgbe_ioctl+0x13a
> > in_ifinit(ffff800000118048,ffff80000022ec00,ffff800022d31740,1) at 
> > in_ifinit+0x
> > ef
> > in_ioctl_change_ifaddr(8040691a,ffff800022d31730,ffff800000118048,1) at 
> > in_ioct
> > l_change_ifaddr+0x3a4
> > in_control(fffffd81901dc740,8040691a,ffff800022d31730,ffff800000118048) at 
> > in_c
> > ontrol+0x75
> > ifioctl(fffffd81901dc740,8040691a,ffff800022d31730,ffff800022d60000) at 
> > ifioctl
> > +0x982
> > sys_ioctl(ffff800022d60000,ffff800022d31840,ffff800022d318a0) at 
> > sys_ioctl+0x2c
> > 4
> > syscall(ffff800022d31910) at syscall+0x384
> > Xsyscall() at Xsyscall+0x128
> > end of kernel
> > end trace frame: 0x7f7ffffd94a0, count: -13
> > 
> > ddb{0}> trace /t 0t157353
> > sleep_finish(ffff800022ca8b70,1) at sleep_finish+0xfe
> > rw_enter(ffffffff822b4f80,1) at rw_enter+0x1cb
> > pf_purge(0) at pf_purge+0x1d
> > taskq_thread(ffffffff822ac568) at taskq_thread+0x100
> > end trace frame: 0x0, count: -4
> > 
> > ifconfig waits for the sched_barrier_task() on the systqmp task
> > queue while holding the netlock.  pf_purge() runs on the systqmp
> > task queue and is waiting for the netlock.  The netlock has been
> > taken by ifconfig in in_ioctl_change_ifaddr().
> > 
> > The problem has been introduced when pf_purge() was moved from systq
> > to systqmp.
> > https://marc.info/?l=openbsd-cvs&m=166818274216800&w=2
> 
> I'd say pfpurge should be moved to itw own taskq.

we're working toward dropping the need for NET_LOCK before PF_LOCK. could
we try the diff below as a compromise?

> ixgb(4) holding netlock while calling sched_barrier() is probably
> wrong too.

it's pretty baked in at this point that the SIOCSIFFLAGS ioctl is called
while holding the net lock, and we've been saying for ages that you can
clear IFF_RUNNING and then call intr_barrier (and lots of other
barriers) to wait for things to get off the rings before tearing them
down.

long term, drivers should protect themselves. we're nowhere near that
point though.

Index: pf.c
===================================================================
RCS file: /cvs/src/sys/net/pf.c,v
retrieving revision 1.1153
diff -u -p -r1.1153 pf.c
--- pf.c        12 Nov 2022 02:48:14 -0000      1.1153
+++ pf.c        22 Nov 2022 00:29:30 -0000
@@ -1603,20 +1620,14 @@ pf_purge(void *null)
 {
        unsigned int interval = max(1, pf_default_rule.timeout[PFTM_INTERVAL]);
 
-       /* XXX is NET_LOCK necessary? */
-       NET_LOCK();
-
-       PF_LOCK();
-
+       rw_enter_write(&pf_lock); /* PF_LOCK() without NET_LOCK() */
        pf_purge_expired_src_nodes();
-
-       PF_UNLOCK();
+       rw_exit_write(&pf_lock); /* PF_UNLOCK() without NET_LOCK() */
 
        /*
         * Fragments don't require PF_LOCK(), they use their own lock.
         */
        pf_purge_expired_fragments();
-       NET_UNLOCK();
 
        /* interpret the interval as idle time between runs */
        timeout_add_sec(&pf_purge_to, interval);
@@ -1891,9 +1902,8 @@ pf_purge_expired_states(const unsigned i
        if (SLIST_EMPTY(&gcl))
                return (scanned);
 
-       NET_LOCK();
        rw_enter_write(&pf_state_list.pfs_rwl);
-       PF_LOCK();
+       rw_enter_write(&pf_lock); /* PF_LOCK() without NET_LOCK() */
        PF_STATE_ENTER_WRITE();
        SLIST_FOREACH(st, &gcl, gc_list) {
                if (st->timeout != PFTM_UNLINKED)
@@ -1902,9 +1912,8 @@ pf_purge_expired_states(const unsigned i
                pf_free_state(st);
        }
        PF_STATE_EXIT_WRITE();
-       PF_UNLOCK();
+       rw_exit_write(&pf_lock); /* PF_UNLOCK() without NET_LOCK() */
        rw_exit_write(&pf_state_list.pfs_rwl);
-       NET_UNLOCK();
 
        while ((st = SLIST_FIRST(&gcl)) != NULL) {
                SLIST_REMOVE_HEAD(&gcl, gc_list);

Reply via email to