Is it expected interrupt handlers modify ifp->if_flags?
> On 10 Jun 2021, at 20:17, Alexander Bluhm <[email protected]> wrote:
>
> Hi,
>
> I have seen this crash trace on a 6.6 based system, but I think the
> bug exists still in -current. It happened when an ixl(4) interface
> was removed from trunk(4).
>
> uvm_fault(0xfffffd8739dc6558, 0x0, 0, 1) -> e
> fatal page fault in supervisor mode
> trap type 6 code 0 rip ffffffff81012a86 cs 8 rflags 10202 cr2 0 cpl 7 rsp
> ffff80002e4bd170
> gsbase 0xffffffff816d6ff0 kgsbase 0x0
> panic: trap type 6, code=0, pc=ffffffff81012a86
> Starting stack trace...
> panic() at panic+0x113
> kerntrap(ffff800000bf3000) at kerntrap+0xdc
> alltraps_kern_meltdown(6,0,4,0,ffff800000bf3000,0) at
> alltraps_kern_meltdown+0x7b
> ixl_intr(ffff800000bf3000) at ixl_intr+0x3e6
> intr_handler(ffffffff816d6ff0,ffff800000b57200) at intr_handler+0x5b
> Xintr_ioapic_edge30_untramp(4,ffffffff814d3a00,4,18041969,ffffffff816d6ff0,d)
> at Xintr_ioapic_edge30_untramp+0x19f
> Xspllower(ffffffff8178fb58,ffffffff816d6ff0,ffffffff8139d743,ffff800000bffd00,ffffffff8178fb40,10)
> at Xspllower+0xc
> softintr_dispatch(0) at softintr_dispatch+0xc5
> Xsoftclock(ffff800000bf3048,0,ffffffff8139d743,ffff800000bf3048,ffff800001207800,ffffffff814e862f)
> at Xsoftclock+0x1f
> ifnewlladdr(ffff800001624e10) at ifnewlladdr+0xf8
> trunk_port_destroy(ffff80002e4bd6e0) at trunk_port_destroy+0x2fd
> trunk_ioctl(fffffd87387594c0,ffff800001207800,8048698e) at trunk_ioctl+0x6a6
> ifioctl(fffffd87623df448,ffff80002e46e2d8,48,ffff80002e4bd7d0) at
> ifioctl+0x2d6
> sys_ioctl(360,ffff80002e46e2d8,36) at sys_ioctl+0x3cd
> syscall(0) at syscall+0x3d1
> Xsyscall(6,36,1,36,7f7ffffda720,7f7ffffdaa21) at Xsyscall+0x128
> end of kernel
> end trace frame: 0x7f7ffffda780, count: 241
> End of stack trace.
> syncing disks...
>
> ifnewlladdr() is interrupted by ixl transmit interrupt. There it
> crashes in ixl_txeof as txr is NULL. The code in -current if_ixl.c
> has changed, so it might not happen anymore. But I think the bug
> is in ifnewlladdr().
>
> ifnewlladdr() sets splnet() and configures the interface up and
> down. The ixl_down() code has some interrupt barriers which cannot
> work while interrupts are blocked by splnet(). So interrupts fire
> at splx() when the driver does not expect them.
>
> Combining interrupt barriers with spl protection looks like a bad
> idea.
>
> Is there anything that lowers spl in all cases during intr_barrier(),
> ifq_barrier() or timeout_del_barrier()?
>
> How should spls work together with barriers?
>
> The integrity of ifnewlladdr() state should be guaranteed by netlock.
> Changing if_flags needs splnet() as they are used by all drivers.
> The in6_ functions need netlock. And driver SIOCSIFFLAGS ioctl
> must not have splnet().
>
> Is reducing splnet() the correct aproach?
>
> bluhm
>
> Index: net/if.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/net/if.c,v
> retrieving revision 1.641
> diff -u -p -r1.641 if.c
> --- net/if.c 25 May 2021 22:45:09 -0000 1.641
> +++ net/if.c 10 Jun 2021 14:32:12 -0000
> @@ -3109,6 +3109,8 @@ ifnewlladdr(struct ifnet *ifp)
> short up;
> int s;
>
> + NET_ASSERT_LOCKED();
> +
> s = splnet();
> up = ifp->if_flags & IFF_UP;
>
> @@ -3116,11 +3118,14 @@ ifnewlladdr(struct ifnet *ifp)
> /* go down for a moment... */
> ifp->if_flags &= ~IFF_UP;
> ifrq.ifr_flags = ifp->if_flags;
> + splx(s);
> (*ifp->if_ioctl)(ifp, SIOCSIFFLAGS, (caddr_t)&ifrq);
> + s = splnet();
> }
>
> ifp->if_flags |= IFF_UP;
> ifrq.ifr_flags = ifp->if_flags;
> + splx(s);
> (*ifp->if_ioctl)(ifp, SIOCSIFFLAGS, (caddr_t)&ifrq);
>
> #ifdef INET6
> @@ -3139,11 +3144,12 @@ ifnewlladdr(struct ifnet *ifp)
> #endif
> if (!up) {
> /* go back down */
> + s = splnet();
> ifp->if_flags &= ~IFF_UP;
> ifrq.ifr_flags = ifp->if_flags;
> + splx(s);
> (*ifp->if_ioctl)(ifp, SIOCSIFFLAGS, (caddr_t)&ifrq);
> }
> - splx(s);
> }
>
> void
>