On Thu, Feb 24, 2022 at 07:39:54PM +1000, David Gwynne wrote:
> claudio and i came up with the following, which is to have tun_dev_open
> check the state of the vnode associated with the current open call
> after all the sleeping and potential tun_clone_destroy and
> tun_clone_create calls. if the vnode has been made bad/dead after
> all the sleeping, it returns with ENXIO.
>
> this appears to fix the issue. i had a test program that would trigger
> the KASSERT in a matter of seconds and it's been running for an hour
> now.
>
> here's the diff.
The diff looks reasonable.
OK visa@
I guess a similar issue can arise with some other detachable devices
as well. However, the problem is most acute with things like tun(4)
that can be created and destroyed programmatically on a whim at high
speed.
spec_open() could check for vnode revocation after completing d_open.
Of course, that would not fix the problem with tun. The check would
allow error reporting to happen earlier, but there would be no
fundamental shift in behaviour, I think.
> Index: if_tun.c
> ===================================================================
> RCS file: /cvs/src/sys/net/if_tun.c,v
> retrieving revision 1.234
> diff -u -p -r1.234 if_tun.c
> --- if_tun.c 16 Feb 2022 02:22:39 -0000 1.234
> +++ if_tun.c 24 Feb 2022 08:08:38 -0000
> @@ -374,10 +374,19 @@ tun_dev_open(dev_t dev, const struct if_
> struct ifnet *ifp;
> int error;
> u_short stayup = 0;
> + struct vnode *vp;
>
> char name[IFNAMSIZ];
> unsigned int rdomain;
>
> + /*
> + * Find the vnode associated with this open before we sleep
> + * and let something else revoke it. Our caller has a reference
> + * to it so we don't need to account for it.
> + */
> + if (!vfinddev(dev, VCHR, &vp))
> + panic("%s vfinddev failed", __func__);
> +
> snprintf(name, sizeof(name), "%s%u", ifc->ifc_name, minor(dev));
> rdomain = rtable_l2(p->p_p->ps_rtableid);
>
> @@ -412,6 +421,12 @@ tun_dev_open(dev_t dev, const struct if_
> /* XXX if_clone_destroy if stayup? */
> goto done;
> }
> + }
> +
> + /* Has tun_clone_destroy torn the rug out under us? */
> + if (vp->v_type == VBAD) {
> + error = ENXIO;
> + goto done;
> }
>
> if (sc->sc_dev != 0) {
>