Re: if attach/detach netlocks
On 3 January 2017 at 12:06, Reyk Floeterwrote: > On Tue, Jan 03, 2017 at 11:42:21AM +0100, Martin Pieuchot wrote: >> On 02/01/17(Mon) 21:51, Mike Belopuhov wrote: >> > I got to test the diff and I had to make another adjustment: >> > vxlan_if_change is setup as a detach hook, however dohooks is >> > called very early in if_detach before we remove IP addresses >> > from the interface. It makes vxlan_config find these IP >> > addresses just fine and re-add its own detach hook again. >> >> Why not fix vxlan_if_change()? >> >> > This >> > repeats ad infinitum hogging the machine. I couldn't think of >> > anything better than deferring an operation via a task. Seems >> > to do the trick. >> >> That's ugly. Why would you re-add anything in a detach hook? This >> is obviously broken. >> > > It is used for the multicast address (only) and it made sense when the > source address wasn't mandatory, eg. 0.0.0.0 -> 239.1.1.100 would find > the next interface that points to the group after detaching the > current one. > > The source is mandatory now, so the diff below should work as well. > > Reyk > Thanks, this looks good, OK mikeb
Re: if attach/detach netlocks
On Tue, Jan 03, 2017 at 11:42:21AM +0100, Martin Pieuchot wrote: > On 02/01/17(Mon) 21:51, Mike Belopuhov wrote: > > On Fri, Dec 30, 2016 at 18:57 +0100, Mike Belopuhov wrote: > > > On Thu, Dec 29, 2016 at 09:30 +0100, Martin Pieuchot wrote: > > > > On 29/12/16(Thu) 01:15, Alexander Bluhm wrote: > > > > > On Fri, Dec 23, 2016 at 12:09:32AM +0100, Martin Pieuchot wrote: > > > > > > On 22/12/16(Thu) 20:45, Mike Belopuhov wrote: > > > > > > > I think this is what is required here. Works here, but YMMV. > > > > > > > > > > > > splnet() in a pseudo-driver seems completely wrong, you could get > > > > > > rid of > > > > > > it. > > > > > > > > > > Yes, but that is another issue. Can we get the netlock splasserts > > > > > fixed first? This diff looks good to me. > > > > > > > > Sure I'm ok with the diff. > > > > > > > > > > I agree with Martin and have cooked a diff but couldn't test it yet. > > > This is it for the reference. > > > > > > > I got to test the diff and I had to make another adjustment: > > vxlan_if_change is setup as a detach hook, however dohooks is > > called very early in if_detach before we remove IP addresses > > from the interface. It makes vxlan_config find these IP > > addresses just fine and re-add its own detach hook again. > > Why not fix vxlan_if_change()? > > > This > > repeats ad infinitum hogging the machine. I couldn't think of > > anything better than deferring an operation via a task. Seems > > to do the trick. > > That's ugly. Why would you re-add anything in a detach hook? This > is obviously broken. > It is used for the multicast address (only) and it made sense when the source address wasn't mandatory, eg. 0.0.0.0 -> 239.1.1.100 would find the next interface that points to the group after detaching the current one. The source is mandatory now, so the diff below should work as well. Reyk Index: sys/net/if_vxlan.c === RCS file: /cvs/src/sys/net/if_vxlan.c,v retrieving revision 1.54 diff -u -p -u -p -r1.54 if_vxlan.c --- sys/net/if_vxlan.c 13 Dec 2016 06:51:11 - 1.54 +++ sys/net/if_vxlan.c 3 Jan 2017 11:05:36 - @@ -947,17 +947,16 @@ vxlan_if_change(void *arg) { struct vxlan_softc *sc = arg; struct ifnet*ifp = >sc_ac.ac_if; - int s, error; + int s; /* * Reset the configuration after the parent interface disappeared. */ s = splnet(); - if ((error = vxlan_config(ifp, NULL, NULL)) != 0) { - /* The configured tunnel addresses are invalid, remove them */ - bzero(>sc_src, sizeof(sc->sc_src)); - bzero(>sc_dst, sizeof(sc->sc_dst)); - } + vxlan_multicast_cleanup(ifp); + memset(>sc_src, 0, sizeof(sc->sc_src)); + memset(>sc_dst, 0, sizeof(sc->sc_dst)); + sc->sc_dstport = htons(VXLAN_PORT); splx(s); }
Re: if attach/detach netlocks
On 02/01/17(Mon) 21:51, Mike Belopuhov wrote: > On Fri, Dec 30, 2016 at 18:57 +0100, Mike Belopuhov wrote: > > On Thu, Dec 29, 2016 at 09:30 +0100, Martin Pieuchot wrote: > > > On 29/12/16(Thu) 01:15, Alexander Bluhm wrote: > > > > On Fri, Dec 23, 2016 at 12:09:32AM +0100, Martin Pieuchot wrote: > > > > > On 22/12/16(Thu) 20:45, Mike Belopuhov wrote: > > > > > > I think this is what is required here. Works here, but YMMV. > > > > > > > > > > splnet() in a pseudo-driver seems completely wrong, you could get rid > > > > > of > > > > > it. > > > > > > > > Yes, but that is another issue. Can we get the netlock splasserts > > > > fixed first? This diff looks good to me. > > > > > > Sure I'm ok with the diff. > > > > > > > I agree with Martin and have cooked a diff but couldn't test it yet. > > This is it for the reference. > > > > I got to test the diff and I had to make another adjustment: > vxlan_if_change is setup as a detach hook, however dohooks is > called very early in if_detach before we remove IP addresses > from the interface. It makes vxlan_config find these IP > addresses just fine and re-add its own detach hook again. Why not fix vxlan_if_change()? > This > repeats ad infinitum hogging the machine. I couldn't think of > anything better than deferring an operation via a task. Seems > to do the trick. That's ugly. Why would you re-add anything in a detach hook? This is obviously broken.
Re: if attach/detach netlocks
On Fri, Dec 30, 2016 at 18:57 +0100, Mike Belopuhov wrote: > On Thu, Dec 29, 2016 at 09:30 +0100, Martin Pieuchot wrote: > > On 29/12/16(Thu) 01:15, Alexander Bluhm wrote: > > > On Fri, Dec 23, 2016 at 12:09:32AM +0100, Martin Pieuchot wrote: > > > > On 22/12/16(Thu) 20:45, Mike Belopuhov wrote: > > > > > I think this is what is required here. Works here, but YMMV. > > > > > > > > splnet() in a pseudo-driver seems completely wrong, you could get rid of > > > > it. > > > > > > Yes, but that is another issue. Can we get the netlock splasserts > > > fixed first? This diff looks good to me. > > > > Sure I'm ok with the diff. > > > > I agree with Martin and have cooked a diff but couldn't test it yet. > This is it for the reference. > I got to test the diff and I had to make another adjustment: vxlan_if_change is setup as a detach hook, however dohooks is called very early in if_detach before we remove IP addresses from the interface. It makes vxlan_config find these IP addresses just fine and re-add its own detach hook again. This repeats ad infinitum hogging the machine. I couldn't think of anything better than deferring an operation via a task. Seems to do the trick. OK? diff --git sys/net/if_vxlan.c sys/net/if_vxlan.c index e9bc1cb8305..082e4ab5814 100644 --- sys/net/if_vxlan.c +++ sys/net/if_vxlan.c @@ -71,10 +71,12 @@ struct vxlan_softc { in_port_tsc_dstport; u_intsc_rdomain; int64_t sc_vnetid; u_int8_t sc_ttl; + struct task sc_task; + LIST_ENTRY(vxlan_softc) sc_entry; }; voidvxlanattach(int); int vxlanioctl(struct ifnet *, u_long, caddr_t); @@ -89,10 +91,11 @@ void vxlan_media_status(struct ifnet *, struct ifmediareq *); int vxlan_config(struct ifnet *, struct sockaddr *, struct sockaddr *); int vxlan_output(struct ifnet *, struct mbuf *); voidvxlan_addr_change(void *); voidvxlan_if_change(void *); voidvxlan_link_change(void *); +voidvxlan_config_defer(void *); int vxlan_sockaddr_cmp(struct sockaddr *, struct sockaddr *); uint16_t vxlan_sockaddr_port(struct sockaddr *); struct if_clonevxlan_cloner = @@ -166,10 +169,12 @@ vxlan_clone_create(struct if_clone *ifc, int unit) */ ifp->if_mtu = ETHERMTU - sizeof(struct ether_header); ifp->if_mtu -= sizeof(struct vxlanudphdr) + sizeof(struct ipovly); #endif + task_set(>sc_task, vxlan_config_defer, sc); + LIST_INSERT_HEAD(_tagh[VXLAN_TAGHASH(0)], sc, sc_entry); vxlan_enable++; return (0); } @@ -178,13 +183,13 @@ int vxlan_clone_destroy(struct ifnet *ifp) { struct vxlan_softc *sc = ifp->if_softc; int s; - s = splnet(); + NET_LOCK(s); vxlan_multicast_cleanup(ifp); - splx(s); + NET_UNLOCK(s); vxlan_enable--; LIST_REMOVE(sc, sc_entry); ifmedia_delete_instance(>sc_media, IFM_INST_ANY); @@ -392,11 +397,11 @@ int vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data) { struct vxlan_softc *sc = (struct vxlan_softc *)ifp->if_softc; struct ifreq*ifr = (struct ifreq *)data; struct if_laddrreq *lifr = (struct if_laddrreq *)data; - int error = 0, s; + int error = 0; switch (cmd) { case SIOCSIFADDR: ifp->if_flags |= IFF_UP; /* FALLTHROUGH */ @@ -417,24 +422,20 @@ vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data) case SIOCSIFMEDIA: error = ifmedia_ioctl(ifp, ifr, >sc_media, cmd); break; case SIOCSLIFPHYADDR: - s = splnet(); error = vxlan_config(ifp, (struct sockaddr *)>addr, (struct sockaddr *)>dstaddr); - splx(s); break; case SIOCDIFPHYADDR: - s = splnet(); vxlan_multicast_cleanup(ifp); bzero(>sc_src, sizeof(sc->sc_src)); bzero(>sc_dst, sizeof(sc->sc_dst)); sc->sc_dstport = htons(VXLAN_PORT); - splx(s); break; case SIOCGLIFPHYADDR: if (sc->sc_dst.ss_family == AF_UNSPEC) { error = EADDRNOTAVAIL; @@ -451,14 +452,12 @@ vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data) ifr->ifr_rdomainid > RT_TABLEID_MAX || !rtable_exists(ifr->ifr_rdomainid)) { error = EINVAL; break; } - s = splnet(); sc->sc_rdomain = ifr->ifr_rdomainid; (void)vxlan_config(ifp, NULL, NULL); - splx(s); break; case SIOCGLIFPHYRTABLE: ifr->ifr_rdomainid =
Re: if attach/detach netlocks
On Thu, Dec 29, 2016 at 09:30 +0100, Martin Pieuchot wrote: > On 29/12/16(Thu) 01:15, Alexander Bluhm wrote: > > On Fri, Dec 23, 2016 at 12:09:32AM +0100, Martin Pieuchot wrote: > > > On 22/12/16(Thu) 20:45, Mike Belopuhov wrote: > > > > I think this is what is required here. Works here, but YMMV. > > > > > > splnet() in a pseudo-driver seems completely wrong, you could get rid of > > > it. > > > > Yes, but that is another issue. Can we get the netlock splasserts > > fixed first? This diff looks good to me. > > Sure I'm ok with the diff. > I agree with Martin and have cooked a diff but couldn't test it yet. This is it for the reference. diff --git sys/net/if_vxlan.c sys/net/if_vxlan.c index e9bc1cb8305..74437b8b79c 100644 --- sys/net/if_vxlan.c +++ sys/net/if_vxlan.c @@ -178,13 +178,13 @@ int vxlan_clone_destroy(struct ifnet *ifp) { struct vxlan_softc *sc = ifp->if_softc; int s; - s = splnet(); + NET_LOCK(s); vxlan_multicast_cleanup(ifp); - splx(s); + NET_UNLOCK(s); vxlan_enable--; LIST_REMOVE(sc, sc_entry); ifmedia_delete_instance(>sc_media, IFM_INST_ANY); @@ -392,11 +392,11 @@ int vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data) { struct vxlan_softc *sc = (struct vxlan_softc *)ifp->if_softc; struct ifreq*ifr = (struct ifreq *)data; struct if_laddrreq *lifr = (struct if_laddrreq *)data; - int error = 0, s; + int error = 0; switch (cmd) { case SIOCSIFADDR: ifp->if_flags |= IFF_UP; /* FALLTHROUGH */ @@ -417,24 +417,20 @@ vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data) case SIOCSIFMEDIA: error = ifmedia_ioctl(ifp, ifr, >sc_media, cmd); break; case SIOCSLIFPHYADDR: - s = splnet(); error = vxlan_config(ifp, (struct sockaddr *)>addr, (struct sockaddr *)>dstaddr); - splx(s); break; case SIOCDIFPHYADDR: - s = splnet(); vxlan_multicast_cleanup(ifp); bzero(>sc_src, sizeof(sc->sc_src)); bzero(>sc_dst, sizeof(sc->sc_dst)); sc->sc_dstport = htons(VXLAN_PORT); - splx(s); break; case SIOCGLIFPHYADDR: if (sc->sc_dst.ss_family == AF_UNSPEC) { error = EADDRNOTAVAIL; @@ -451,14 +447,12 @@ vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data) ifr->ifr_rdomainid > RT_TABLEID_MAX || !rtable_exists(ifr->ifr_rdomainid)) { error = EINVAL; break; } - s = splnet(); sc->sc_rdomain = ifr->ifr_rdomainid; (void)vxlan_config(ifp, NULL, NULL); - splx(s); break; case SIOCGLIFPHYRTABLE: ifr->ifr_rdomainid = sc->sc_rdomain; break; @@ -468,14 +462,12 @@ vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data) error = EINVAL; break; } if (sc->sc_ttl == (u_int8_t)ifr->ifr_ttl) break; - s = splnet(); sc->sc_ttl = (u_int8_t)(ifr->ifr_ttl); (void)vxlan_config(ifp, NULL, NULL); - splx(s); break; case SIOCGLIFPHYTTL: ifr->ifr_ttl = (int)sc->sc_ttl; break; @@ -489,14 +481,12 @@ vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data) ifr->ifr_vnetid < VXLAN_VNI_MIN)) { error = EINVAL; break; } - s = splnet(); sc->sc_vnetid = (int)ifr->ifr_vnetid; (void)vxlan_config(ifp, NULL, NULL); - splx(s); break; case SIOCGVNETID: if ((sc->sc_vnetid != VXLAN_VNI_ANY) && (sc->sc_vnetid > VXLAN_VNI_MAX || @@ -507,14 +497,12 @@ vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data) ifr->ifr_vnetid = sc->sc_vnetid; break; case SIOCDVNETID: - s = splnet(); sc->sc_vnetid = VXLAN_VNI_UNSET; (void)vxlan_config(ifp, NULL, NULL); - splx(s); break; default: error = ether_ioctl(ifp, >sc_ac, cmd, data); break; @@ -923,57 +911,56 @@ vxlan_output(struct ifnet *ifp, struct mbuf *m) void vxlan_addr_change(void *arg) { struct vxlan_softc *sc = arg; struct ifnet*ifp = >sc_ac.ac_if; - int s, error; + int
Re: if attach/detach netlocks
On 29/12/16(Thu) 01:15, Alexander Bluhm wrote: > On Fri, Dec 23, 2016 at 12:09:32AM +0100, Martin Pieuchot wrote: > > On 22/12/16(Thu) 20:45, Mike Belopuhov wrote: > > > I think this is what is required here. Works here, but YMMV. > > > > splnet() in a pseudo-driver seems completely wrong, you could get rid of > > it. > > Yes, but that is another issue. Can we get the netlock splasserts > fixed first? This diff looks good to me. Sure I'm ok with the diff. > > > diff --git sys/net/if_vxlan.c sys/net/if_vxlan.c > > > index e9bc1cb8305..dfb71cf9467 100644 > > > --- sys/net/if_vxlan.c > > > +++ sys/net/if_vxlan.c > > > @@ -178,13 +178,15 @@ int > > > vxlan_clone_destroy(struct ifnet *ifp) > > > { > > > struct vxlan_softc *sc = ifp->if_softc; > > > int s; > > > > > > + NET_LOCK(s); > > > s = splnet(); > > > vxlan_multicast_cleanup(ifp); > > > splx(s); > > > + NET_UNLOCK(s); > > > > > > vxlan_enable--; > > > LIST_REMOVE(sc, sc_entry); > > > > > > ifmedia_delete_instance(>sc_media, IFM_INST_ANY); >
Re: if attach/detach netlocks
On Fri, Dec 23, 2016 at 12:09:32AM +0100, Martin Pieuchot wrote: > On 22/12/16(Thu) 20:45, Mike Belopuhov wrote: > > I think this is what is required here. Works here, but YMMV. > > splnet() in a pseudo-driver seems completely wrong, you could get rid of > it. Yes, but that is another issue. Can we get the netlock splasserts fixed first? This diff looks good to me. bluhm > > diff --git sys/net/if_vxlan.c sys/net/if_vxlan.c > > index e9bc1cb8305..dfb71cf9467 100644 > > --- sys/net/if_vxlan.c > > +++ sys/net/if_vxlan.c > > @@ -178,13 +178,15 @@ int > > vxlan_clone_destroy(struct ifnet *ifp) > > { > > struct vxlan_softc *sc = ifp->if_softc; > > int s; > > > > + NET_LOCK(s); > > s = splnet(); > > vxlan_multicast_cleanup(ifp); > > splx(s); > > + NET_UNLOCK(s); > > > > vxlan_enable--; > > LIST_REMOVE(sc, sc_entry); > > > > ifmedia_delete_instance(>sc_media, IFM_INST_ANY);
Re: if attach/detach netlocks
On 22/12/16(Thu) 20:45, Mike Belopuhov wrote: > On Wed, Dec 21, 2016 at 13:06 +0100, Alexander Bluhm wrote: > > On Wed, Dec 21, 2016 at 12:45:37PM +0100, Mike Belopuhov wrote: > > > Anyways, OK for the diff below? > > > > OK bluhm@ > > > > > diff --git sys/net/if_pflow.c sys/net/if_pflow.c > > > index 0df0b69fd8a..8592d98a743 100644 > > > --- sys/net/if_pflow.c > > > +++ sys/net/if_pflow.c > > > @@ -265,14 +265,11 @@ pflow_clone_destroy(struct ifnet *ifp) > > > if (timeout_initialized(>sc_tmo_tmpl)) > > > timeout_del(>sc_tmo_tmpl); > > > pflow_flush(sc); > > > m_freem(sc->send_nam); > > > if (sc->so != NULL) { > > > - /* XXXSMP breaks atomicity */ > > > - rw_exit_write(); > > > error = soclose(sc->so); > > > - rw_enter_write(); > > > sc->so = NULL; > > > } > > > if (sc->sc_flowdst != NULL) > > > free(sc->sc_flowdst, M_DEVBUF, sc->sc_flowdst->sa_len); > > > if (sc->sc_flowsrc != NULL) > > I think this is what is required here. Works here, but YMMV. splnet() in a pseudo-driver seems completely wrong, you could get rid of it. > > diff --git sys/net/if_vxlan.c sys/net/if_vxlan.c > index e9bc1cb8305..dfb71cf9467 100644 > --- sys/net/if_vxlan.c > +++ sys/net/if_vxlan.c > @@ -178,13 +178,15 @@ int > vxlan_clone_destroy(struct ifnet *ifp) > { > struct vxlan_softc *sc = ifp->if_softc; > int s; > > + NET_LOCK(s); > s = splnet(); > vxlan_multicast_cleanup(ifp); > splx(s); > + NET_UNLOCK(s); > > vxlan_enable--; > LIST_REMOVE(sc, sc_entry); > > ifmedia_delete_instance(>sc_media, IFM_INST_ANY);
Re: if attach/detach netlocks
On Wed, Dec 21, 2016 at 13:06 +0100, Alexander Bluhm wrote: > On Wed, Dec 21, 2016 at 12:45:37PM +0100, Mike Belopuhov wrote: > > Anyways, OK for the diff below? > > OK bluhm@ > > > diff --git sys/net/if_pflow.c sys/net/if_pflow.c > > index 0df0b69fd8a..8592d98a743 100644 > > --- sys/net/if_pflow.c > > +++ sys/net/if_pflow.c > > @@ -265,14 +265,11 @@ pflow_clone_destroy(struct ifnet *ifp) > > if (timeout_initialized(>sc_tmo_tmpl)) > > timeout_del(>sc_tmo_tmpl); > > pflow_flush(sc); > > m_freem(sc->send_nam); > > if (sc->so != NULL) { > > - /* XXXSMP breaks atomicity */ > > - rw_exit_write(); > > error = soclose(sc->so); > > - rw_enter_write(); > > sc->so = NULL; > > } > > if (sc->sc_flowdst != NULL) > > free(sc->sc_flowdst, M_DEVBUF, sc->sc_flowdst->sa_len); > > if (sc->sc_flowsrc != NULL) I think this is what is required here. Works here, but YMMV. diff --git sys/net/if_vxlan.c sys/net/if_vxlan.c index e9bc1cb8305..dfb71cf9467 100644 --- sys/net/if_vxlan.c +++ sys/net/if_vxlan.c @@ -178,13 +178,15 @@ int vxlan_clone_destroy(struct ifnet *ifp) { struct vxlan_softc *sc = ifp->if_softc; int s; + NET_LOCK(s); s = splnet(); vxlan_multicast_cleanup(ifp); splx(s); + NET_UNLOCK(s); vxlan_enable--; LIST_REMOVE(sc, sc_entry); ifmedia_delete_instance(>sc_media, IFM_INST_ANY);
Re: if attach/detach netlocks
On Tue, Dec 20, 2016 at 07:33:35PM +0100, Martin Pieuchot wrote: > On 20/12/16(Tue) 18:47, Mike Belopuhov wrote: > > On Tue, Dec 20, 2016 at 17:06 +0100, Martin Pieuchot wrote: > > > > > > You'll need to release the lock before calling ifc->ifc_create in > > > if_clone_create() and do the same dance in if_clone_destroy(). > > > > > > > Indeed. > > > > > But I think that's the way to go. If you can create/destroy > > > pseudo-interface without panicing we're good. > > > > > > > Seems to work here, but it feels wrong. > > I don't think we can do much better, let's go with that we'll refine > later. And another one. splassert: ip_output: want 1 have 0 Starting stack trace... ip_output(1,0,d09e32c1,d05129ab,d1001000) at ip_output+0x6d ip_output(d9766c00,d9804200,0,0,f5770c10) at ip_output+0x6d igmp_sendpkt(d3e9d800,d3e24640,17,2e0,d3704560) at igmp_sendpkt+0x119 igmp_leavegroup(d3e24640,d09e29de,d0bbe520,d03cdcb6,0) at igmp_leavegroup+0x65 in_delmulti(d3e24640,d3704560,c0186943,f5770e74,f5770e79) at in_delmulti+0x6a vxlan_multicast_cleanup(d3e8f000,f5770e74,770d4c,d046b999,d3e8f000) at vxlan_multicast_cleanup+0xbd vxlan_clone_destroy(d3e8f000,0,c0186943,d055d8cd,d961d7a0) at vxlan_clone_destroy+0x2a if_clone_destroy(f5770e74,0,d3e8f000,1,d3e8f000) at if_clone_destroy+0x7b ifioctl(d8e6beb0,80206979,f5770e74,d96ffcc0,d96f113c) at ifioctl+0x1ed soo_ioctl(d961ca08,80206979,f5770e74,d96ffcc0,d97b1f64) at soo_ioctl+0x21c sys_ioctl(d96ffcc0,f5770f5c,f5770f7c,0,f5770fa8) at sys_ioctl+0x19f syscall() at syscall+0x250 This time vxlan_multicast_cleanup() expects that it holds the lock. vxlan_clone_destroy() does an splnet(), but ip_output() expects to have the netlock also. bluhm
Re: if attach/detach netlocks
On Wed, Dec 21, 2016 at 12:45:37PM +0100, Mike Belopuhov wrote: > Anyways, OK for the diff below? OK bluhm@ > diff --git sys/net/if_pflow.c sys/net/if_pflow.c > index 0df0b69fd8a..8592d98a743 100644 > --- sys/net/if_pflow.c > +++ sys/net/if_pflow.c > @@ -265,14 +265,11 @@ pflow_clone_destroy(struct ifnet *ifp) > if (timeout_initialized(>sc_tmo_tmpl)) > timeout_del(>sc_tmo_tmpl); > pflow_flush(sc); > m_freem(sc->send_nam); > if (sc->so != NULL) { > - /* XXXSMP breaks atomicity */ > - rw_exit_write(); > error = soclose(sc->so); > - rw_enter_write(); > sc->so = NULL; > } > if (sc->sc_flowdst != NULL) > free(sc->sc_flowdst, M_DEVBUF, sc->sc_flowdst->sa_len); > if (sc->sc_flowsrc != NULL)
Re: if attach/detach netlocks
On Wed, Dec 21, 2016 at 00:13 +0100, Alexander Bluhm wrote: > On Tue, Dec 20, 2016 at 06:47:31PM +0100, Mike Belopuhov wrote: > > @@ -1109,11 +1115,16 @@ if_clone_destroy(const char *name) > > s = splnet(); > > if_down(ifp); > > splx(s); > > } > > > > - return ((*ifc->ifc_destroy)(ifp)); > > + /* XXXSMP breaks atomicity */ > > + rw_exit_write(); > > + ret = (*ifc->ifc_destroy)(ifp); > > + rw_enter_write(); > > + > > + return (ret); > > } > > This broke pflow again. > > panic: netlock: lock not held > Stopped at Debugger+0x7: leave >TIDPIDUID PRFLAGS PFLAGS CPU COMMAND > *440943 4323 0 0x2 01 ifconfig > Debugger(d09facfd,f57a2ca8,d09d23c6,f57a2ca8,d76fd000) at Debugger+0x7 > panic(d09d23c6,d09dc32f,f57a2cec,d76fd000,0) at panic+0x71 > rw_assert_wrlock(d0b56f38,40,f57a2cec,d03e506b,d8e1eb00) at > rw_assert_wrlock+0x > 3d > rw_exit_write(d0b56f38,0,7a2d4c,d046b949,d76fd000) at rw_exit_write+0x19 > pflow_clone_destroy(d76fd000,0,c0186943,d055d8ed,d96ad7a0) at > pflow_clone_destr > oy+0x71 > if_clone_destroy(f57a2e74,0,d76fd000,1,d76fd000) at if_clone_destroy+0x7b > ifioctl(d9587648,80206979,f57a2e74,d97e22d8,d956d26c) at ifioctl+0x1ed > soo_ioctl(d96405a8,80206979,f57a2e74,d97e22d8,d97b1f64) at soo_ioctl+0x21c > sys_ioctl(d97e22d8,f57a2f5c,f57a2f7c,0,f57a2fa8) at sys_ioctl+0x19f > syscall() at syscall+0x250 > --- syscall (number 9) --- > 0x14: > https://www.openbsd.org/ddb.html describes the minimum info required in bug > reports. Insufficient info makes it difficult to find and fix bugs. > ddb{1}> > > Now we have two conflicting XXXSMP workarounds in if_clone_destroy() > and pflow_clone_destroy(). > > bluhm With the diff above, the workaround in pflow_clone_destroy is not needed. The diff below fixes this for me. However, I start doubting this approach. A lot of code is now run with the lock released. While I agree that the amount of lock recursions should be minimized, I don't think that eliminating them completely is a necessity at this point. Recursive rwlock would work just fine. We can add a flag to make rrw_enter fail to recurse for the purpose of minimizing the amount of recursions. Haesbaert has experimented extensively with (r)rwlocks instead of KERNEL_LOCKs (which is what we're doing, sort of) and found out (r)rwlocks to be about 25% slower in real life applications (make build) and produce a ton of IPI calls due to sleep/wakeup cycles. He then moved on to turnstile exclusive recursive locks (bmtx) [1] with better semantics to regain performance and lower down the number of IPIs. Anyways, OK for the diff below? [1] https://github.com/bitrig/bitrig/blob/smpns/sys/kern/kern_bmtx.c diff --git sys/net/if_pflow.c sys/net/if_pflow.c index 0df0b69fd8a..8592d98a743 100644 --- sys/net/if_pflow.c +++ sys/net/if_pflow.c @@ -265,14 +265,11 @@ pflow_clone_destroy(struct ifnet *ifp) if (timeout_initialized(>sc_tmo_tmpl)) timeout_del(>sc_tmo_tmpl); pflow_flush(sc); m_freem(sc->send_nam); if (sc->so != NULL) { - /* XXXSMP breaks atomicity */ - rw_exit_write(); error = soclose(sc->so); - rw_enter_write(); sc->so = NULL; } if (sc->sc_flowdst != NULL) free(sc->sc_flowdst, M_DEVBUF, sc->sc_flowdst->sa_len); if (sc->sc_flowsrc != NULL)
Re: if attach/detach netlocks
On Tue, Dec 20, 2016 at 06:47:31PM +0100, Mike Belopuhov wrote: > @@ -1109,11 +1115,16 @@ if_clone_destroy(const char *name) > s = splnet(); > if_down(ifp); > splx(s); > } > > - return ((*ifc->ifc_destroy)(ifp)); > + /* XXXSMP breaks atomicity */ > + rw_exit_write(); > + ret = (*ifc->ifc_destroy)(ifp); > + rw_enter_write(); > + > + return (ret); > } This broke pflow again. panic: netlock: lock not held Stopped at Debugger+0x7: leave TIDPIDUID PRFLAGS PFLAGS CPU COMMAND *440943 4323 0 0x2 01 ifconfig Debugger(d09facfd,f57a2ca8,d09d23c6,f57a2ca8,d76fd000) at Debugger+0x7 panic(d09d23c6,d09dc32f,f57a2cec,d76fd000,0) at panic+0x71 rw_assert_wrlock(d0b56f38,40,f57a2cec,d03e506b,d8e1eb00) at rw_assert_wrlock+0x 3d rw_exit_write(d0b56f38,0,7a2d4c,d046b949,d76fd000) at rw_exit_write+0x19 pflow_clone_destroy(d76fd000,0,c0186943,d055d8ed,d96ad7a0) at pflow_clone_destr oy+0x71 if_clone_destroy(f57a2e74,0,d76fd000,1,d76fd000) at if_clone_destroy+0x7b ifioctl(d9587648,80206979,f57a2e74,d97e22d8,d956d26c) at ifioctl+0x1ed soo_ioctl(d96405a8,80206979,f57a2e74,d97e22d8,d97b1f64) at soo_ioctl+0x21c sys_ioctl(d97e22d8,f57a2f5c,f57a2f7c,0,f57a2fa8) at sys_ioctl+0x19f syscall() at syscall+0x250 --- syscall (number 9) --- 0x14: https://www.openbsd.org/ddb.html describes the minimum info required in bug reports. Insufficient info makes it difficult to find and fix bugs. ddb{1}> Now we have two conflicting XXXSMP workarounds in if_clone_destroy() and pflow_clone_destroy(). bluhm
Re: if attach/detach netlocks
On 20/12/16(Tue) 18:47, Mike Belopuhov wrote: > On Tue, Dec 20, 2016 at 17:06 +0100, Martin Pieuchot wrote: > > > > You'll need to release the lock before calling ifc->ifc_create in > > if_clone_create() and do the same dance in if_clone_destroy(). > > > > Indeed. > > > But I think that's the way to go. If you can create/destroy > > pseudo-interface without panicing we're good. > > > > Seems to work here, but it feels wrong. I don't think we can do much better, let's go with that we'll refine later. ok mpi@ > > > > void > > > @@ -941,6 +941,7 @@ if_detach(struct ifnet *ifp) > > > > > > ifq_clr_oactive(>if_snd); > > > > > > + rw_enter_write(); > > > s = splnet(); > > > /* Other CPUs must not have a reference before we start destroying. */ > > > if_idxmap_remove(ifp); > > > @@ -1017,6 +1018,7 @@ if_detach(struct ifnet *ifp) > > > /* Announce that the interface is gone. */ > > > rt_ifannouncemsg(ifp, IFAN_DEPARTURE); > > > splx(s); > > > + rw_exit_write(); > > > > > > ifq_destroy(>if_snd); > > > } > > > > Please keep the NET_LOCK/UNLOCK() macro. The places where we > > grab/release the `netlock' directly are places that need attention. > > Done. > > diff --git sys/net/if.c sys/net/if.c > index 1e103564710..bd8b9267d5f 100644 > --- sys/net/if.c > +++ sys/net/if.c > @@ -523,15 +523,15 @@ if_attachhead(struct ifnet *ifp) > void > if_attach(struct ifnet *ifp) > { > int s; > > - s = splsoftnet(); > + NET_LOCK(s); > if_attach_common(ifp); > TAILQ_INSERT_TAIL(, ifp, if_list); > if_attachsetup(ifp); > - splx(s); > + NET_UNLOCK(s); > } > > void > if_attach_common(struct ifnet *ifp) > { > @@ -939,10 +939,11 @@ if_detach(struct ifnet *ifp) > /* Undo pseudo-driver changes. */ > if_deactivate(ifp); > > ifq_clr_oactive(>if_snd); > > + NET_LOCK(s); > s = splnet(); > /* Other CPUs must not have a reference before we start destroying. */ > if_idxmap_remove(ifp); > > ifp->if_start = if_detached_start; > @@ -1015,10 +1016,11 @@ if_detach(struct ifnet *ifp) > } > > /* Announce that the interface is gone. */ > rt_ifannouncemsg(ifp, IFAN_DEPARTURE); > splx(s); > + NET_UNLOCK(s); > > ifq_destroy(>if_snd); > } > > /* > @@ -1068,11 +1070,14 @@ if_clone_create(const char *name, int rdomain) > return (EINVAL); > > if (ifunit(name) != NULL) > return (EEXIST); > > + /* XXXSMP breaks atomicity */ > + rw_exit_write(); > ret = (*ifc->ifc_create)(ifc, unit); > + rw_enter_write(); > > if (ret != 0 || (ifp = ifunit(name)) == NULL) > return (ret); > > if_addgroup(ifp, ifc->ifc_name); > @@ -1088,10 +1093,11 @@ if_clone_create(const char *name, int rdomain) > int > if_clone_destroy(const char *name) > { > struct if_clone *ifc; > struct ifnet *ifp; > + int ret; > > splsoftassert(IPL_SOFTNET); > > ifc = if_clone_lookup(name, NULL); > if (ifc == NULL) > @@ -1109,11 +1115,16 @@ if_clone_destroy(const char *name) > s = splnet(); > if_down(ifp); > splx(s); > } > > - return ((*ifc->ifc_destroy)(ifp)); > + /* XXXSMP breaks atomicity */ > + rw_exit_write(); > + ret = (*ifc->ifc_destroy)(ifp); > + rw_enter_write(); > + > + return (ret); > } > > /* > * Look up a network interface cloner. > */
Re: if attach/detach netlocks
On Tue, Dec 20, 2016 at 17:06 +0100, Martin Pieuchot wrote: > > You'll need to release the lock before calling ifc->ifc_create in > if_clone_create() and do the same dance in if_clone_destroy(). > Indeed. > But I think that's the way to go. If you can create/destroy > pseudo-interface without panicing we're good. > Seems to work here, but it feels wrong. > > void > > @@ -941,6 +941,7 @@ if_detach(struct ifnet *ifp) > > > > ifq_clr_oactive(>if_snd); > > > > + rw_enter_write(); > > s = splnet(); > > /* Other CPUs must not have a reference before we start destroying. */ > > if_idxmap_remove(ifp); > > @@ -1017,6 +1018,7 @@ if_detach(struct ifnet *ifp) > > /* Announce that the interface is gone. */ > > rt_ifannouncemsg(ifp, IFAN_DEPARTURE); > > splx(s); > > + rw_exit_write(); > > > > ifq_destroy(>if_snd); > > } > > Please keep the NET_LOCK/UNLOCK() macro. The places where we > grab/release the `netlock' directly are places that need attention. Done. diff --git sys/net/if.c sys/net/if.c index 1e103564710..bd8b9267d5f 100644 --- sys/net/if.c +++ sys/net/if.c @@ -523,15 +523,15 @@ if_attachhead(struct ifnet *ifp) void if_attach(struct ifnet *ifp) { int s; - s = splsoftnet(); + NET_LOCK(s); if_attach_common(ifp); TAILQ_INSERT_TAIL(, ifp, if_list); if_attachsetup(ifp); - splx(s); + NET_UNLOCK(s); } void if_attach_common(struct ifnet *ifp) { @@ -939,10 +939,11 @@ if_detach(struct ifnet *ifp) /* Undo pseudo-driver changes. */ if_deactivate(ifp); ifq_clr_oactive(>if_snd); + NET_LOCK(s); s = splnet(); /* Other CPUs must not have a reference before we start destroying. */ if_idxmap_remove(ifp); ifp->if_start = if_detached_start; @@ -1015,10 +1016,11 @@ if_detach(struct ifnet *ifp) } /* Announce that the interface is gone. */ rt_ifannouncemsg(ifp, IFAN_DEPARTURE); splx(s); + NET_UNLOCK(s); ifq_destroy(>if_snd); } /* @@ -1068,11 +1070,14 @@ if_clone_create(const char *name, int rdomain) return (EINVAL); if (ifunit(name) != NULL) return (EEXIST); + /* XXXSMP breaks atomicity */ + rw_exit_write(); ret = (*ifc->ifc_create)(ifc, unit); + rw_enter_write(); if (ret != 0 || (ifp = ifunit(name)) == NULL) return (ret); if_addgroup(ifp, ifc->ifc_name); @@ -1088,10 +1093,11 @@ if_clone_create(const char *name, int rdomain) int if_clone_destroy(const char *name) { struct if_clone *ifc; struct ifnet *ifp; + int ret; splsoftassert(IPL_SOFTNET); ifc = if_clone_lookup(name, NULL); if (ifc == NULL) @@ -1109,11 +1115,16 @@ if_clone_destroy(const char *name) s = splnet(); if_down(ifp); splx(s); } - return ((*ifc->ifc_destroy)(ifp)); + /* XXXSMP breaks atomicity */ + rw_exit_write(); + ret = (*ifc->ifc_destroy)(ifp); + rw_enter_write(); + + return (ret); } /* * Look up a network interface cloner. */
Re: if attach/detach netlocks
On 20/12/16(Tue) 16:33, Mike Belopuhov wrote: > Attach: > > splassert: sowakeup: want 1 have 0 > Starting stack trace... > sowakeup() at sowakeup+0x41 > sorwakeup() at sorwakeup+0xb4 > route_input() at route_input+0x2d7 > if_attach() at if_attach+0x58 > xnf_attach() at xnf_attach+0x45f > config_attach() at config_attach+0x1bc > xen_attach_device() at xen_attach_device+0x146 > xen_hotplug() at xen_hotplug+0x23f > taskq_thread() at taskq_thread+0x6c > end trace frame: 0x0, count: 248 > End of stack trace. > > Detach: > > splassert: sowakeup: want 1 have 0 > Starting stack trace... > sowakeup() at sowakeup+0x41 > sorwakeup() at sorwakeup+0xb4 > route_input() at route_input+0x2d7 > if_detach() at if_detach+0x27e > xnf_detach() at xnf_detach+0x4d > config_detach() at config_detach+0x13c > xen_hotplug() at xen_hotplug+0x2e5 > taskq_thread() at taskq_thread+0x6c > end trace frame: 0x0, count: 249 > End of stack trace. > xnf2 detached > > The taskq running xen_hotplug is KERNEL_LOCK'ed. The diff below solves > the issue, but is it any good? > > diff --git sys/net/if.c sys/net/if.c > index 1e103564710..a5c25b89560 100644 > --- sys/net/if.c > +++ sys/net/if.c > @@ -525,11 +525,11 @@ if_attach(struct ifnet *ifp) > { > int s; > > - s = splsoftnet(); > + NET_LOCK(s); > if_attach_common(ifp); > TAILQ_INSERT_TAIL(, ifp, if_list); > if_attachsetup(ifp); > - splx(s); > + NET_UNLOCK(s); You'll need to release the lock before calling ifc->ifc_create in if_clone_create() and do the same dance in if_clone_destroy(). But I think that's the way to go. If you can create/destroy pseudo-interface without panicing we're good. > void > @@ -941,6 +941,7 @@ if_detach(struct ifnet *ifp) > > ifq_clr_oactive(>if_snd); > > + rw_enter_write(); > s = splnet(); > /* Other CPUs must not have a reference before we start destroying. */ > if_idxmap_remove(ifp); > @@ -1017,6 +1018,7 @@ if_detach(struct ifnet *ifp) > /* Announce that the interface is gone. */ > rt_ifannouncemsg(ifp, IFAN_DEPARTURE); > splx(s); > + rw_exit_write(); > > ifq_destroy(>if_snd); > } Please keep the NET_LOCK/UNLOCK() macro. The places where we grab/release the `netlock' directly are places that need attention.