Re: if attach/detach netlocks

2017-01-03 Thread Mike Belopuhov
On 3 January 2017 at 12:06, Reyk Floeter  wrote:
> On Tue, Jan 03, 2017 at 11:42:21AM +0100, Martin Pieuchot wrote:
>> On 02/01/17(Mon) 21:51, Mike Belopuhov wrote:
>> > I got to test the diff and I had to make another adjustment:
>> > vxlan_if_change is setup as a detach hook, however dohooks is
>> > called very early in if_detach before we remove IP addresses
>> > from the interface.  It makes vxlan_config find these IP
>> > addresses just fine and re-add its own detach hook again.
>>
>> Why not fix vxlan_if_change()?
>>
>> >   This
>> > repeats ad infinitum hogging the machine.  I couldn't think of
>> > anything better than deferring an operation via a task.  Seems
>> > to do the trick.
>>
>> That's ugly.  Why would you re-add anything in a detach hook?  This
>> is obviously broken.
>>
>
> It is used for the multicast address (only) and it made sense when the
> source address wasn't mandatory, eg. 0.0.0.0 -> 239.1.1.100 would find
> the next interface that points to the group after detaching the
> current one.
>
> The source is mandatory now, so the diff below should work as well.
>
> Reyk
>

Thanks, this looks good, OK mikeb



Re: if attach/detach netlocks

2017-01-03 Thread Reyk Floeter
On Tue, Jan 03, 2017 at 11:42:21AM +0100, Martin Pieuchot wrote:
> On 02/01/17(Mon) 21:51, Mike Belopuhov wrote:
> > On Fri, Dec 30, 2016 at 18:57 +0100, Mike Belopuhov wrote:
> > > On Thu, Dec 29, 2016 at 09:30 +0100, Martin Pieuchot wrote:
> > > > On 29/12/16(Thu) 01:15, Alexander Bluhm wrote:
> > > > > On Fri, Dec 23, 2016 at 12:09:32AM +0100, Martin Pieuchot wrote:
> > > > > > On 22/12/16(Thu) 20:45, Mike Belopuhov wrote:
> > > > > > > I think this is what is required here.  Works here, but YMMV.
> > > > > > 
> > > > > > splnet() in a pseudo-driver seems completely wrong, you could get 
> > > > > > rid of
> > > > > > it.
> > > > > 
> > > > > Yes, but that is another issue.  Can we get the netlock splasserts
> > > > > fixed first?  This diff looks good to me.
> > > > 
> > > > Sure I'm ok with the diff.
> > > > 
> > > 
> > > I agree with Martin and have cooked a diff but couldn't test it yet.
> > > This is it for the reference.
> > > 
> > 
> > I got to test the diff and I had to make another adjustment:
> > vxlan_if_change is setup as a detach hook, however dohooks is
> > called very early in if_detach before we remove IP addresses
> > from the interface.  It makes vxlan_config find these IP
> > addresses just fine and re-add its own detach hook again.
> 
> Why not fix vxlan_if_change()?
> 
> >   This
> > repeats ad infinitum hogging the machine.  I couldn't think of
> > anything better than deferring an operation via a task.  Seems
> > to do the trick.
> 
> That's ugly.  Why would you re-add anything in a detach hook?  This
> is obviously broken.
> 

It is used for the multicast address (only) and it made sense when the
source address wasn't mandatory, eg. 0.0.0.0 -> 239.1.1.100 would find
the next interface that points to the group after detaching the
current one.

The source is mandatory now, so the diff below should work as well.

Reyk

Index: sys/net/if_vxlan.c
===
RCS file: /cvs/src/sys/net/if_vxlan.c,v
retrieving revision 1.54
diff -u -p -u -p -r1.54 if_vxlan.c
--- sys/net/if_vxlan.c  13 Dec 2016 06:51:11 -  1.54
+++ sys/net/if_vxlan.c  3 Jan 2017 11:05:36 -
@@ -947,17 +947,16 @@ vxlan_if_change(void *arg)
 {
struct vxlan_softc  *sc = arg;
struct ifnet*ifp = >sc_ac.ac_if;
-   int  s, error;
+   int  s;
 
/*
 * Reset the configuration after the parent interface disappeared.
 */
s = splnet();
-   if ((error = vxlan_config(ifp, NULL, NULL)) != 0) {
-   /* The configured tunnel addresses are invalid, remove them */
-   bzero(>sc_src, sizeof(sc->sc_src));
-   bzero(>sc_dst, sizeof(sc->sc_dst));
-   }
+   vxlan_multicast_cleanup(ifp);
+   memset(>sc_src, 0, sizeof(sc->sc_src));
+   memset(>sc_dst, 0, sizeof(sc->sc_dst));
+   sc->sc_dstport = htons(VXLAN_PORT);
splx(s);
 }
 



Re: if attach/detach netlocks

2017-01-03 Thread Martin Pieuchot
On 02/01/17(Mon) 21:51, Mike Belopuhov wrote:
> On Fri, Dec 30, 2016 at 18:57 +0100, Mike Belopuhov wrote:
> > On Thu, Dec 29, 2016 at 09:30 +0100, Martin Pieuchot wrote:
> > > On 29/12/16(Thu) 01:15, Alexander Bluhm wrote:
> > > > On Fri, Dec 23, 2016 at 12:09:32AM +0100, Martin Pieuchot wrote:
> > > > > On 22/12/16(Thu) 20:45, Mike Belopuhov wrote:
> > > > > > I think this is what is required here.  Works here, but YMMV.
> > > > > 
> > > > > splnet() in a pseudo-driver seems completely wrong, you could get rid 
> > > > > of
> > > > > it.
> > > > 
> > > > Yes, but that is another issue.  Can we get the netlock splasserts
> > > > fixed first?  This diff looks good to me.
> > > 
> > > Sure I'm ok with the diff.
> > > 
> > 
> > I agree with Martin and have cooked a diff but couldn't test it yet.
> > This is it for the reference.
> > 
> 
> I got to test the diff and I had to make another adjustment:
> vxlan_if_change is setup as a detach hook, however dohooks is
> called very early in if_detach before we remove IP addresses
> from the interface.  It makes vxlan_config find these IP
> addresses just fine and re-add its own detach hook again.

Why not fix vxlan_if_change()?

>   This
> repeats ad infinitum hogging the machine.  I couldn't think of
> anything better than deferring an operation via a task.  Seems
> to do the trick.

That's ugly.  Why would you re-add anything in a detach hook?  This
is obviously broken.



Re: if attach/detach netlocks

2017-01-02 Thread Mike Belopuhov
On Fri, Dec 30, 2016 at 18:57 +0100, Mike Belopuhov wrote:
> On Thu, Dec 29, 2016 at 09:30 +0100, Martin Pieuchot wrote:
> > On 29/12/16(Thu) 01:15, Alexander Bluhm wrote:
> > > On Fri, Dec 23, 2016 at 12:09:32AM +0100, Martin Pieuchot wrote:
> > > > On 22/12/16(Thu) 20:45, Mike Belopuhov wrote:
> > > > > I think this is what is required here.  Works here, but YMMV.
> > > > 
> > > > splnet() in a pseudo-driver seems completely wrong, you could get rid of
> > > > it.
> > > 
> > > Yes, but that is another issue.  Can we get the netlock splasserts
> > > fixed first?  This diff looks good to me.
> > 
> > Sure I'm ok with the diff.
> > 
> 
> I agree with Martin and have cooked a diff but couldn't test it yet.
> This is it for the reference.
> 

I got to test the diff and I had to make another adjustment:
vxlan_if_change is setup as a detach hook, however dohooks is
called very early in if_detach before we remove IP addresses
from the interface.  It makes vxlan_config find these IP
addresses just fine and re-add its own detach hook again. This
repeats ad infinitum hogging the machine.  I couldn't think of
anything better than deferring an operation via a task.  Seems
to do the trick.

OK?

diff --git sys/net/if_vxlan.c sys/net/if_vxlan.c
index e9bc1cb8305..082e4ab5814 100644
--- sys/net/if_vxlan.c
+++ sys/net/if_vxlan.c
@@ -71,10 +71,12 @@ struct vxlan_softc {
in_port_tsc_dstport;
u_intsc_rdomain;
int64_t  sc_vnetid;
u_int8_t sc_ttl;
 
+   struct task  sc_task;
+
LIST_ENTRY(vxlan_softc)  sc_entry;
 };
 
 voidvxlanattach(int);
 int vxlanioctl(struct ifnet *, u_long, caddr_t);
@@ -89,10 +91,11 @@ void vxlan_media_status(struct ifnet *, struct 
ifmediareq *);
 int vxlan_config(struct ifnet *, struct sockaddr *, struct sockaddr *);
 int vxlan_output(struct ifnet *, struct mbuf *);
 voidvxlan_addr_change(void *);
 voidvxlan_if_change(void *);
 voidvxlan_link_change(void *);
+voidvxlan_config_defer(void *);
 
 int vxlan_sockaddr_cmp(struct sockaddr *, struct sockaddr *);
 uint16_t vxlan_sockaddr_port(struct sockaddr *);
 
 struct if_clonevxlan_cloner =
@@ -166,10 +169,12 @@ vxlan_clone_create(struct if_clone *ifc, int unit)
 */
ifp->if_mtu = ETHERMTU - sizeof(struct ether_header);
ifp->if_mtu -= sizeof(struct vxlanudphdr) + sizeof(struct ipovly);
 #endif
 
+   task_set(>sc_task, vxlan_config_defer, sc);
+
LIST_INSERT_HEAD(_tagh[VXLAN_TAGHASH(0)], sc, sc_entry);
vxlan_enable++;
 
return (0);
 }
@@ -178,13 +183,13 @@ int
 vxlan_clone_destroy(struct ifnet *ifp)
 {
struct vxlan_softc  *sc = ifp->if_softc;
int  s;
 
-   s = splnet();
+   NET_LOCK(s);
vxlan_multicast_cleanup(ifp);
-   splx(s);
+   NET_UNLOCK(s);
 
vxlan_enable--;
LIST_REMOVE(sc, sc_entry);
 
ifmedia_delete_instance(>sc_media, IFM_INST_ANY);
@@ -392,11 +397,11 @@ int
 vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
 {
struct vxlan_softc  *sc = (struct vxlan_softc *)ifp->if_softc;
struct ifreq*ifr = (struct ifreq *)data;
struct if_laddrreq  *lifr = (struct if_laddrreq *)data;
-   int  error = 0, s;
+   int  error = 0;
 
switch (cmd) {
case SIOCSIFADDR:
ifp->if_flags |= IFF_UP;
/* FALLTHROUGH */
@@ -417,24 +422,20 @@ vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
case SIOCSIFMEDIA:
error = ifmedia_ioctl(ifp, ifr, >sc_media, cmd);
break;
 
case SIOCSLIFPHYADDR:
-   s = splnet();
error = vxlan_config(ifp,
(struct sockaddr *)>addr,
(struct sockaddr *)>dstaddr);
-   splx(s);
break;
 
case SIOCDIFPHYADDR:
-   s = splnet();
vxlan_multicast_cleanup(ifp);
bzero(>sc_src, sizeof(sc->sc_src));
bzero(>sc_dst, sizeof(sc->sc_dst));
sc->sc_dstport = htons(VXLAN_PORT);
-   splx(s);
break;
 
case SIOCGLIFPHYADDR:
if (sc->sc_dst.ss_family == AF_UNSPEC) {
error = EADDRNOTAVAIL;
@@ -451,14 +452,12 @@ vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
ifr->ifr_rdomainid > RT_TABLEID_MAX ||
!rtable_exists(ifr->ifr_rdomainid)) {
error = EINVAL;
break;
}
-   s = splnet();
sc->sc_rdomain = ifr->ifr_rdomainid;
(void)vxlan_config(ifp, NULL, NULL);
-   splx(s);
break;
 
case SIOCGLIFPHYRTABLE:
ifr->ifr_rdomainid = 

Re: if attach/detach netlocks

2016-12-30 Thread Mike Belopuhov
On Thu, Dec 29, 2016 at 09:30 +0100, Martin Pieuchot wrote:
> On 29/12/16(Thu) 01:15, Alexander Bluhm wrote:
> > On Fri, Dec 23, 2016 at 12:09:32AM +0100, Martin Pieuchot wrote:
> > > On 22/12/16(Thu) 20:45, Mike Belopuhov wrote:
> > > > I think this is what is required here.  Works here, but YMMV.
> > > 
> > > splnet() in a pseudo-driver seems completely wrong, you could get rid of
> > > it.
> > 
> > Yes, but that is another issue.  Can we get the netlock splasserts
> > fixed first?  This diff looks good to me.
> 
> Sure I'm ok with the diff.
> 

I agree with Martin and have cooked a diff but couldn't test it yet.
This is it for the reference.

diff --git sys/net/if_vxlan.c sys/net/if_vxlan.c
index e9bc1cb8305..74437b8b79c 100644
--- sys/net/if_vxlan.c
+++ sys/net/if_vxlan.c
@@ -178,13 +178,13 @@ int
 vxlan_clone_destroy(struct ifnet *ifp)
 {
struct vxlan_softc  *sc = ifp->if_softc;
int  s;
 
-   s = splnet();
+   NET_LOCK(s);
vxlan_multicast_cleanup(ifp);
-   splx(s);
+   NET_UNLOCK(s);
 
vxlan_enable--;
LIST_REMOVE(sc, sc_entry);
 
ifmedia_delete_instance(>sc_media, IFM_INST_ANY);
@@ -392,11 +392,11 @@ int
 vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
 {
struct vxlan_softc  *sc = (struct vxlan_softc *)ifp->if_softc;
struct ifreq*ifr = (struct ifreq *)data;
struct if_laddrreq  *lifr = (struct if_laddrreq *)data;
-   int  error = 0, s;
+   int  error = 0;
 
switch (cmd) {
case SIOCSIFADDR:
ifp->if_flags |= IFF_UP;
/* FALLTHROUGH */
@@ -417,24 +417,20 @@ vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
case SIOCSIFMEDIA:
error = ifmedia_ioctl(ifp, ifr, >sc_media, cmd);
break;
 
case SIOCSLIFPHYADDR:
-   s = splnet();
error = vxlan_config(ifp,
(struct sockaddr *)>addr,
(struct sockaddr *)>dstaddr);
-   splx(s);
break;
 
case SIOCDIFPHYADDR:
-   s = splnet();
vxlan_multicast_cleanup(ifp);
bzero(>sc_src, sizeof(sc->sc_src));
bzero(>sc_dst, sizeof(sc->sc_dst));
sc->sc_dstport = htons(VXLAN_PORT);
-   splx(s);
break;
 
case SIOCGLIFPHYADDR:
if (sc->sc_dst.ss_family == AF_UNSPEC) {
error = EADDRNOTAVAIL;
@@ -451,14 +447,12 @@ vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
ifr->ifr_rdomainid > RT_TABLEID_MAX ||
!rtable_exists(ifr->ifr_rdomainid)) {
error = EINVAL;
break;
}
-   s = splnet();
sc->sc_rdomain = ifr->ifr_rdomainid;
(void)vxlan_config(ifp, NULL, NULL);
-   splx(s);
break;
 
case SIOCGLIFPHYRTABLE:
ifr->ifr_rdomainid = sc->sc_rdomain;
break;
@@ -468,14 +462,12 @@ vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
error = EINVAL;
break;
}
if (sc->sc_ttl == (u_int8_t)ifr->ifr_ttl)
break;
-   s = splnet();
sc->sc_ttl = (u_int8_t)(ifr->ifr_ttl);
(void)vxlan_config(ifp, NULL, NULL);
-   splx(s);
break;
 
case SIOCGLIFPHYTTL:
ifr->ifr_ttl = (int)sc->sc_ttl;
break;
@@ -489,14 +481,12 @@ vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
 ifr->ifr_vnetid < VXLAN_VNI_MIN)) {
error = EINVAL;
break;
}
 
-   s = splnet();
sc->sc_vnetid = (int)ifr->ifr_vnetid;
(void)vxlan_config(ifp, NULL, NULL);
-   splx(s);
break;
 
case SIOCGVNETID:
if ((sc->sc_vnetid != VXLAN_VNI_ANY) &&
(sc->sc_vnetid > VXLAN_VNI_MAX ||
@@ -507,14 +497,12 @@ vxlanioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
 
ifr->ifr_vnetid = sc->sc_vnetid;
break;
 
case SIOCDVNETID:
-   s = splnet();
sc->sc_vnetid = VXLAN_VNI_UNSET;
(void)vxlan_config(ifp, NULL, NULL);
-   splx(s);
break;
 
default:
error = ether_ioctl(ifp, >sc_ac, cmd, data);
break;
@@ -923,57 +911,56 @@ vxlan_output(struct ifnet *ifp, struct mbuf *m)
 void
 vxlan_addr_change(void *arg)
 {
struct vxlan_softc  *sc = arg;
struct ifnet*ifp = >sc_ac.ac_if;
-   int  s, error;
+   int  

Re: if attach/detach netlocks

2016-12-29 Thread Martin Pieuchot
On 29/12/16(Thu) 01:15, Alexander Bluhm wrote:
> On Fri, Dec 23, 2016 at 12:09:32AM +0100, Martin Pieuchot wrote:
> > On 22/12/16(Thu) 20:45, Mike Belopuhov wrote:
> > > I think this is what is required here.  Works here, but YMMV.
> > 
> > splnet() in a pseudo-driver seems completely wrong, you could get rid of
> > it.
> 
> Yes, but that is another issue.  Can we get the netlock splasserts
> fixed first?  This diff looks good to me.

Sure I'm ok with the diff.

> > > diff --git sys/net/if_vxlan.c sys/net/if_vxlan.c
> > > index e9bc1cb8305..dfb71cf9467 100644
> > > --- sys/net/if_vxlan.c
> > > +++ sys/net/if_vxlan.c
> > > @@ -178,13 +178,15 @@ int
> > >  vxlan_clone_destroy(struct ifnet *ifp)
> > >  {
> > >   struct vxlan_softc  *sc = ifp->if_softc;
> > >   int  s;
> > >  
> > > + NET_LOCK(s);
> > >   s = splnet();
> > >   vxlan_multicast_cleanup(ifp);
> > >   splx(s);
> > > + NET_UNLOCK(s);
> > >  
> > >   vxlan_enable--;
> > >   LIST_REMOVE(sc, sc_entry);
> > >  
> > >   ifmedia_delete_instance(>sc_media, IFM_INST_ANY);
> 



Re: if attach/detach netlocks

2016-12-28 Thread Alexander Bluhm
On Fri, Dec 23, 2016 at 12:09:32AM +0100, Martin Pieuchot wrote:
> On 22/12/16(Thu) 20:45, Mike Belopuhov wrote:
> > I think this is what is required here.  Works here, but YMMV.
> 
> splnet() in a pseudo-driver seems completely wrong, you could get rid of
> it.

Yes, but that is another issue.  Can we get the netlock splasserts
fixed first?  This diff looks good to me.

bluhm

> > diff --git sys/net/if_vxlan.c sys/net/if_vxlan.c
> > index e9bc1cb8305..dfb71cf9467 100644
> > --- sys/net/if_vxlan.c
> > +++ sys/net/if_vxlan.c
> > @@ -178,13 +178,15 @@ int
> >  vxlan_clone_destroy(struct ifnet *ifp)
> >  {
> > struct vxlan_softc  *sc = ifp->if_softc;
> > int  s;
> >  
> > +   NET_LOCK(s);
> > s = splnet();
> > vxlan_multicast_cleanup(ifp);
> > splx(s);
> > +   NET_UNLOCK(s);
> >  
> > vxlan_enable--;
> > LIST_REMOVE(sc, sc_entry);
> >  
> > ifmedia_delete_instance(>sc_media, IFM_INST_ANY);



Re: if attach/detach netlocks

2016-12-22 Thread Martin Pieuchot
On 22/12/16(Thu) 20:45, Mike Belopuhov wrote:
> On Wed, Dec 21, 2016 at 13:06 +0100, Alexander Bluhm wrote:
> > On Wed, Dec 21, 2016 at 12:45:37PM +0100, Mike Belopuhov wrote:
> > > Anyways, OK for the diff below?
> > 
> > OK bluhm@
> > 
> > > diff --git sys/net/if_pflow.c sys/net/if_pflow.c
> > > index 0df0b69fd8a..8592d98a743 100644
> > > --- sys/net/if_pflow.c
> > > +++ sys/net/if_pflow.c
> > > @@ -265,14 +265,11 @@ pflow_clone_destroy(struct ifnet *ifp)
> > >   if (timeout_initialized(>sc_tmo_tmpl))
> > >   timeout_del(>sc_tmo_tmpl);
> > >   pflow_flush(sc);
> > >   m_freem(sc->send_nam);
> > >   if (sc->so != NULL) {
> > > - /* XXXSMP breaks atomicity */
> > > - rw_exit_write();
> > >   error = soclose(sc->so);
> > > - rw_enter_write();
> > >   sc->so = NULL;
> > >   }
> > >   if (sc->sc_flowdst != NULL)
> > >   free(sc->sc_flowdst, M_DEVBUF, sc->sc_flowdst->sa_len);
> > >   if (sc->sc_flowsrc != NULL)
> 
> I think this is what is required here.  Works here, but YMMV.

splnet() in a pseudo-driver seems completely wrong, you could get rid of
it.

> 
> diff --git sys/net/if_vxlan.c sys/net/if_vxlan.c
> index e9bc1cb8305..dfb71cf9467 100644
> --- sys/net/if_vxlan.c
> +++ sys/net/if_vxlan.c
> @@ -178,13 +178,15 @@ int
>  vxlan_clone_destroy(struct ifnet *ifp)
>  {
>   struct vxlan_softc  *sc = ifp->if_softc;
>   int  s;
>  
> + NET_LOCK(s);
>   s = splnet();
>   vxlan_multicast_cleanup(ifp);
>   splx(s);
> + NET_UNLOCK(s);
>  
>   vxlan_enable--;
>   LIST_REMOVE(sc, sc_entry);
>  
>   ifmedia_delete_instance(>sc_media, IFM_INST_ANY);



Re: if attach/detach netlocks

2016-12-22 Thread Mike Belopuhov
On Wed, Dec 21, 2016 at 13:06 +0100, Alexander Bluhm wrote:
> On Wed, Dec 21, 2016 at 12:45:37PM +0100, Mike Belopuhov wrote:
> > Anyways, OK for the diff below?
> 
> OK bluhm@
> 
> > diff --git sys/net/if_pflow.c sys/net/if_pflow.c
> > index 0df0b69fd8a..8592d98a743 100644
> > --- sys/net/if_pflow.c
> > +++ sys/net/if_pflow.c
> > @@ -265,14 +265,11 @@ pflow_clone_destroy(struct ifnet *ifp)
> > if (timeout_initialized(>sc_tmo_tmpl))
> > timeout_del(>sc_tmo_tmpl);
> > pflow_flush(sc);
> > m_freem(sc->send_nam);
> > if (sc->so != NULL) {
> > -   /* XXXSMP breaks atomicity */
> > -   rw_exit_write();
> > error = soclose(sc->so);
> > -   rw_enter_write();
> > sc->so = NULL;
> > }
> > if (sc->sc_flowdst != NULL)
> > free(sc->sc_flowdst, M_DEVBUF, sc->sc_flowdst->sa_len);
> > if (sc->sc_flowsrc != NULL)

I think this is what is required here.  Works here, but YMMV.

diff --git sys/net/if_vxlan.c sys/net/if_vxlan.c
index e9bc1cb8305..dfb71cf9467 100644
--- sys/net/if_vxlan.c
+++ sys/net/if_vxlan.c
@@ -178,13 +178,15 @@ int
 vxlan_clone_destroy(struct ifnet *ifp)
 {
struct vxlan_softc  *sc = ifp->if_softc;
int  s;
 
+   NET_LOCK(s);
s = splnet();
vxlan_multicast_cleanup(ifp);
splx(s);
+   NET_UNLOCK(s);
 
vxlan_enable--;
LIST_REMOVE(sc, sc_entry);
 
ifmedia_delete_instance(>sc_media, IFM_INST_ANY);



Re: if attach/detach netlocks

2016-12-21 Thread Alexander Bluhm
On Tue, Dec 20, 2016 at 07:33:35PM +0100, Martin Pieuchot wrote:
> On 20/12/16(Tue) 18:47, Mike Belopuhov wrote:
> > On Tue, Dec 20, 2016 at 17:06 +0100, Martin Pieuchot wrote:
> > > 
> > > You'll need to release the lock before calling ifc->ifc_create in
> > > if_clone_create() and do the same dance in if_clone_destroy().
> > >
> > 
> > Indeed.
> > 
> > > But I think that's the way to go.  If you can create/destroy
> > > pseudo-interface without panicing we're good.
> > >
> > 
> > Seems to work here, but it feels wrong.
> 
> I don't think we can do much better, let's go with that we'll refine
> later.

And another one.

splassert: ip_output: want 1 have 0
Starting stack trace...
ip_output(1,0,d09e32c1,d05129ab,d1001000) at ip_output+0x6d
ip_output(d9766c00,d9804200,0,0,f5770c10) at ip_output+0x6d
igmp_sendpkt(d3e9d800,d3e24640,17,2e0,d3704560) at igmp_sendpkt+0x119
igmp_leavegroup(d3e24640,d09e29de,d0bbe520,d03cdcb6,0) at igmp_leavegroup+0x65
in_delmulti(d3e24640,d3704560,c0186943,f5770e74,f5770e79) at in_delmulti+0x6a
vxlan_multicast_cleanup(d3e8f000,f5770e74,770d4c,d046b999,d3e8f000) at 
vxlan_multicast_cleanup+0xbd
vxlan_clone_destroy(d3e8f000,0,c0186943,d055d8cd,d961d7a0) at 
vxlan_clone_destroy+0x2a
if_clone_destroy(f5770e74,0,d3e8f000,1,d3e8f000) at if_clone_destroy+0x7b
ifioctl(d8e6beb0,80206979,f5770e74,d96ffcc0,d96f113c) at ifioctl+0x1ed
soo_ioctl(d961ca08,80206979,f5770e74,d96ffcc0,d97b1f64) at soo_ioctl+0x21c
sys_ioctl(d96ffcc0,f5770f5c,f5770f7c,0,f5770fa8) at sys_ioctl+0x19f
syscall() at syscall+0x250

This time vxlan_multicast_cleanup() expects that it holds the lock.
vxlan_clone_destroy() does an splnet(), but ip_output() expects to
have the netlock also.

bluhm



Re: if attach/detach netlocks

2016-12-21 Thread Alexander Bluhm
On Wed, Dec 21, 2016 at 12:45:37PM +0100, Mike Belopuhov wrote:
> Anyways, OK for the diff below?

OK bluhm@

> diff --git sys/net/if_pflow.c sys/net/if_pflow.c
> index 0df0b69fd8a..8592d98a743 100644
> --- sys/net/if_pflow.c
> +++ sys/net/if_pflow.c
> @@ -265,14 +265,11 @@ pflow_clone_destroy(struct ifnet *ifp)
>   if (timeout_initialized(>sc_tmo_tmpl))
>   timeout_del(>sc_tmo_tmpl);
>   pflow_flush(sc);
>   m_freem(sc->send_nam);
>   if (sc->so != NULL) {
> - /* XXXSMP breaks atomicity */
> - rw_exit_write();
>   error = soclose(sc->so);
> - rw_enter_write();
>   sc->so = NULL;
>   }
>   if (sc->sc_flowdst != NULL)
>   free(sc->sc_flowdst, M_DEVBUF, sc->sc_flowdst->sa_len);
>   if (sc->sc_flowsrc != NULL)



Re: if attach/detach netlocks

2016-12-21 Thread Mike Belopuhov
On Wed, Dec 21, 2016 at 00:13 +0100, Alexander Bluhm wrote:
> On Tue, Dec 20, 2016 at 06:47:31PM +0100, Mike Belopuhov wrote:
> > @@ -1109,11 +1115,16 @@ if_clone_destroy(const char *name)
> > s = splnet();
> > if_down(ifp);
> > splx(s);
> > }
> >  
> > -   return ((*ifc->ifc_destroy)(ifp));
> > +   /* XXXSMP breaks atomicity */
> > +   rw_exit_write();
> > +   ret = (*ifc->ifc_destroy)(ifp);
> > +   rw_enter_write();
> > +
> > +   return (ret);
> >  }
> 
> This broke pflow again.
> 
> panic: netlock: lock not held
> Stopped at  Debugger+0x7:   leave
>TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
> *440943   4323  0 0x2  01  ifconfig
> Debugger(d09facfd,f57a2ca8,d09d23c6,f57a2ca8,d76fd000) at Debugger+0x7
> panic(d09d23c6,d09dc32f,f57a2cec,d76fd000,0) at panic+0x71
> rw_assert_wrlock(d0b56f38,40,f57a2cec,d03e506b,d8e1eb00) at 
> rw_assert_wrlock+0x
> 3d
> rw_exit_write(d0b56f38,0,7a2d4c,d046b949,d76fd000) at rw_exit_write+0x19
> pflow_clone_destroy(d76fd000,0,c0186943,d055d8ed,d96ad7a0) at 
> pflow_clone_destr
> oy+0x71
> if_clone_destroy(f57a2e74,0,d76fd000,1,d76fd000) at if_clone_destroy+0x7b
> ifioctl(d9587648,80206979,f57a2e74,d97e22d8,d956d26c) at ifioctl+0x1ed
> soo_ioctl(d96405a8,80206979,f57a2e74,d97e22d8,d97b1f64) at soo_ioctl+0x21c
> sys_ioctl(d97e22d8,f57a2f5c,f57a2f7c,0,f57a2fa8) at sys_ioctl+0x19f
> syscall() at syscall+0x250
> --- syscall (number 9) ---
> 0x14:
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{1}> 
> 
> Now we have two conflicting XXXSMP workarounds in if_clone_destroy()
> and pflow_clone_destroy().
> 
> bluhm

With the diff above, the workaround in pflow_clone_destroy is not
needed.  The diff below fixes this for me.  However, I start doubting
this approach.  A lot of code is now run with the lock released.
While I agree that the amount of lock recursions should be minimized,
I don't think that eliminating them completely is a necessity at this
point.

Recursive rwlock would work just fine.  We can add a flag to make
rrw_enter fail to recurse for the purpose of minimizing the amount of
recursions.  Haesbaert has experimented extensively with (r)rwlocks
instead of KERNEL_LOCKs (which is what we're doing, sort of) and
found out (r)rwlocks to be about 25% slower in real life applications
(make build) and produce a ton of IPI calls due to sleep/wakeup cycles.
He then moved on to turnstile exclusive recursive locks (bmtx) [1] with
better semantics to regain performance and lower down the number of
IPIs.

Anyways, OK for the diff below?

[1] https://github.com/bitrig/bitrig/blob/smpns/sys/kern/kern_bmtx.c

diff --git sys/net/if_pflow.c sys/net/if_pflow.c
index 0df0b69fd8a..8592d98a743 100644
--- sys/net/if_pflow.c
+++ sys/net/if_pflow.c
@@ -265,14 +265,11 @@ pflow_clone_destroy(struct ifnet *ifp)
if (timeout_initialized(>sc_tmo_tmpl))
timeout_del(>sc_tmo_tmpl);
pflow_flush(sc);
m_freem(sc->send_nam);
if (sc->so != NULL) {
-   /* XXXSMP breaks atomicity */
-   rw_exit_write();
error = soclose(sc->so);
-   rw_enter_write();
sc->so = NULL;
}
if (sc->sc_flowdst != NULL)
free(sc->sc_flowdst, M_DEVBUF, sc->sc_flowdst->sa_len);
if (sc->sc_flowsrc != NULL)



Re: if attach/detach netlocks

2016-12-20 Thread Alexander Bluhm
On Tue, Dec 20, 2016 at 06:47:31PM +0100, Mike Belopuhov wrote:
> @@ -1109,11 +1115,16 @@ if_clone_destroy(const char *name)
>   s = splnet();
>   if_down(ifp);
>   splx(s);
>   }
>  
> - return ((*ifc->ifc_destroy)(ifp));
> + /* XXXSMP breaks atomicity */
> + rw_exit_write();
> + ret = (*ifc->ifc_destroy)(ifp);
> + rw_enter_write();
> +
> + return (ret);
>  }

This broke pflow again.

panic: netlock: lock not held
Stopped at  Debugger+0x7:   leave
   TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
*440943   4323  0 0x2  01  ifconfig
Debugger(d09facfd,f57a2ca8,d09d23c6,f57a2ca8,d76fd000) at Debugger+0x7
panic(d09d23c6,d09dc32f,f57a2cec,d76fd000,0) at panic+0x71
rw_assert_wrlock(d0b56f38,40,f57a2cec,d03e506b,d8e1eb00) at rw_assert_wrlock+0x
3d
rw_exit_write(d0b56f38,0,7a2d4c,d046b949,d76fd000) at rw_exit_write+0x19
pflow_clone_destroy(d76fd000,0,c0186943,d055d8ed,d96ad7a0) at pflow_clone_destr
oy+0x71
if_clone_destroy(f57a2e74,0,d76fd000,1,d76fd000) at if_clone_destroy+0x7b
ifioctl(d9587648,80206979,f57a2e74,d97e22d8,d956d26c) at ifioctl+0x1ed
soo_ioctl(d96405a8,80206979,f57a2e74,d97e22d8,d97b1f64) at soo_ioctl+0x21c
sys_ioctl(d97e22d8,f57a2f5c,f57a2f7c,0,f57a2fa8) at sys_ioctl+0x19f
syscall() at syscall+0x250
--- syscall (number 9) ---
0x14:
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{1}> 

Now we have two conflicting XXXSMP workarounds in if_clone_destroy()
and pflow_clone_destroy().

bluhm



Re: if attach/detach netlocks

2016-12-20 Thread Martin Pieuchot
On 20/12/16(Tue) 18:47, Mike Belopuhov wrote:
> On Tue, Dec 20, 2016 at 17:06 +0100, Martin Pieuchot wrote:
> > 
> > You'll need to release the lock before calling ifc->ifc_create in
> > if_clone_create() and do the same dance in if_clone_destroy().
> >
> 
> Indeed.
> 
> > But I think that's the way to go.  If you can create/destroy
> > pseudo-interface without panicing we're good.
> >
> 
> Seems to work here, but it feels wrong.

I don't think we can do much better, let's go with that we'll refine
later.

ok mpi@

> 
> > >  void
> > > @@ -941,6 +941,7 @@ if_detach(struct ifnet *ifp)
> > >  
> > >   ifq_clr_oactive(>if_snd);
> > >  
> > > + rw_enter_write();
> > >   s = splnet();
> > >   /* Other CPUs must not have a reference before we start destroying. */
> > >   if_idxmap_remove(ifp);
> > > @@ -1017,6 +1018,7 @@ if_detach(struct ifnet *ifp)
> > >   /* Announce that the interface is gone. */
> > >   rt_ifannouncemsg(ifp, IFAN_DEPARTURE);
> > >   splx(s);
> > > + rw_exit_write();
> > >  
> > >   ifq_destroy(>if_snd);
> > >  }
> > 
> > Please keep the NET_LOCK/UNLOCK() macro.  The places where we
> > grab/release the `netlock' directly are places that need attention.
> 
> Done.
> 
> diff --git sys/net/if.c sys/net/if.c
> index 1e103564710..bd8b9267d5f 100644
> --- sys/net/if.c
> +++ sys/net/if.c
> @@ -523,15 +523,15 @@ if_attachhead(struct ifnet *ifp)
>  void
>  if_attach(struct ifnet *ifp)
>  {
>   int s;
>  
> - s = splsoftnet();
> + NET_LOCK(s);
>   if_attach_common(ifp);
>   TAILQ_INSERT_TAIL(, ifp, if_list);
>   if_attachsetup(ifp);
> - splx(s);
> + NET_UNLOCK(s);
>  }
>  
>  void
>  if_attach_common(struct ifnet *ifp)
>  {
> @@ -939,10 +939,11 @@ if_detach(struct ifnet *ifp)
>   /* Undo pseudo-driver changes. */
>   if_deactivate(ifp);
>  
>   ifq_clr_oactive(>if_snd);
>  
> + NET_LOCK(s);
>   s = splnet();
>   /* Other CPUs must not have a reference before we start destroying. */
>   if_idxmap_remove(ifp);
>  
>   ifp->if_start = if_detached_start;
> @@ -1015,10 +1016,11 @@ if_detach(struct ifnet *ifp)
>   }
>  
>   /* Announce that the interface is gone. */
>   rt_ifannouncemsg(ifp, IFAN_DEPARTURE);
>   splx(s);
> + NET_UNLOCK(s);
>  
>   ifq_destroy(>if_snd);
>  }
>  
>  /*
> @@ -1068,11 +1070,14 @@ if_clone_create(const char *name, int rdomain)
>   return (EINVAL);
>  
>   if (ifunit(name) != NULL)
>   return (EEXIST);
>  
> + /* XXXSMP breaks atomicity */
> + rw_exit_write();
>   ret = (*ifc->ifc_create)(ifc, unit);
> + rw_enter_write();
>  
>   if (ret != 0 || (ifp = ifunit(name)) == NULL)
>   return (ret);
>  
>   if_addgroup(ifp, ifc->ifc_name);
> @@ -1088,10 +1093,11 @@ if_clone_create(const char *name, int rdomain)
>  int
>  if_clone_destroy(const char *name)
>  {
>   struct if_clone *ifc;
>   struct ifnet *ifp;
> + int ret;
>  
>   splsoftassert(IPL_SOFTNET);
>  
>   ifc = if_clone_lookup(name, NULL);
>   if (ifc == NULL)
> @@ -1109,11 +1115,16 @@ if_clone_destroy(const char *name)
>   s = splnet();
>   if_down(ifp);
>   splx(s);
>   }
>  
> - return ((*ifc->ifc_destroy)(ifp));
> + /* XXXSMP breaks atomicity */
> + rw_exit_write();
> + ret = (*ifc->ifc_destroy)(ifp);
> + rw_enter_write();
> +
> + return (ret);
>  }
>  
>  /*
>   * Look up a network interface cloner.
>   */



Re: if attach/detach netlocks

2016-12-20 Thread Mike Belopuhov
On Tue, Dec 20, 2016 at 17:06 +0100, Martin Pieuchot wrote:
> 
> You'll need to release the lock before calling ifc->ifc_create in
> if_clone_create() and do the same dance in if_clone_destroy().
>

Indeed.

> But I think that's the way to go.  If you can create/destroy
> pseudo-interface without panicing we're good.
>

Seems to work here, but it feels wrong.

> >  void
> > @@ -941,6 +941,7 @@ if_detach(struct ifnet *ifp)
> >  
> > ifq_clr_oactive(>if_snd);
> >  
> > +   rw_enter_write();
> > s = splnet();
> > /* Other CPUs must not have a reference before we start destroying. */
> > if_idxmap_remove(ifp);
> > @@ -1017,6 +1018,7 @@ if_detach(struct ifnet *ifp)
> > /* Announce that the interface is gone. */
> > rt_ifannouncemsg(ifp, IFAN_DEPARTURE);
> > splx(s);
> > +   rw_exit_write();
> >  
> > ifq_destroy(>if_snd);
> >  }
> 
> Please keep the NET_LOCK/UNLOCK() macro.  The places where we
> grab/release the `netlock' directly are places that need attention.

Done.

diff --git sys/net/if.c sys/net/if.c
index 1e103564710..bd8b9267d5f 100644
--- sys/net/if.c
+++ sys/net/if.c
@@ -523,15 +523,15 @@ if_attachhead(struct ifnet *ifp)
 void
 if_attach(struct ifnet *ifp)
 {
int s;
 
-   s = splsoftnet();
+   NET_LOCK(s);
if_attach_common(ifp);
TAILQ_INSERT_TAIL(, ifp, if_list);
if_attachsetup(ifp);
-   splx(s);
+   NET_UNLOCK(s);
 }
 
 void
 if_attach_common(struct ifnet *ifp)
 {
@@ -939,10 +939,11 @@ if_detach(struct ifnet *ifp)
/* Undo pseudo-driver changes. */
if_deactivate(ifp);
 
ifq_clr_oactive(>if_snd);
 
+   NET_LOCK(s);
s = splnet();
/* Other CPUs must not have a reference before we start destroying. */
if_idxmap_remove(ifp);
 
ifp->if_start = if_detached_start;
@@ -1015,10 +1016,11 @@ if_detach(struct ifnet *ifp)
}
 
/* Announce that the interface is gone. */
rt_ifannouncemsg(ifp, IFAN_DEPARTURE);
splx(s);
+   NET_UNLOCK(s);
 
ifq_destroy(>if_snd);
 }
 
 /*
@@ -1068,11 +1070,14 @@ if_clone_create(const char *name, int rdomain)
return (EINVAL);
 
if (ifunit(name) != NULL)
return (EEXIST);
 
+   /* XXXSMP breaks atomicity */
+   rw_exit_write();
ret = (*ifc->ifc_create)(ifc, unit);
+   rw_enter_write();
 
if (ret != 0 || (ifp = ifunit(name)) == NULL)
return (ret);
 
if_addgroup(ifp, ifc->ifc_name);
@@ -1088,10 +1093,11 @@ if_clone_create(const char *name, int rdomain)
 int
 if_clone_destroy(const char *name)
 {
struct if_clone *ifc;
struct ifnet *ifp;
+   int ret;
 
splsoftassert(IPL_SOFTNET);
 
ifc = if_clone_lookup(name, NULL);
if (ifc == NULL)
@@ -1109,11 +1115,16 @@ if_clone_destroy(const char *name)
s = splnet();
if_down(ifp);
splx(s);
}
 
-   return ((*ifc->ifc_destroy)(ifp));
+   /* XXXSMP breaks atomicity */
+   rw_exit_write();
+   ret = (*ifc->ifc_destroy)(ifp);
+   rw_enter_write();
+
+   return (ret);
 }
 
 /*
  * Look up a network interface cloner.
  */



Re: if attach/detach netlocks

2016-12-20 Thread Martin Pieuchot
On 20/12/16(Tue) 16:33, Mike Belopuhov wrote:
> Attach:
> 
>   splassert: sowakeup: want 1 have 0
>   Starting stack trace...
>   sowakeup() at sowakeup+0x41
>   sorwakeup() at sorwakeup+0xb4
>   route_input() at route_input+0x2d7
>   if_attach() at if_attach+0x58
>   xnf_attach() at xnf_attach+0x45f
>   config_attach() at config_attach+0x1bc
>   xen_attach_device() at xen_attach_device+0x146
>   xen_hotplug() at xen_hotplug+0x23f
>   taskq_thread() at taskq_thread+0x6c
>   end trace frame: 0x0, count: 248
>   End of stack trace.
> 
> Detach:
> 
>   splassert: sowakeup: want 1 have 0
>   Starting stack trace...
>   sowakeup() at sowakeup+0x41
>   sorwakeup() at sorwakeup+0xb4
>   route_input() at route_input+0x2d7
>   if_detach() at if_detach+0x27e
>   xnf_detach() at xnf_detach+0x4d
>   config_detach() at config_detach+0x13c
>   xen_hotplug() at xen_hotplug+0x2e5
>   taskq_thread() at taskq_thread+0x6c
>   end trace frame: 0x0, count: 249
>   End of stack trace.
>   xnf2 detached
> 
> The taskq running xen_hotplug is KERNEL_LOCK'ed.  The diff below solves
> the issue, but is it any good?
> 
> diff --git sys/net/if.c sys/net/if.c
> index 1e103564710..a5c25b89560 100644
> --- sys/net/if.c
> +++ sys/net/if.c
> @@ -525,11 +525,11 @@ if_attach(struct ifnet *ifp)
>  {
>   int s;
>  
> - s = splsoftnet();
> + NET_LOCK(s);
>   if_attach_common(ifp);
>   TAILQ_INSERT_TAIL(, ifp, if_list);
>   if_attachsetup(ifp);
> - splx(s);
> + NET_UNLOCK(s);

You'll need to release the lock before calling ifc->ifc_create in
if_clone_create() and do the same dance in if_clone_destroy().

But I think that's the way to go.  If you can create/destroy
pseudo-interface without panicing we're good.

>  void
> @@ -941,6 +941,7 @@ if_detach(struct ifnet *ifp)
>  
>   ifq_clr_oactive(>if_snd);
>  
> + rw_enter_write();
>   s = splnet();
>   /* Other CPUs must not have a reference before we start destroying. */
>   if_idxmap_remove(ifp);
> @@ -1017,6 +1018,7 @@ if_detach(struct ifnet *ifp)
>   /* Announce that the interface is gone. */
>   rt_ifannouncemsg(ifp, IFAN_DEPARTURE);
>   splx(s);
> + rw_exit_write();
>  
>   ifq_destroy(>if_snd);
>  }

Please keep the NET_LOCK/UNLOCK() macro.  The places where we
grab/release the `netlock' directly are places that need attention.



if attach/detach netlocks

2016-12-20 Thread Mike Belopuhov
Attach:

  splassert: sowakeup: want 1 have 0
  Starting stack trace...
  sowakeup() at sowakeup+0x41
  sorwakeup() at sorwakeup+0xb4
  route_input() at route_input+0x2d7
  if_attach() at if_attach+0x58
  xnf_attach() at xnf_attach+0x45f
  config_attach() at config_attach+0x1bc
  xen_attach_device() at xen_attach_device+0x146
  xen_hotplug() at xen_hotplug+0x23f
  taskq_thread() at taskq_thread+0x6c
  end trace frame: 0x0, count: 248
  End of stack trace.

Detach:

  splassert: sowakeup: want 1 have 0
  Starting stack trace...
  sowakeup() at sowakeup+0x41
  sorwakeup() at sorwakeup+0xb4
  route_input() at route_input+0x2d7
  if_detach() at if_detach+0x27e
  xnf_detach() at xnf_detach+0x4d
  config_detach() at config_detach+0x13c
  xen_hotplug() at xen_hotplug+0x2e5
  taskq_thread() at taskq_thread+0x6c
  end trace frame: 0x0, count: 249
  End of stack trace.
  xnf2 detached

The taskq running xen_hotplug is KERNEL_LOCK'ed.  The diff below solves
the issue, but is it any good?

diff --git sys/net/if.c sys/net/if.c
index 1e103564710..a5c25b89560 100644
--- sys/net/if.c
+++ sys/net/if.c
@@ -525,11 +525,11 @@ if_attach(struct ifnet *ifp)
 {
int s;
 
-   s = splsoftnet();
+   NET_LOCK(s);
if_attach_common(ifp);
TAILQ_INSERT_TAIL(, ifp, if_list);
if_attachsetup(ifp);
-   splx(s);
+   NET_UNLOCK(s);
 }
 
 void
@@ -941,6 +941,7 @@ if_detach(struct ifnet *ifp)
 
ifq_clr_oactive(>if_snd);
 
+   rw_enter_write();
s = splnet();
/* Other CPUs must not have a reference before we start destroying. */
if_idxmap_remove(ifp);
@@ -1017,6 +1018,7 @@ if_detach(struct ifnet *ifp)
/* Announce that the interface is gone. */
rt_ifannouncemsg(ifp, IFAN_DEPARTURE);
splx(s);
+   rw_exit_write();
 
ifq_destroy(>if_snd);
 }