Re: Change the way we handle interface/connected networks

2015-04-23 Thread Martin Pieuchot
On 15/04/15(Wed) 23:00, Claudio Jeker wrote:
> On Wed, Mar 18, 2015 at 05:46:34AM +0100, Claudio Jeker wrote:
> > On Tue, Mar 17, 2015 at 05:35:21PM +0100, Martin Pieuchot wrote:
> > > On 12/02/15(Thu) 12:35, Martin Pieuchot wrote:
> > > > On 10/02/15(Tue) 03:04, Claudio Jeker wrote:
> > > > > There is no need to not allow the same network to be configured more 
> > > > > then
> > > > > once. Instead just rely on the multipath and priority handling of the
> > > > > routing table to select the right route.
> > > > > Additionally this removes cloned routes (arp/npd cache) when the 
> > > > > interface
> > > > > goes down or when the any of the multipath cloning route is changed.
> > > > > 
> > > > > With this it is possible to run 2 dhclients on wired and wireless 
> > > > > with a
> > > > > bridged network. Active TCP sessions still fail when the cable is
> > > > > unplugged. To fix this more is needed.
> > > > > 
> > > > > This changes a fundamental part of the network stack and therefor 
> > > > > broad
> > > > > testing is needed to find all the hidden dragons.
> > > > 
> > It is broken for IPv6 and I could not find the proper fix yet. I think I
> > now why it goes wrong but the nd6 code is a nightmare.
> > 
> > I will send out a new diff once I have IPv6 fixed.
> >  
> 
> Unsurprisingly IPv6 needs to be special and is not using rt_ifa_add or
> rt_ifa_del in all cases. 

Not yet!  That would be nice to convert them :)

>  There are three special cases that do the same
> dance but use ifa->ifa_addr as the gateway and because of this the
> resulting interface routes are not catched by the nd6 code (RTF_LLINFO is
> missing). When the routes are then cloned nd6 is not invoced and
> everything points back to the host. Oups.

It still think we need to use ifa->ifa_addr as gateway in some cases,
see below.

> The following updated diff seems to fix this but I only minimally tested
> the IPv6 part. People using IPv6 may want to give this a spin.

Same here.  I mostly played with dhclient with 3 interfaces and apart
the fact that only interfaces with different priorities will get a
cloning route it works fine.  But that can be improve later.

> IMO the net/if_var.h and netinet/ip_carp.c changes could be commited
> before the rest since there should be no noticeable change in how carp
> works.

I agree.

I have some comments inline:

> @@ -1106,16 +1117,20 @@ rt_ifa_add(struct ifaddr *ifa, int flags
>  {
>   struct rtentry  *rt, *nrt = NULL;
>   struct sockaddr_rtlabel  sa_rl;
> + struct sockaddr_dl   sa_dl = { sizeof(sa_dl), AF_LINK };
>   struct rt_addrinfo   info;
>   u_short  rtableid = ifa->ifa_ifp->if_rdomain;
> - u_int8_t prio = RTP_CONNECTED;
> + u_int8_t prio = ifa->ifa_ifp->if_priority + RTP_STATIC;
>   int  error;
>  
> + sa_dl.sdl_type = ifa->ifa_ifp->if_type;
> + sa_dl.sdl_index = ifa->ifa_ifp->if_index;

I do not like having more sdl stuff here but I don't see a simpler way
to do that right now since cloning routes need a sockaddr_dl with the
right ifp index but an empty address.

I considered using ifp->if_sadl instead but that means we need to clear
the link-layer address somewhere and this wouldn't help if we change it
after adding a local or cloning route.

IMHO we cannot fix this without improving the RTM_RESOLVE logic. See my
comment below.  I think we should be able to call ifa_rtrequest() before
adding the rtentry to the routing table, this would prevent another layer
violation and properly bail if the ARP or ND informations are incorrects.

>   memset(&info, 0, sizeof(info));
>   info.rti_ifa = ifa;
> - info.rti_flags = flags;
> + info.rti_flags = flags | RTF_MPATH;
>   info.rti_info[RTAX_DST] = dst;
> - info.rti_info[RTAX_GATEWAY] = ifa->ifa_addr;
> + info.rti_info[RTAX_GATEWAY] = (struct sockaddr *)&sa_dl;

Here I strongly believe that we should only use a sockaddr_dl as gateway
if we are creating a RTF_CLONING or RTF_LLINFO route.

The first reason is that p2p interfaces generally add a route like:

dstaddr localaddr   UH  0   0   -   pppoe0

And I'm afraid using a sockaddr_dl might break things.

The second reason is that having sockaddr_dl in the gateway is tied to
the RTM_RESOLVE mechanism related to ifa_rtrequest, which does not apply
to IFF_LOOPBACK or IFF_POINTOPOINT interfaces.


> Index: netinet/if_ether.c
> ===
> RCS file: /cvs/src/sys/netinet/if_ether.c,v
> retrieving revision 1.150
> diff -u -p -r1.150 if_ether.c
> --- netinet/if_ether.c10 Apr 2015 13:58:20 -  1.150
> +++ netinet/if_ether.c12 Apr 2015 11:47:03 -
> @@ -121,8 +121,6 @@ void  db_print_llinfo(caddr_t);
>  int  db_show_radix_node(struct radix_node *, void *, u_int);
>  #endif
>  
> -static const struct sockaddr_dl null_sd

Re: Change the way we handle interface/connected networks

2015-04-15 Thread Claudio Jeker
On Wed, Mar 18, 2015 at 05:46:34AM +0100, Claudio Jeker wrote:
> On Tue, Mar 17, 2015 at 05:35:21PM +0100, Martin Pieuchot wrote:
> > On 12/02/15(Thu) 12:35, Martin Pieuchot wrote:
> > > On 10/02/15(Tue) 03:04, Claudio Jeker wrote:
> > > > There is no need to not allow the same network to be configured more 
> > > > then
> > > > once. Instead just rely on the multipath and priority handling of the
> > > > routing table to select the right route.
> > > > Additionally this removes cloned routes (arp/npd cache) when the 
> > > > interface
> > > > goes down or when the any of the multipath cloning route is changed.
> > > > 
> > > > With this it is possible to run 2 dhclients on wired and wireless with a
> > > > bridged network. Active TCP sessions still fail when the cable is
> > > > unplugged. To fix this more is needed.
> > > > 
> > > > This changes a fundamental part of the network stack and therefor broad
> > > > testing is needed to find all the hidden dragons.
> > > 
> > > Here's version of the diff rebased on top of the recent changes.
> > 
> > I think it's the time to get this in, then as a second step put the
> > dhclient(8) bits.
> > 
> > Claudio you have my ok.
> 
> It is broken for IPv6 and I could not find the proper fix yet. I think I
> now why it goes wrong but the nd6 code is a nightmare.
> 
> I will send out a new diff once I have IPv6 fixed.
>  

Unsurprisingly IPv6 needs to be special and is not using rt_ifa_add or
rt_ifa_del in all cases. There are three special cases that do the same
dance but use ifa->ifa_addr as the gateway and because of this the
resulting interface routes are not catched by the nd6 code (RTF_LLINFO is
missing). When the routes are then cloned nd6 is not invoced and
everything points back to the host. Oups.
The following updated diff seems to fix this but I only minimally tested
the IPv6 part. People using IPv6 may want to give this a spin.

IMO the net/if_var.h and netinet/ip_carp.c changes could be commited
before the rest since there should be no noticeable change in how carp
works.
-- 
:wq Claudio

Index: net/if_var.h
===
RCS file: /cvs/src/sys/net/if_var.h,v
retrieving revision 1.24
diff -u -p -r1.24 if_var.h
--- net/if_var.h7 Apr 2015 10:46:20 -   1.24
+++ net/if_var.h12 Apr 2015 11:47:03 -
@@ -389,6 +389,7 @@ do {
\
 /* default interface priorities */
 #define IF_WIRED_DEFAULT_PRIORITY  0
 #define IF_WIRELESS_DEFAULT_PRIORITY   4
+#define IF_CARP_DEFAULT_PRIORITY   15
 
 /*
  * Network stack input queues.
Index: net/route.c
===
RCS file: /cvs/src/sys/net/route.c,v
retrieving revision 1.208
diff -u -p -r1.208 route.c
--- net/route.c 26 Mar 2015 11:02:44 -  1.208
+++ net/route.c 4 Apr 2015 08:31:34 -
@@ -554,6 +554,16 @@ rtdeletemsg(struct rtentry *rt, u_int ta
return (error);
 }
 
+static inline int
+rtequal(struct rtentry *a, struct rtentry *b)
+{
+   if (memcmp(rt_key(a), rt_key(b), rt_key(a)->sa_len) == 0 &&
+   memcmp(rt_mask(a), rt_mask(b), rt_mask(a)->sa_len) == 0)
+   return 1;
+   else
+   return 0;
+}
+
 int
 rtflushclone1(struct radix_node *rn, void *arg, u_int id)
 {
@@ -561,7 +571,8 @@ rtflushclone1(struct radix_node *rn, voi
 
rt = (struct rtentry *)rn;
parent = (struct rtentry *)arg;
-   if ((rt->rt_flags & RTF_CLONED) != 0 && rt->rt_parent == parent)
+   if ((rt->rt_flags & RTF_CLONED) != 0 && (rt->rt_parent == parent ||
+   rtequal(rt->rt_parent, parent)))
rtdeletemsg(rt, id);
return 0;
 }
@@ -1106,16 +1117,20 @@ rt_ifa_add(struct ifaddr *ifa, int flags
 {
struct rtentry  *rt, *nrt = NULL;
struct sockaddr_rtlabel  sa_rl;
+   struct sockaddr_dl   sa_dl = { sizeof(sa_dl), AF_LINK };
struct rt_addrinfo   info;
u_short  rtableid = ifa->ifa_ifp->if_rdomain;
-   u_int8_t prio = RTP_CONNECTED;
+   u_int8_t prio = ifa->ifa_ifp->if_priority + RTP_STATIC;
int  error;
 
+   sa_dl.sdl_type = ifa->ifa_ifp->if_type;
+   sa_dl.sdl_index = ifa->ifa_ifp->if_index;
+
memset(&info, 0, sizeof(info));
info.rti_ifa = ifa;
-   info.rti_flags = flags;
+   info.rti_flags = flags | RTF_MPATH;
info.rti_info[RTAX_DST] = dst;
-   info.rti_info[RTAX_GATEWAY] = ifa->ifa_addr;
+   info.rti_info[RTAX_GATEWAY] = (struct sockaddr *)&sa_dl;
info.rti_info[RTAX_LABEL] =
rtlabel_id2sa(ifa->ifa_ifp->if_rtlabelid, &sa_rl);
 
@@ -1170,8 +1185,9 @@ rt_ifa_del(struct ifaddr *ifa, int flags
struct sockaddr *deldst;
struct rt_addrinfo   info;
struct sockaddr_rtlabel  sa_rl;
+   struct sockaddr_dl   sa_dl 

Re: Change the way we handle interface/connected networks

2015-03-17 Thread Claudio Jeker
On Tue, Mar 17, 2015 at 05:35:21PM +0100, Martin Pieuchot wrote:
> On 12/02/15(Thu) 12:35, Martin Pieuchot wrote:
> > On 10/02/15(Tue) 03:04, Claudio Jeker wrote:
> > > There is no need to not allow the same network to be configured more then
> > > once. Instead just rely on the multipath and priority handling of the
> > > routing table to select the right route.
> > > Additionally this removes cloned routes (arp/npd cache) when the interface
> > > goes down or when the any of the multipath cloning route is changed.
> > > 
> > > With this it is possible to run 2 dhclients on wired and wireless with a
> > > bridged network. Active TCP sessions still fail when the cable is
> > > unplugged. To fix this more is needed.
> > > 
> > > This changes a fundamental part of the network stack and therefor broad
> > > testing is needed to find all the hidden dragons.
> > 
> > Here's version of the diff rebased on top of the recent changes.
> 
> I think it's the time to get this in, then as a second step put the
> dhclient(8) bits.
> 
> Claudio you have my ok.

It is broken for IPv6 and I could not find the proper fix yet. I think I
now why it goes wrong but the nd6 code is a nightmare.

I will send out a new diff once I have IPv6 fixed.
 
> > Index: net/if_var.h
> > ===
> > RCS file: /cvs/src/sys/net/if_var.h,v
> > retrieving revision 1.20
> > diff -u -p -r1.20 if_var.h
> > --- net/if_var.h9 Feb 2015 03:09:57 -   1.20
> > +++ net/if_var.h12 Feb 2015 11:08:40 -
> > @@ -392,6 +392,7 @@ do {
> > \
> >  /* default interface priorities */
> >  #define IF_WIRED_DEFAULT_PRIORITY  0
> >  #define IF_WIRELESS_DEFAULT_PRIORITY   4
> > +#define IF_CARP_DEFAULT_PRIORITY   15
> >  
> >  extern struct ifnet_head ifnet;
> >  extern struct ifnet *lo0ifp;
> > Index: net/route.c
> > ===
> > RCS file: /cvs/src/sys/net/route.c,v
> > retrieving revision 1.206
> > diff -u -p -r1.206 route.c
> > --- net/route.c 11 Feb 2015 23:34:43 -  1.206
> > +++ net/route.c 12 Feb 2015 11:08:40 -
> > @@ -554,6 +554,16 @@ rtdeletemsg(struct rtentry *rt, u_int ta
> > return (error);
> >  }
> >  
> > +static inline int
> > +rtequal(struct rtentry *a, struct rtentry *b)
> > +{
> > +   if (memcmp(rt_key(a), rt_key(b), rt_key(a)->sa_len) == 0 &&
> > +   memcmp(rt_mask(a), rt_mask(b), rt_mask(a)->sa_len) == 0)
> > +   return 1;
> > +   else
> > +   return 0;
> > +}
> > +
> >  int
> >  rtflushclone1(struct radix_node *rn, void *arg, u_int id)
> >  {
> > @@ -561,7 +571,8 @@ rtflushclone1(struct radix_node *rn, voi
> >  
> > rt = (struct rtentry *)rn;
> > parent = (struct rtentry *)arg;
> > -   if ((rt->rt_flags & RTF_CLONED) != 0 && rt->rt_parent == parent)
> > +   if ((rt->rt_flags & RTF_CLONED) != 0 && (rt->rt_parent == parent ||
> > +   rtequal(rt->rt_parent, parent)))
> > rtdeletemsg(rt, id);
> > return 0;
> >  }
> > @@ -1106,16 +1117,20 @@ rt_ifa_add(struct ifaddr *ifa, int flags
> >  {
> > struct rtentry  *rt, *nrt = NULL;
> > struct sockaddr_rtlabel  sa_rl;
> > +   struct sockaddr_dl   sa_dl = { sizeof(sa_dl), AF_LINK };
> > struct rt_addrinfo   info;
> > u_short  rtableid = ifa->ifa_ifp->if_rdomain;
> > -   u_int8_t prio = RTP_CONNECTED;
> > +   u_int8_t prio = ifa->ifa_ifp->if_priority + RTP_STATIC;
> > int  error;
> >  
> > +   sa_dl.sdl_type = ifa->ifa_ifp->if_type;
> > +   sa_dl.sdl_index = ifa->ifa_ifp->if_index;
> > +
> > memset(&info, 0, sizeof(info));
> > info.rti_ifa = ifa;
> > -   info.rti_flags = flags;
> > +   info.rti_flags = flags | RTF_MPATH;
> > info.rti_info[RTAX_DST] = dst;
> > -   info.rti_info[RTAX_GATEWAY] = ifa->ifa_addr;
> > +   info.rti_info[RTAX_GATEWAY] = (struct sockaddr *)&sa_dl;
> > info.rti_info[RTAX_LABEL] =
> > rtlabel_id2sa(ifa->ifa_ifp->if_rtlabelid, &sa_rl);
> >  
> > @@ -1161,8 +1176,9 @@ rt_ifa_del(struct ifaddr *ifa, int flags
> > struct sockaddr *deldst;
> > struct rt_addrinfo   info;
> > struct sockaddr_rtlabel  sa_rl;
> > +   struct sockaddr_dl   sa_dl = { sizeof(sa_dl), AF_LINK };
> > u_short  rtableid = ifa->ifa_ifp->if_rdomain;
> > -   u_int8_t prio = RTP_CONNECTED;
> > +   u_int8_t prio = ifa->ifa_ifp->if_priority + RTP_STATIC;
> > int  error;
> >  
> > if ((flags & RTF_HOST) == 0 && ifa->ifa_netmask) {
> > @@ -1187,10 +1203,14 @@ rt_ifa_del(struct ifaddr *ifa, int flags
> > }
> > }
> >  
> > +   sa_dl.sdl_type = ifa->ifa_ifp->if_type;
> > +   sa_dl.sdl_index = ifa->ifa_ifp->if_index;
> > +
> > memset(&info, 0, sizeof(info));
> > info.rti_ifa = ifa;
> > info.rti_flags 

Re: Change the way we handle interface/connected networks

2015-03-17 Thread Martin Pieuchot
On 12/02/15(Thu) 12:35, Martin Pieuchot wrote:
> On 10/02/15(Tue) 03:04, Claudio Jeker wrote:
> > There is no need to not allow the same network to be configured more then
> > once. Instead just rely on the multipath and priority handling of the
> > routing table to select the right route.
> > Additionally this removes cloned routes (arp/npd cache) when the interface
> > goes down or when the any of the multipath cloning route is changed.
> > 
> > With this it is possible to run 2 dhclients on wired and wireless with a
> > bridged network. Active TCP sessions still fail when the cable is
> > unplugged. To fix this more is needed.
> > 
> > This changes a fundamental part of the network stack and therefor broad
> > testing is needed to find all the hidden dragons.
> 
> Here's version of the diff rebased on top of the recent changes.

I think it's the time to get this in, then as a second step put the
dhclient(8) bits.

Claudio you have my ok.

> Index: net/if_var.h
> ===
> RCS file: /cvs/src/sys/net/if_var.h,v
> retrieving revision 1.20
> diff -u -p -r1.20 if_var.h
> --- net/if_var.h  9 Feb 2015 03:09:57 -   1.20
> +++ net/if_var.h  12 Feb 2015 11:08:40 -
> @@ -392,6 +392,7 @@ do {  
> \
>  /* default interface priorities */
>  #define IF_WIRED_DEFAULT_PRIORITY0
>  #define IF_WIRELESS_DEFAULT_PRIORITY 4
> +#define IF_CARP_DEFAULT_PRIORITY 15
>  
>  extern struct ifnet_head ifnet;
>  extern struct ifnet *lo0ifp;
> Index: net/route.c
> ===
> RCS file: /cvs/src/sys/net/route.c,v
> retrieving revision 1.206
> diff -u -p -r1.206 route.c
> --- net/route.c   11 Feb 2015 23:34:43 -  1.206
> +++ net/route.c   12 Feb 2015 11:08:40 -
> @@ -554,6 +554,16 @@ rtdeletemsg(struct rtentry *rt, u_int ta
>   return (error);
>  }
>  
> +static inline int
> +rtequal(struct rtentry *a, struct rtentry *b)
> +{
> + if (memcmp(rt_key(a), rt_key(b), rt_key(a)->sa_len) == 0 &&
> + memcmp(rt_mask(a), rt_mask(b), rt_mask(a)->sa_len) == 0)
> + return 1;
> + else
> + return 0;
> +}
> +
>  int
>  rtflushclone1(struct radix_node *rn, void *arg, u_int id)
>  {
> @@ -561,7 +571,8 @@ rtflushclone1(struct radix_node *rn, voi
>  
>   rt = (struct rtentry *)rn;
>   parent = (struct rtentry *)arg;
> - if ((rt->rt_flags & RTF_CLONED) != 0 && rt->rt_parent == parent)
> + if ((rt->rt_flags & RTF_CLONED) != 0 && (rt->rt_parent == parent ||
> + rtequal(rt->rt_parent, parent)))
>   rtdeletemsg(rt, id);
>   return 0;
>  }
> @@ -1106,16 +1117,20 @@ rt_ifa_add(struct ifaddr *ifa, int flags
>  {
>   struct rtentry  *rt, *nrt = NULL;
>   struct sockaddr_rtlabel  sa_rl;
> + struct sockaddr_dl   sa_dl = { sizeof(sa_dl), AF_LINK };
>   struct rt_addrinfo   info;
>   u_short  rtableid = ifa->ifa_ifp->if_rdomain;
> - u_int8_t prio = RTP_CONNECTED;
> + u_int8_t prio = ifa->ifa_ifp->if_priority + RTP_STATIC;
>   int  error;
>  
> + sa_dl.sdl_type = ifa->ifa_ifp->if_type;
> + sa_dl.sdl_index = ifa->ifa_ifp->if_index;
> +
>   memset(&info, 0, sizeof(info));
>   info.rti_ifa = ifa;
> - info.rti_flags = flags;
> + info.rti_flags = flags | RTF_MPATH;
>   info.rti_info[RTAX_DST] = dst;
> - info.rti_info[RTAX_GATEWAY] = ifa->ifa_addr;
> + info.rti_info[RTAX_GATEWAY] = (struct sockaddr *)&sa_dl;
>   info.rti_info[RTAX_LABEL] =
>   rtlabel_id2sa(ifa->ifa_ifp->if_rtlabelid, &sa_rl);
>  
> @@ -1161,8 +1176,9 @@ rt_ifa_del(struct ifaddr *ifa, int flags
>   struct sockaddr *deldst;
>   struct rt_addrinfo   info;
>   struct sockaddr_rtlabel  sa_rl;
> + struct sockaddr_dl   sa_dl = { sizeof(sa_dl), AF_LINK };
>   u_short  rtableid = ifa->ifa_ifp->if_rdomain;
> - u_int8_t prio = RTP_CONNECTED;
> + u_int8_t prio = ifa->ifa_ifp->if_priority + RTP_STATIC;
>   int  error;
>  
>   if ((flags & RTF_HOST) == 0 && ifa->ifa_netmask) {
> @@ -1187,10 +1203,14 @@ rt_ifa_del(struct ifaddr *ifa, int flags
>   }
>   }
>  
> + sa_dl.sdl_type = ifa->ifa_ifp->if_type;
> + sa_dl.sdl_index = ifa->ifa_ifp->if_index;
> +
>   memset(&info, 0, sizeof(info));
>   info.rti_ifa = ifa;
>   info.rti_flags = flags;
>   info.rti_info[RTAX_DST] = dst;
> + info.rti_info[RTAX_GATEWAY] = (struct sockaddr *)&sa_dl;
>   info.rti_info[RTAX_LABEL] =
>   rtlabel_id2sa(ifa->ifa_ifp->if_rtlabelid, &sa_rl);
>  
> @@ -1692,6 +1715,15 @@ rt_if_linkstate_change(struct radix_node
>   }
>   } else {
>   if (rt->rt_flags &

Re: Change the way we handle interface/connected networks

2015-02-12 Thread Martin Pieuchot
On 10/02/15(Tue) 03:04, Claudio Jeker wrote:
> There is no need to not allow the same network to be configured more then
> once. Instead just rely on the multipath and priority handling of the
> routing table to select the right route.
> Additionally this removes cloned routes (arp/npd cache) when the interface
> goes down or when the any of the multipath cloning route is changed.
> 
> With this it is possible to run 2 dhclients on wired and wireless with a
> bridged network. Active TCP sessions still fail when the cable is
> unplugged. To fix this more is needed.
> 
> This changes a fundamental part of the network stack and therefor broad
> testing is needed to find all the hidden dragons.

Here's version of the diff rebased on top of the recent changes.

Index: net/if_var.h
===
RCS file: /cvs/src/sys/net/if_var.h,v
retrieving revision 1.20
diff -u -p -r1.20 if_var.h
--- net/if_var.h9 Feb 2015 03:09:57 -   1.20
+++ net/if_var.h12 Feb 2015 11:08:40 -
@@ -392,6 +392,7 @@ do {
\
 /* default interface priorities */
 #define IF_WIRED_DEFAULT_PRIORITY  0
 #define IF_WIRELESS_DEFAULT_PRIORITY   4
+#define IF_CARP_DEFAULT_PRIORITY   15
 
 extern struct ifnet_head ifnet;
 extern struct ifnet *lo0ifp;
Index: net/route.c
===
RCS file: /cvs/src/sys/net/route.c,v
retrieving revision 1.206
diff -u -p -r1.206 route.c
--- net/route.c 11 Feb 2015 23:34:43 -  1.206
+++ net/route.c 12 Feb 2015 11:08:40 -
@@ -554,6 +554,16 @@ rtdeletemsg(struct rtentry *rt, u_int ta
return (error);
 }
 
+static inline int
+rtequal(struct rtentry *a, struct rtentry *b)
+{
+   if (memcmp(rt_key(a), rt_key(b), rt_key(a)->sa_len) == 0 &&
+   memcmp(rt_mask(a), rt_mask(b), rt_mask(a)->sa_len) == 0)
+   return 1;
+   else
+   return 0;
+}
+
 int
 rtflushclone1(struct radix_node *rn, void *arg, u_int id)
 {
@@ -561,7 +571,8 @@ rtflushclone1(struct radix_node *rn, voi
 
rt = (struct rtentry *)rn;
parent = (struct rtentry *)arg;
-   if ((rt->rt_flags & RTF_CLONED) != 0 && rt->rt_parent == parent)
+   if ((rt->rt_flags & RTF_CLONED) != 0 && (rt->rt_parent == parent ||
+   rtequal(rt->rt_parent, parent)))
rtdeletemsg(rt, id);
return 0;
 }
@@ -1106,16 +1117,20 @@ rt_ifa_add(struct ifaddr *ifa, int flags
 {
struct rtentry  *rt, *nrt = NULL;
struct sockaddr_rtlabel  sa_rl;
+   struct sockaddr_dl   sa_dl = { sizeof(sa_dl), AF_LINK };
struct rt_addrinfo   info;
u_short  rtableid = ifa->ifa_ifp->if_rdomain;
-   u_int8_t prio = RTP_CONNECTED;
+   u_int8_t prio = ifa->ifa_ifp->if_priority + RTP_STATIC;
int  error;
 
+   sa_dl.sdl_type = ifa->ifa_ifp->if_type;
+   sa_dl.sdl_index = ifa->ifa_ifp->if_index;
+
memset(&info, 0, sizeof(info));
info.rti_ifa = ifa;
-   info.rti_flags = flags;
+   info.rti_flags = flags | RTF_MPATH;
info.rti_info[RTAX_DST] = dst;
-   info.rti_info[RTAX_GATEWAY] = ifa->ifa_addr;
+   info.rti_info[RTAX_GATEWAY] = (struct sockaddr *)&sa_dl;
info.rti_info[RTAX_LABEL] =
rtlabel_id2sa(ifa->ifa_ifp->if_rtlabelid, &sa_rl);
 
@@ -1161,8 +1176,9 @@ rt_ifa_del(struct ifaddr *ifa, int flags
struct sockaddr *deldst;
struct rt_addrinfo   info;
struct sockaddr_rtlabel  sa_rl;
+   struct sockaddr_dl   sa_dl = { sizeof(sa_dl), AF_LINK };
u_short  rtableid = ifa->ifa_ifp->if_rdomain;
-   u_int8_t prio = RTP_CONNECTED;
+   u_int8_t prio = ifa->ifa_ifp->if_priority + RTP_STATIC;
int  error;
 
if ((flags & RTF_HOST) == 0 && ifa->ifa_netmask) {
@@ -1187,10 +1203,14 @@ rt_ifa_del(struct ifaddr *ifa, int flags
}
}
 
+   sa_dl.sdl_type = ifa->ifa_ifp->if_type;
+   sa_dl.sdl_index = ifa->ifa_ifp->if_index;
+
memset(&info, 0, sizeof(info));
info.rti_ifa = ifa;
info.rti_flags = flags;
info.rti_info[RTAX_DST] = dst;
+   info.rti_info[RTAX_GATEWAY] = (struct sockaddr *)&sa_dl;
info.rti_info[RTAX_LABEL] =
rtlabel_id2sa(ifa->ifa_ifp->if_rtlabelid, &sa_rl);
 
@@ -1692,6 +1715,15 @@ rt_if_linkstate_change(struct radix_node
}
} else {
if (rt->rt_flags & RTF_UP) {
+   /*
+* Remove cloned routes (mainly arp) to
+* down interfaces so we have a chance to
+* clone a new route from a better source.
+   

Re: Change the way we handle interface/connected networks

2015-02-09 Thread Martin Pieuchot
On 10/02/15(Tue) 03:04, Claudio Jeker wrote:
> There is no need to not allow the same network to be configured more then
> once. Instead just rely on the multipath and priority handling of the
> routing table to select the right route.
> Additionally this removes cloned routes (arp/npd cache) when the interface
> goes down or when the any of the multipath cloning route is changed.
> 
> With this it is possible to run 2 dhclients on wired and wireless with a
> bridged network. Active TCP sessions still fail when the cable is
> unplugged. To fix this more is needed.
> 
> This changes a fundamental part of the network stack and therefor broad
> testing is needed to find all the hidden dragons.

I like this a lot.  Should should also kill the IFA_ROUTE flag.  It was
only used to indicate which ifa was owning the cloning route of its
corresponding subnet.

One more comment:

> @@ -1655,6 +1675,9 @@ rt_if_track(struct ifnet *ifp)
>   return;
>  
>   for (tid = 0; tid <= rtbl_id_max; tid++) {
> + /* skip rtables that are not in the rdomain of the ifp */
> + if (rtable_l2(tid) != ifp->if_rdomain)
> + continue;

This chunk is not strictly related an should already go in.