On Fri, Feb 01, 2019 at 02:40:17PM -0200, Martin Pieuchot wrote:
> On 31/01/19(Thu) 13:31, David Gwynne wrote:
> > On Wed, Jan 30, 2019 at 11:54:45AM -0200, Martin Pieuchot wrote:
> > > On 30/01/19(Wed) 11:48, David Gwynne wrote:
> > > > mpls uses AF_MPLS routes with RTF_LOCAL set on them to know which tags
> > > > are used as input for the mpe and mpw interfaces. setting this up
> > > > currently goes through rt_ifa_add, but that has a couple of features
> > > > that are undesirable for mpls.
> > > > 
> > > > Firstly, rt_ifa_add unconditionally sets RTF_MPATH on the routes it
> > > > adds, which means multiple mpe and mpw interfaces can "own" the same
> > > > input tag. mpe tries to work around this by maintaining a global list of
> > > > mpe interfaces, and iterates over them when a new label is added. That's
> > > > ok (sort of) for mpe, but it doesnt take the tags used by mpw into
> > > > account.
> > > 
> > > I don't understand how this work, where are these 'tag' and 'label'
> > > stored?  What do you mean with "'own' the same input tag"?
> > 
> > I'm using tag and label interchangably here, but it refers to the number
> > that appears in the MPLS shim header on the wire. Each label is supposed
> > to represent a single "forwarding equivalence class", ie, each MPLS label
> > is only supposed to do one thing. When we're talkign about mpe and mpw
> > input, we're talking about configuring each interface with a local
> > MPLS label. Packets sent to the label on the local machine should
> > end up as packets coming in on an mpe or mpw interface.
> > 
> > Right now mpw and mpe use rt_ifa_add() to claim ownership of a local
> > tag. Once something is using a tag, nothing else should be able to add
> > an entry to the mpls rtable with that same tag. Because rt_ifa_add()
> > adds RTF_MPATH unconditionally, it is possible to add two things with
> > the same label to the mpls rtable.
> > 
> > > What is the current 'gateway' value of MPLS RTF_LOCAL routes? 
> > 
> > It's the struct sockaddr_dl *if_sadl in struct ifnet.
> > 
> > > > Secondly, I'd like to start pulling apart the restriction on the use of
> > > > mpls only in rdomain 1. rt_ifa_add doesn't help this situation because
> > > > it assumes that we're adding a route inside the rdomain the interface is
> > > > in, rather than the one it tunnels in. Changing this assumption means
> > > > forking rt_ifa_add, and oh look, that's what I've started here.
> > > 
> > > I'm afraid of adding new rtrequest(9) calls, especially with new error
> > > conditions.  Why can't you add the rdomain of the tunnel?
> > 
> > ifp->if_rdomain refers to the rdomain of the traffic going over an
> > interface. For tunnels it is the rdomain of packets inside the
> > tunnel. Another way of saying it is that a tunnel provides an overlay
> > on top of an underlay network, and ifp->if_rdomain is the rdomain
> > of the overlay traffic.
> > 
> > When we're adding the mpls rtable entries we're specifying connectivity
> > outside the tunnel, aka routes in the network underlay. Currently
> > rt_ifa_{add,del} force anything with RTF_MPLS into rdomain 0 despite
> > what the overlay is, and despite where you might want to run the
> > underlay. One of the configurations claudio@ and I talked about for
> > the firewalls at work was running MPLS in rdomain 1 and leaving our
> > current config running in rdomain 0. This model is not currently
> > possible with RTF_MPLS routes forced to the mpls rtable in rdomain 0.
> 
> I don't understand how you're going to select the rtable.  It will be
> different than the MPLS interface's rdomain, right?

The tunnel interfaces have SIOCSLIFPHYRTABLE and SIOCGLIFPHYRTABLE. I
was going to reuse that for specifying the rdomain MPLS "tunnels" in.

> The current codes for !AF_MPLS routes use the current rdomain because
> that's where the L2 entries live in our kernel.

Well, IP things still add their own l2 to the interfaces inside the
rdomain just like now. My current theory is that MPLS using it to add
the input tag was an elegant^Wconvenient hack since you want to do
mostly the same thing as IP, just outside the current rdomain. The hack
assumed MPLS could only exist in rdomain 0, hence the hardcoded values.

>
> > > If all you need is remove RTF_MPATH, then do so :)
> > > 
> > > We can add it to all rt_ifa_add(9) calls to be explicit!
> > 
> > That won't solve the rdomain thing though.
> > 
> > How about making rt_ifa_{add,del} as wrappers around the thing that
> > let's you not specify RTF_MPATH, and explicitly configure the rdomain?
> > This effectively replaces rt_ifa_{add,del} with rt_ifa_{add,del}_rdomain
> > respectively, but provides the old names as wrappers on the new one.
> 
> If all you're after is specifying these two parameters lets add them to
> the existing rt_ifa_add() call :o)  We need fewer APIs!

k.

Let's do that in two steps. The diff below removes the implicit
RTF_MPATH in rt_ifa_add() and has the callers set it explicitly.

A second diff will add the rdomain argument.

OK?

Index: net/route.c
===================================================================
RCS file: /cvs/src/sys/net/route.c,v
retrieving revision 1.379
diff -u -p -r1.379 route.c
--- net/route.c 23 Nov 2018 16:24:11 -0000      1.379
+++ net/route.c 2 Feb 2019 09:56:31 -0000
@@ -1042,7 +1042,7 @@ rt_ifa_add(struct ifaddr *ifa, int flags
 
        memset(&info, 0, sizeof(info));
        info.rti_ifa = ifa;
-       info.rti_flags = flags | RTF_MPATH;
+       info.rti_flags = flags;
        info.rti_info[RTAX_DST] = dst;
        if (flags & RTF_LLINFO)
                info.rti_info[RTAX_GATEWAY] = sdltosa(ifp->if_sadl);
Index: netinet/in.c
===================================================================
RCS file: /cvs/src/sys/netinet/in.c,v
retrieving revision 1.160
diff -u -p -r1.160 in.c
--- netinet/in.c        11 Jul 2018 21:18:23 -0000      1.160
+++ netinet/in.c        2 Feb 2019 09:56:31 -0000
@@ -694,7 +694,7 @@ in_purgeaddr(struct ifaddr *ifa)
 int
 in_addhost(struct in_ifaddr *ia, struct sockaddr_in *dst)
 {
-       return rt_ifa_add(&ia->ia_ifa, RTF_HOST, sintosa(dst));
+       return rt_ifa_add(&ia->ia_ifa, RTF_HOST | RTF_MPATH, sintosa(dst));
 }
 
 int
@@ -712,12 +712,13 @@ in_insert_prefix(struct in_ifaddr *ia)
        struct ifaddr *ifa = &ia->ia_ifa;
        int error;
 
-       error = rt_ifa_add(ifa, RTF_CLONING | RTF_CONNECTED, ifa->ifa_addr);
+       error = rt_ifa_add(ifa, RTF_CLONING | RTF_CONNECTED | RTF_MPATH,
+           ifa->ifa_addr);
        if (error)
                return (error);
 
        if (ia->ia_broadaddr.sin_addr.s_addr != 0)
-               error = rt_ifa_add(ifa, RTF_HOST | RTF_BROADCAST,
+               error = rt_ifa_add(ifa, RTF_HOST | RTF_BROADCAST | RTF_MPATH,
                    ifa->ifa_broadaddr);
 
        return (error);
Index: netinet/ip_mroute.c
===================================================================
RCS file: /cvs/src/sys/netinet/ip_mroute.c,v
retrieving revision 1.123
diff -u -p -r1.123 ip_mroute.c
--- netinet/ip_mroute.c 10 Oct 2018 11:46:59 -0000      1.123
+++ netinet/ip_mroute.c 2 Feb 2019 09:56:31 -0000
@@ -1308,7 +1308,7 @@ rt_mcast_add(struct ifnet *ifp, struct s
                return (NULL);
        }
 
-       rv = rt_ifa_add(ifa, RTF_HOST | RTF_MULTICAST, group);
+       rv = rt_ifa_add(ifa, RTF_HOST | RTF_MULTICAST | RTF_MPATH, group);
        if (rv != 0) {
                DPRINTF("rt_ifa_add failed (%d)", rv);
                return (NULL);
Index: netinet6/in6.c
===================================================================
RCS file: /cvs/src/sys/netinet6/in6.c,v
retrieving revision 1.228
diff -u -p -r1.228 in6.c
--- netinet6/in6.c      5 Oct 2018 07:06:09 -0000       1.228
+++ netinet6/in6.c      2 Feb 2019 09:56:31 -0000
@@ -376,7 +376,8 @@ in6_ioctl_change_ifaddr(u_long cmd, cadd
                        break;  /* No need to install a connected route. */
                }
 
-               error = rt_ifa_add(&ia6->ia_ifa, RTF_CLONING | RTF_CONNECTED,
+               error = rt_ifa_add(&ia6->ia_ifa,
+                   RTF_CLONING | RTF_CONNECTED | RTF_MPATH,
                    ia6->ia_ifa.ifa_addr);
                if (error) {
                        in6_purgeaddr(&ia6->ia_ifa);
@@ -982,7 +983,8 @@ in6_ifinit(struct ifnet *ifp, struct in6
        if ((ifp->if_flags & IFF_POINTOPOINT) && plen == 128 &&
            ia6->ia_dstaddr.sin6_family == AF_INET6) {
                ifa = &ia6->ia_ifa;
-               error = rt_ifa_add(ifa, RTF_HOST, ifa->ifa_dstaddr);
+               error = rt_ifa_add(ifa, RTF_HOST | RTF_MPATH,
+                   ifa->ifa_dstaddr);
                if (error != 0)
                        return (error);
                ia6->ia_flags |= IFA_ROUTE;
Index: netinet6/in6_ifattach.c
===================================================================
RCS file: /cvs/src/sys/netinet6/in6_ifattach.c,v
retrieving revision 1.111
diff -u -p -r1.111 in6_ifattach.c
--- netinet6/in6_ifattach.c     5 Oct 2018 07:06:09 -0000       1.111
+++ netinet6/in6_ifattach.c     2 Feb 2019 09:56:31 -0000
@@ -374,7 +374,7 @@ in6_ifattach_linklocal(struct ifnet *ifp
                return (0); /* No need to install a connected route. */
        }
 
-       flags = RTF_CONNECTED;
+       flags = RTF_CONNECTED | RTF_MPATH;
        if ((ifp->if_flags & IFF_POINTOPOINT) == 0)
                flags |= RTF_CLONING;
 

Reply via email to