jsing@ and matthew@ reported a problem with /32 cloning routes, used at
least on Google Compute Engine.

The problem is that when I have the following routing table, no route
can be clone from 10.128.0.1/32:

10.128.0.1/32      10.128.0.3         UCS        1        0     -     8 vio0
10.128.0.3         70:5f:ca:21:8d:70  UHLl       0        0     -     1 vio0
10.128.0.3/32      10.128.0.3         UCn        0        0     -     4 vio0

The reason is that cloned route inherit the priority of their parent.  So
when the parent is a /32, the kernel will try to insert a cloned route with
the same destination and priority.  This is a multipath case.

However if we use the RTF_MPATH here we still end up with broken ARP.
Because the ARP lookup is built on top of the route lookup, we want to
ensure that no RTF_CLONING route is found.

Hence the fix below that use a higher priority for cloned routes.  With it
the table now looks like:

10.128.0.1         link#1             UHLc       0        1     -     7 vio0
10.128.0.1/32      10.128.0.3         UCS        1        0     -     8 vio0
10.128.0.3         70:5f:ca:21:8d:70  UHLl       0        0     -     1 vio0
10.128.0.3/32      10.128.0.3         UCn        0        0     -     4 vio0

Since priority 1 is reserved, it is safe to use priority of the cloning
route minus one.

ok?

Index: net/route.c
===================================================================
RCS file: /cvs/src/sys/net/route.c,v
retrieving revision 1.342
diff -u -p -r1.342 route.c
--- net/route.c 4 Dec 2016 09:46:39 -0000       1.342
+++ net/route.c 13 Dec 2016 08:46:33 -0000
@@ -252,8 +252,15 @@ rt_match(struct sockaddr *dst, uint32_t 
                        info.rti_info[RTAX_DST] = dst;
 
                        KERNEL_LOCK();
-                       error = rtrequest(RTM_RESOLVE, &info, RTP_DEFAULT,
-                           &rt, tableid);
+                       /*
+                        * The priority of cloned route should be different
+                        * to avoid conflict with /32 cloning routes.
+                        *
+                        * It should also be higher to let the ARP layer find
+                        * cloned routes instead of the cloning one.
+                        */
+                       error = rtrequest(RTM_RESOLVE, &info,
+                           rt->rt_priority - 1, &rt, tableid);
                        if (error) {
                                rt_missmsg(RTM_MISS, &info, 0, RTP_NONE, 0,
                                    error, tableid);
@@ -1046,15 +1053,14 @@ rtrequest(int req, struct rt_addrinfo *i
                ifa->ifa_refcnt++;
                rt->rt_ifa = ifa;
                rt->rt_ifidx = ifp->if_index;
-               if (rt->rt_flags & RTF_CLONED) {
-                       /*
-                        * Copy both metrics and a back pointer to the cloned
-                        * route's parent.
-                        */
-                       rt->rt_rmx = (*ret_nrt)->rt_rmx; /* copy metrics */
-                       rt->rt_priority = (*ret_nrt)->rt_priority;
-                       rt->rt_parent = *ret_nrt;        /* Back ptr. to 
parent. */
-                       rtref(rt->rt_parent);
+               /*
+                * Copy metrics and a back pointer from the cloned
+                * route's parent.
+                */
+               if (ISSET(rt->rt_flags, RTF_CLONED)) {
+                       rtref(*ret_nrt);
+                       rt->rt_parent = *ret_nrt;
+                       rt->rt_rmx = (*ret_nrt)->rt_rmx;
                }
 
                /*

Reply via email to