I tested this, but it fails to work for me.

I’ve tested it both with and without the link-local patch (Patch 2/4 in your series)
on top of current Quagga master.

BGP does the multipath, but zebra still only uses one path (the 2nd).
It looks to me like zebra replaces the 1st route instead of adding the nexthop.
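
To make the distinction concrete, here is a minimal sketch of "replace" versus "add nexthop" semantics on a route entry. The names (`demo_route`, `demo_route_replace`, `demo_route_add_nexthop`) are made up for illustration and are not zebra's actual RIB structures or functions; it only models the behaviour I suspect, under that assumption.

```c
#include <stddef.h>

#define DEMO_MAX_NEXTHOPS 8

/* Hypothetical, simplified route entry (NOT zebra's struct rib). */
struct demo_route
{
  const char *prefix;
  const char *nexthops[DEMO_MAX_NEXTHOPS];
  size_t nexthop_num;
};

/* "Replace" semantics: a new announcement discards the previous nexthop,
   so only the most recently announced one survives. */
static void
demo_route_replace (struct demo_route *rt, const char *nexthop)
{
  rt->nexthops[0] = nexthop;
  rt->nexthop_num = 1;
}

/* "Add" semantics: keep the existing nexthops and append the new one,
   which is what ECMP needs.  Returns 0 on success, -1 if the entry is full. */
static int
demo_route_add_nexthop (struct demo_route *rt, const char *nexthop)
{
  if (rt->nexthop_num >= DEMO_MAX_NEXTHOPS)
    return -1;
  rt->nexthops[rt->nexthop_num++] = nexthop;
  return 0;
}
```

With replace semantics, announcing the two nexthops one after the other leaves a single nexthop (the 2nd), which matches the symptom; with add semantics both would remain.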

If you want to try what I did, check out the branch test/ipv6_ecmp from my bgptool
at https://git-us.netdef.org/projects/NETDEF/repos/bgptool

See README_ipv6_ecmp.md for details on how I test:
https://git-us.netdef.org/projects/NETDEF/repos/bgptool/browse/README_ipv6_ecmp.md?at=refs%2Fheads%2Ftest%2Fipv6_ecmp


Below is what I see.

Regards,
   Martin Winter



dut# sh bgp
BGP table version is 0, local router ID is 192.168.1.101
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
            i internal, r RIB-failure, S Stale, R Removed
Origin codes: i - IGP, e - EGP, ? - incomplete

 Network          Next Hop            Metric LocPrf Weight Path
*=i3ffe:5:0:a::/64  fc00:192:168:2::1
                                        100 1000 0 65001 65100 i
*>i                 fc00:192:168:1::1
                                        100 1000 0 65001 65100 i

Total number of prefixes 1
dut# sh bgp 3ffe:5:0:a::/64
BGP routing table entry for 3ffe:5:0:a::/64
Paths: (2 available, best #2, table Default-IP-Routing-Table)
Not advertised to any peer
65001 65100
  fc00:192:168:2::1 from fc00:192:168:2::1 (192.168.2.1)
  (fe80::21c:42ff:fe82:2070)
    Origin IGP, metric 100, localpref 1000, valid, internal, multipath
    Last update: Sat Nov  7 03:45:09 2015

65001 65100
  fc00:192:168:1::1 from fc00:192:168:1::1 (192.168.1.1)
  (fe80::21c:42ff:feda:3815)
    Origin IGP, metric 100, localpref 1000, valid, internal, multipath, best
    Last update: Sat Nov  7 03:45:08 2015

dut# sh ipv6 route
Codes: K - kernel route, C - connected, S - static, R - RIPng,
     O - OSPFv6, I - IS-IS, B - BGP, A - Babel,
     > - selected route, * - FIB route

C>* ::1/128 is directly connected, lo
B>* 3ffe:5:0:a::/64 [200/100] via fe80::21c:42ff:fe82:2070, eth2, 00:00:41
C>* fc00:192:168:1::/64 is directly connected, eth1
C>* fc00:192:168:2::/64 is directly connected, eth2
C * fe80::/64 is directly connected, eth1
C * fe80::/64 is directly connected, eth2

Looking at the bgpd log, I see:

2015/11/07 03:45:08 BGP: fc00:192:168:1::1 [FSM] Timer (routeadv timer expire)
2015/11/07 03:45:08 BGP: fc00:192:168:1::1 rcvd UPDATE w/ attr: , origin i, mp_nexthop fc00:192:168:1::1(fe80::21c:42ff:feda:3815), localpref 1000, metric 100, path 65001 65100
2015/11/07 03:45:08 BGP: fc00:192:168:1::1 rcvd 3ffe:5:0:a::/64
2015/11/07 03:45:08 BGP: Zebra send: IPv6 route add 3ffe:5:0:a::/64 nexthop fe80::21c:42ff:feda:3815 metric 100
2015/11/07 03:45:09 BGP: fc00:192:168:2::1 [FSM] Timer (routeadv timer expire)
2015/11/07 03:45:09 BGP: fc00:192:168:2::1 rcvd UPDATE w/ attr: , origin i, mp_nexthop fc00:192:168:2::1(fe80::21c:42ff:fe82:2070), localpref 1000, metric 100, path 65001 65100
2015/11/07 03:45:09 BGP: fc00:192:168:2::1 rcvd 3ffe:5:0:a::/64
2015/11/07 03:45:09 BGP: 3ffe:5:0:a::/64 add mpath nexthop 0.0.0.0 peer (null)
2015/11/07 03:45:09 BGP: Zebra send: IPv6 route add 3ffe:5:0:a::/64 nexthop fe80::21c:42ff:fe82:2070 metric 100
2015/11/07 03:45:13 BGP: fc00:192:168:1::1 [FSM] Timer (routeadv timer expire)

and in zebra log:

2015/11/07 03:45:08 ZEBRA: rib_delnode: 3ffe:5:0:a::/64 vrf 0: rn 0xce1330, rib 0xcd5580, removing
2015/11/07 03:45:08 ZEBRA: rib_process: 3ffe:5:0:a::/64 vrf 0: Removing existing route, fib 0xcd5580
2015/11/07 03:45:08 ZEBRA: rib_process: 3ffe:5:0:a::/64 vrf 0: Adding route, select 0xce1f50
2015/11/07 03:45:08 ZEBRA: rib_process: 3ffe:5:0:a::/64 vrf 0: Deleting fib 0xcd5580, rn 0xce1330
2015/11/07 03:45:08 ZEBRA: rib_unlink: 3ffe:5:0:a::/64 vrf 0: rn 0xce1330, rib 0xcd5580
2015/11/07 03:45:09 ZEBRA: zebra message comes from socket [15]
2015/11/07 03:45:09 ZEBRA: zebra message received [ZEBRA_IPV6_ROUTE_ADD] 63 in VRF 0
2015/11/07 03:45:09 ZEBRA: rib_link: 3ffe:5:0:a::/64 vrf 0: rn 0xce1330, rib 0xce13f0
2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: called rib_addnode (0xce1330, 0xce13f0) on new RIB entry
2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: dumping RIB entry 0xce13f0 for 3ffe:5:0:a::/64 vrf 0
2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: refcnt == 0, uptime == 1446896709, type == 9, table == 0
2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: metric == 100, distance == 200, flags == 9, status == 0
2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: nexthop_num == 1, nexthop_active_num == 0, nexthop_fib_num == 0
2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: NH fe80::21c:42ff:fe82:2070 with flags
2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: dump complete
2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: calling rib_delnode (0xce1330, 0xce1f50) on existing RIB entry
2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: dumping RIB entry 0xce1f50 for 3ffe:5:0:a::/64 vrf 0
2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: refcnt == 0, uptime == 1446896708, type == 9, table == 0
2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: metric == 100, distance == 200, flags == 25, status == 0
2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: nexthop_num == 1, nexthop_active_num == 1, nexthop_fib_num == 0
2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: NH fe80::21c:42ff:feda:3815 with flags ACTIVE FIB




On 6 Nov 2015, at 8:57, Donald Sharp wrote:

From: Ayan Banerjee <[email protected]>

Signed-off-by: Ayan Banerjee <[email protected]>
Signed-off-by: Dinesh G Dutt <[email protected]>
Reviewed-by: Scott Feldman <[email protected]>
---
bgpd/bgp_main.c  |    2 +
bgpd/bgp_vty.c   |    6 +++
bgpd/bgp_zebra.c | 130 +++++++++++++++++++++++++++++++++++++++++++++---------
bgpd/bgp_zebra.h |    2 +
4 files changed, 120 insertions(+), 20 deletions(-)

diff --git a/bgpd/bgp_main.c b/bgpd/bgp_main.c
index 7c2988c..13e0dea 100644
--- a/bgpd/bgp_main.c
+++ b/bgpd/bgp_main.c
@@ -300,6 +300,8 @@ bgp_exit (int status)
  zclient_free (zlookup);
if (bgp_nexthop_buf)
  stream_free (bgp_nexthop_buf);
+  if (bgp_ifindices_buf)
+    stream_free (bgp_ifindices_buf);

/* reverse bgp_master_init */
if (master)
diff --git a/bgpd/bgp_vty.c b/bgpd/bgp_vty.c
index 4fd255f..3f2c49a 100644
--- a/bgpd/bgp_vty.c
+++ b/bgpd/bgp_vty.c
@@ -9179,12 +9179,18 @@ bgp_vty_init (void)
install_element (BGP_IPV4_NODE, &bgp_maxpaths_cmd);
install_element (BGP_IPV4_NODE, &no_bgp_maxpaths_cmd);
install_element (BGP_IPV4_NODE, &no_bgp_maxpaths_arg_cmd);
+  install_element (BGP_IPV6_NODE, &bgp_maxpaths_cmd);
+  install_element (BGP_IPV6_NODE, &no_bgp_maxpaths_cmd);
+  install_element (BGP_IPV6_NODE, &no_bgp_maxpaths_arg_cmd);
install_element (BGP_NODE, &bgp_maxpaths_ibgp_cmd);
install_element (BGP_NODE, &no_bgp_maxpaths_ibgp_cmd);
install_element (BGP_NODE, &no_bgp_maxpaths_ibgp_arg_cmd);
install_element (BGP_IPV4_NODE, &bgp_maxpaths_ibgp_cmd);
install_element (BGP_IPV4_NODE, &no_bgp_maxpaths_ibgp_cmd);
install_element (BGP_IPV4_NODE, &no_bgp_maxpaths_ibgp_arg_cmd);
+  install_element (BGP_IPV6_NODE, &bgp_maxpaths_ibgp_cmd);
+  install_element (BGP_IPV6_NODE, &no_bgp_maxpaths_ibgp_cmd);
+  install_element (BGP_IPV6_NODE, &no_bgp_maxpaths_ibgp_arg_cmd);

/* "timers bgp" commands. */
install_element (BGP_NODE, &bgp_timers_cmd);
diff --git a/bgpd/bgp_zebra.c b/bgpd/bgp_zebra.c
index 2616351..5e25da9 100644
--- a/bgpd/bgp_zebra.c
+++ b/bgpd/bgp_zebra.c
@@ -45,6 +45,7 @@ struct in_addr router_id_zebra;

/* Growable buffer for nexthops sent to zebra */
struct stream *bgp_nexthop_buf = NULL;
+struct stream *bgp_ifindices_buf = NULL;

/* Router-id update message from zebra. */
static int
@@ -674,6 +675,7 @@ bgp_zebra_announce (struct prefix *p, struct bgp_info *info, struct bgp *bgp, sa
struct peer *peer;
struct bgp_info *mpinfo;
size_t oldsize, newsize;
+  u_int32_t nhcount;

if (zclient->sock < 0)
  return;
@@ -694,26 +696,27 @@ bgp_zebra_announce (struct prefix *p, struct bgp_info *info, struct bgp *bgp, sa
    || CHECK_FLAG (peer->flags, PEER_FLAG_DISABLE_CONNECTED_CHECK))
  SET_FLAG (flags, ZEBRA_FLAG_INTERNAL);

-  /* resize nexthop buffer size if necessary */
-  if ((oldsize = stream_get_size (bgp_nexthop_buf)) <
-      (sizeof (struct in_addr *) * (bgp_info_mpath_count (info) + 1)))
-    {
-      newsize = (sizeof (struct in_addr *) * (bgp_info_mpath_count (info) + 1));
-      newsize = stream_resize (bgp_nexthop_buf, newsize);
-      if (newsize == oldsize)
-       {
-         zlog_err ("can't resize nexthop buffer");
-         return;
-       }
-    }
-
-  stream_reset (bgp_nexthop_buf);
+  nhcount = 1 + bgp_info_mpath_count (info);

if (p->family == AF_INET)
  {
    struct zapi_ipv4 api;
    struct in_addr *nexthop;

+      /* resize nexthop buffer size if necessary */
+      if ((oldsize = stream_get_size (bgp_nexthop_buf)) <
+          (sizeof (struct in_addr *) * nhcount))
+        {
+          newsize = (sizeof (struct in_addr *) * nhcount);
+          newsize = stream_resize (bgp_nexthop_buf, newsize);
+          if (newsize == oldsize)
+            {
+                 zlog_err ("can't resize nexthop buffer");
+                 return;
+            }
+        }
+      stream_reset (bgp_nexthop_buf);
+
    api.vrf_id = VRF_DEFAULT;
    api.flags = flags;
    nexthop = &info->attr->nexthop;
@@ -729,7 +732,7 @@ bgp_zebra_announce (struct prefix *p, struct bgp_info *info, struct bgp *bgp, sa
    api.message = 0;
    api.safi = safi;
    SET_FLAG (api.message, ZAPI_MESSAGE_NEXTHOP);
-      api.nexthop_num = 1 + bgp_info_mpath_count (info);
+      api.nexthop_num = nhcount;
    api.nexthop = (struct in_addr **)STREAM_DATA (bgp_nexthop_buf);
    api.ifindex_num = 0;
    SET_FLAG (api.message, ZAPI_MESSAGE_METRIC);
@@ -763,16 +766,46 @@ bgp_zebra_announce (struct prefix *p, struct bgp_info *info, struct bgp *bgp, sa
                     (struct prefix_ipv4 *) p, &api);
  }
#ifdef HAVE_IPV6
+
/* We have to think about a IPv6 link-local address curse. */
if (p->family == AF_INET6)
  {
    unsigned int ifindex;
    struct in6_addr *nexthop;
    struct zapi_ipv6 api;
+      int valid_nh_count = 0;
+
+      /* resize nexthop buffer size if necessary */
+      if ((oldsize = stream_get_size (bgp_nexthop_buf)) <
+          (sizeof (struct in6_addr *) * nhcount))
+        {
+          newsize = (sizeof (struct in6_addr *) * nhcount);
+          newsize = stream_resize (bgp_nexthop_buf, newsize);
+          if (newsize == oldsize)
+            {
+              zlog_err ("can't resize nexthop buffer");
+              return;
+            }
+        }
+      stream_reset (bgp_nexthop_buf);
+
+      /* resize ifindices buffer size if necessary */
+      if ((oldsize = stream_get_size (bgp_ifindices_buf)) <
+          (sizeof (unsigned int) * nhcount))
+        {
+          newsize = (sizeof (unsigned int) * nhcount);
+          newsize = stream_resize (bgp_ifindices_buf, newsize);
+          if (newsize == oldsize)
+            {
+              zlog_err ("can't resize nexthop buffer");
+              return;
+            }
+        }
+      stream_reset (bgp_ifindices_buf);

    ifindex = 0;
    nexthop = NULL;
-
+
    assert (info->attr->extra);

    /* Only global address nexthop exists. */
@@ -803,6 +836,62 @@ bgp_zebra_announce (struct prefix *p, struct bgp_info *info, struct bgp *bgp, sa
          else if (info->peer->nexthop.ifp)
            ifindex = info->peer->nexthop.ifp->ifindex;
        }
+      stream_put (bgp_nexthop_buf, &nexthop, sizeof (struct in6_addr *));
+      stream_put (bgp_ifindices_buf, &ifindex, sizeof (unsigned int));
+      valid_nh_count++;
+
+      for (mpinfo = bgp_info_mpath_first (info); mpinfo;
+           mpinfo = bgp_info_mpath_next (mpinfo))
+       {
+          /* Only global address nexthop exists. */
+          if (mpinfo->attr->extra->mp_nexthop_len == 16)
+            {
+              nexthop = &mpinfo->attr->extra->mp_nexthop_global;
+            }
+          /* If both global and link-local address present. */
+                 if (mpinfo->attr->extra->mp_nexthop_len == 32)
+            {
+              /* Workaround for Cisco's nexthop bug.  */
+              if (IN6_IS_ADDR_UNSPECIFIED (&mpinfo->attr->extra->mp_nexthop_global)
+                  && mpinfo->peer->su_remote->sa.sa_family == AF_INET6)
+                {
+                  nexthop = &mpinfo->peer->su_remote->sin6.sin6_addr;
+                }
+              else
+                {
+                  nexthop = &mpinfo->attr->extra->mp_nexthop_local;
+               }
+
+              if (mpinfo->peer->nexthop.ifp)
+                {
+                  ifindex = mpinfo->peer->nexthop.ifp->ifindex;
+                }
+            }
+             if (nexthop == NULL)
+               {
+                 continue;
+               }
+
+          if (IN6_IS_ADDR_LINKLOCAL (nexthop) && ! ifindex)
+               {
+                 if (mpinfo->peer->ifname)
+                {
+                   ifindex = if_nametoindex (mpinfo->peer->ifname);
+               }
+                 else if (mpinfo->peer->nexthop.ifp)
+                       {
+                          ifindex = mpinfo->peer->nexthop.ifp->ifindex;
+                       }
+                }
+             if (ifindex == 0)
+               {
+                 continue;
+               }
+
+          stream_put (bgp_nexthop_buf, &nexthop, sizeof (struct in6_addr *));
+          stream_put (bgp_ifindices_buf, &ifindex, sizeof (unsigned int));
+          valid_nh_count++;
+       }

    /* Make Zebra API structure. */
    api.vrf_id = VRF_DEFAULT;
@@ -811,11 +900,11 @@ bgp_zebra_announce (struct prefix *p, struct bgp_info *info, struct bgp *bgp, sa
    api.message = 0;
    api.safi = safi;
    SET_FLAG (api.message, ZAPI_MESSAGE_NEXTHOP);
-      api.nexthop_num = 1;
-      api.nexthop = &nexthop;
+      api.nexthop_num = valid_nh_count;
+      api.nexthop = (struct in6_addr **)STREAM_DATA (bgp_nexthop_buf);
    SET_FLAG (api.message, ZAPI_MESSAGE_IFINDEX);
-      api.ifindex_num = 1;
-      api.ifindex = &ifindex;
+      api.ifindex_num = valid_nh_count;
+      api.ifindex = (unsigned int *)STREAM_DATA (bgp_ifindices_buf);
    SET_FLAG (api.message, ZAPI_MESSAGE_METRIC);
    api.metric = info->attr->med;

@@ -1115,4 +1204,5 @@ bgp_zebra_init (void)
#endif /* HAVE_IPV6 */

bgp_nexthop_buf = stream_new(BGP_NEXTHOP_BUF_SIZE);
+  bgp_ifindices_buf = stream_new(BGP_IFINDICES_BUF_SIZE);
}
diff --git a/bgpd/bgp_zebra.h b/bgpd/bgp_zebra.h
index 8099193..466758e 100644
--- a/bgpd/bgp_zebra.h
+++ b/bgpd/bgp_zebra.h
@@ -22,8 +22,10 @@ Boston, MA 02111-1307, USA.  */
#define _QUAGGA_BGP_ZEBRA_H

#define BGP_NEXTHOP_BUF_SIZE (8 * sizeof (struct in_addr *))
+#define BGP_IFINDICES_BUF_SIZE (8 * sizeof (unsigned int))

extern struct stream *bgp_nexthop_buf;
+extern struct stream *bgp_ifindices_buf;

extern void bgp_zebra_init (void);
extern int bgp_if_update_all (void);
--
1.7.10.4
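
The multipath loop in the patch fills two parallel buffers (nexthops and ifindices) and counts only the entries that end up with both values. A minimal standalone sketch of that pattern follows; `demo_path` and `demo_collect_nexthops` are hypothetical names that stand in for the fields read from `struct bgp_info`, and plain arrays are used instead of Quagga's `struct stream`. Each candidate is read into per-iteration locals, so a value from one path cannot carry over to the next.

```c
#include <stddef.h>

/* Hypothetical candidate path (assumption: stands in for the nexthop and
   interface information that bgpd derives per multipath entry). */
struct demo_path
{
  const char *nexthop;    /* NULL if no usable nexthop was found */
  unsigned int ifindex;   /* 0 if the outgoing interface is unknown */
};

/* Copy every usable (nexthop, ifindex) pair into the two parallel arrays.
   Entries lacking either value are skipped, so the returned count can be
   smaller than npaths.  At most `capacity` pairs are written. */
static size_t
demo_collect_nexthops (const struct demo_path *paths, size_t npaths,
                       const char **nexthops, unsigned int *ifindices,
                       size_t capacity)
{
  size_t valid = 0;
  size_t i;

  for (i = 0; i < npaths && valid < capacity; i++)
    {
      /* Fresh locals for each candidate path. */
      const char *nexthop = paths[i].nexthop;
      unsigned int ifindex = paths[i].ifindex;

      if (nexthop == NULL || ifindex == 0)
        continue;

      nexthops[valid] = nexthop;
      ifindices[valid] = ifindex;
      valid++;
    }

  return valid;
}
```

The returned count plays the role of `valid_nh_count`: it is the value that would be used for both the nexthop count and the ifindex count, so the two arrays always stay the same length.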


_______________________________________________
Quagga-dev mailing list
[email protected]
https://lists.quagga.net/mailman/listinfo/quagga-dev
