Martin - You are correct. On my flight home last night (4am is the best time to get home!), I found the zebra multipath patch that I forgot to apply! Unfortunately for me, the patch depends on code that was removed because no one was using it on mainline, so I have been working on figuring out whether I can write code around it.
Updates in the next few days and thanks for testing.

donald

On Sat, Nov 7, 2015 at 7:49 AM, Martin Winter <[email protected]> wrote:

> Tested this, but fails to work for me.
>
> I’ve tested it together and without the link local patch (Patch 2/4 in
> your series) on top of current Quagga Master.
>
> BGP does the multi path, but Zebra still only uses one path (the 2nd)
> It looks to me like zebra replaces the 1st route instead of adding the
> nexthop.
>
> If you want to try what I did, then checkout branch test/ipv6_ecmp from my
> bgptool at https://git-us.netdef.org/projects/NETDEF/repos/bgptool
>
> See doc inside README_ipv6_ecmp.md for details on how I test
>
> https://git-us.netdef.org/projects/NETDEF/repos/bgptool/browse/README_ipv6_ecmp.md?at=refs%2Fheads%2Ftest%2Fipv6_ecmp
>
> Below is what I see.
>
> Regards,
> Martin Winter
>
>
>> dut# sh bgp
>> BGP table version is 0, local router ID is 192.168.1.101
>> Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
>>               i internal, r RIB-failure, S Stale, R Removed
>> Origin codes: i - IGP, e - EGP, ? - incomplete
>>
>>    Network          Next Hop            Metric LocPrf Weight Path
>> *=i3ffe:5:0:a::/64  fc00:192:168:2::1
>>                                            100   1000      0 65001 65100 i
>> *>i                 fc00:192:168:1::1
>>                                            100   1000      0 65001 65100 i
>>
>> Total number of prefixes 1
>> dut# sh bgp 3ffe:5:0:a::/64
>> BGP routing table entry for 3ffe:5:0:a::/64
>> Paths: (2 available, best #2, table Default-IP-Routing-Table)
>>   Not advertised to any peer
>>   65001 65100
>>     fc00:192:168:2::1 from fc00:192:168:2::1 (192.168.2.1)
>>     (fe80::21c:42ff:fe82:2070)
>>       Origin IGP, metric 100, localpref 1000, valid, internal, multipath
>>       Last update: Sat Nov 7 03:45:09 2015
>>
>>   65001 65100
>>     fc00:192:168:1::1 from fc00:192:168:1::1 (192.168.1.1)
>>     (fe80::21c:42ff:feda:3815)
>>       Origin IGP, metric 100, localpref 1000, valid, internal, multipath, best
>>       Last update: Sat Nov 7 03:45:08 2015
>>
>> dut# sh ipv6 route
>> Codes: K - kernel route, C - connected, S - static, R - RIPng,
>>        O - OSPFv6, I - IS-IS, B - BGP, A - Babel,
>>        > - selected route, * - FIB route
>>
>> C>* ::1/128 is directly connected, lo
>> B>* 3ffe:5:0:a::/64 [200/100] via fe80::21c:42ff:fe82:2070, eth2, 00:00:41
>> C>* fc00:192:168:1::/64 is directly connected, eth1
>> C>* fc00:192:168:2::/64 is directly connected, eth2
>> C * fe80::/64 is directly connected, eth1
>> C * fe80::/64 is directly connected, eth2
>>
>
> Looking at the bgpd log, I see:
>
> 2015/11/07 03:45:08 BGP: fc00:192:168:1::1 [FSM] Timer (routeadv timer expire)
> 2015/11/07 03:45:08 BGP: fc00:192:168:1::1 rcvd UPDATE w/ attr: , origin i, mp_nexthop fc00:192:168:1::1(fe80::21c:42ff:feda:3815), localpref 1000, metric 100, path 65001 65100
> 2015/11/07 03:45:08 BGP: fc00:192:168:1::1 rcvd 3ffe:5:0:a::/64
> 2015/11/07 03:45:08 BGP: Zebra send: IPv6 route add 3ffe:5:0:a::/64 nexthop fe80::21c:42ff:feda:3815 metric 100
> 2015/11/07 03:45:09 BGP: fc00:192:168:2::1 [FSM] Timer (routeadv timer expire)
> 2015/11/07 03:45:09 BGP: fc00:192:168:2::1 rcvd UPDATE w/ attr: , origin i, mp_nexthop fc00:192:168:2::1(fe80::21c:42ff:fe82:2070), localpref 1000, metric 100, path 65001 65100
> 2015/11/07 03:45:09 BGP: fc00:192:168:2::1 rcvd 3ffe:5:0:a::/64
> 2015/11/07 03:45:09 BGP: 3ffe:5:0:a::/64 add mpath nexthop 0.0.0.0 peer (null)
> 2015/11/07 03:45:09 BGP: Zebra send: IPv6 route add 3ffe:5:0:a::/64 nexthop fe80::21c:42ff:fe82:2070 metric 100
> 2015/11/07 03:45:13 BGP: fc00:192:168:1::1 [FSM] Timer (routeadv timer expire)
>
> and in zebra log:
>
> 2015/11/07 03:45:08 ZEBRA: rib_delnode: 3ffe:5:0:a::/64 vrf 0: rn 0xce1330, rib 0xcd5580, removing
> 2015/11/07 03:45:08 ZEBRA: rib_process: 3ffe:5:0:a::/64 vrf 0: Removing existing route, fib 0xcd5580
> 2015/11/07 03:45:08 ZEBRA: rib_process: 3ffe:5:0:a::/64 vrf 0: Adding route, select 0xce1f50
> 2015/11/07 03:45:08 ZEBRA: rib_process: 3ffe:5:0:a::/64 vrf 0: Deleting fib 0xcd5580, rn 0xce1330
> 2015/11/07 03:45:08 ZEBRA: rib_unlink: 3ffe:5:0:a::/64 vrf 0: rn 0xce1330, rib 0xcd5580
> 2015/11/07 03:45:09 ZEBRA: zebra message comes from socket [15]
> 2015/11/07 03:45:09 ZEBRA: zebra message received [ZEBRA_IPV6_ROUTE_ADD] 63 in VRF 0
> 2015/11/07 03:45:09 ZEBRA: rib_link: 3ffe:5:0:a::/64 vrf 0: rn 0xce1330, rib 0xce13f0
> 2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: called rib_addnode (0xce1330, 0xce13f0) on new RIB entry
> 2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: dumping RIB entry 0xce13f0 for 3ffe:5:0:a::/64 vrf 0
> 2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: refcnt == 0, uptime == 1446896709, type == 9, table == 0
> 2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: metric == 100, distance == 200, flags == 9, status == 0
> 2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: nexthop_num == 1, nexthop_active_num == 0, nexthop_fib_num == 0
> 2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: NH fe80::21c:42ff:fe82:2070 with flags
> 2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: dump complete
> 2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: calling rib_delnode (0xce1330, 0xce1f50) on existing RIB entry
> 2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: dumping RIB entry 0xce1f50 for 3ffe:5:0:a::/64 vrf 0
> 2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: refcnt == 0, uptime == 1446896708, type == 9, table == 0
> 2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: metric == 100, distance == 200, flags == 25, status == 0
> 2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: nexthop_num == 1, nexthop_active_num == 1, nexthop_fib_num == 0
> 2015/11/07 03:45:09 ZEBRA: rib_add_ipv6: NH fe80::21c:42ff:feda:3815 with flags ACTIVE FIB
>
>
> On 6 Nov 2015, at 8:57, Donald Sharp wrote:
>
>> From: Ayan Banerjee <[email protected]>
>>
>> Signed-off-by: Ayan Banerjee <[email protected]>
>> Signed-off-by: Dinesh G Dutt <[email protected]>
>> Reviewed-by: Scott Feldman <[email protected]>
>> ---
>>  bgpd/bgp_main.c  |    2 +
>>  bgpd/bgp_vty.c   |    6 +++
>>  bgpd/bgp_zebra.c |  130 +++++++++++++++++++++++++++++++++++++++++++++--------
>>  bgpd/bgp_zebra.h |    2 +
>>  4 files changed, 120 insertions(+), 20 deletions(-)
>>
>> diff --git a/bgpd/bgp_main.c b/bgpd/bgp_main.c
>> index 7c2988c..13e0dea 100644
>> --- a/bgpd/bgp_main.c
>> +++ b/bgpd/bgp_main.c
>> @@ -300,6 +300,8 @@ bgp_exit (int status)
>>      zclient_free (zlookup);
>>    if (bgp_nexthop_buf)
>>      stream_free (bgp_nexthop_buf);
>> +  if (bgp_ifindices_buf)
>> +    stream_free (bgp_ifindices_buf);
>>
>>    /* reverse bgp_master_init */
>>    if (master)
>> diff --git a/bgpd/bgp_vty.c b/bgpd/bgp_vty.c
>> index 4fd255f..3f2c49a 100644
>> --- a/bgpd/bgp_vty.c
>> +++ b/bgpd/bgp_vty.c
>> @@ -9179,12 +9179,18 @@ bgp_vty_init (void)
>>    install_element (BGP_IPV4_NODE, &bgp_maxpaths_cmd);
>>    install_element (BGP_IPV4_NODE, &no_bgp_maxpaths_cmd);
>>    install_element (BGP_IPV4_NODE, &no_bgp_maxpaths_arg_cmd);
>> +  install_element (BGP_IPV6_NODE, &bgp_maxpaths_cmd);
>> +  install_element (BGP_IPV6_NODE, &no_bgp_maxpaths_cmd);
>> +  install_element (BGP_IPV6_NODE, &no_bgp_maxpaths_arg_cmd);
>>    install_element (BGP_NODE, &bgp_maxpaths_ibgp_cmd);
>>    install_element (BGP_NODE, &no_bgp_maxpaths_ibgp_cmd);
>>    install_element (BGP_NODE, &no_bgp_maxpaths_ibgp_arg_cmd);
>>    install_element (BGP_IPV4_NODE, &bgp_maxpaths_ibgp_cmd);
>>    install_element (BGP_IPV4_NODE, &no_bgp_maxpaths_ibgp_cmd);
>>    install_element (BGP_IPV4_NODE, &no_bgp_maxpaths_ibgp_arg_cmd);
>> +  install_element (BGP_IPV6_NODE, &bgp_maxpaths_ibgp_cmd);
>> +  install_element (BGP_IPV6_NODE, &no_bgp_maxpaths_ibgp_cmd);
>> +  install_element (BGP_IPV6_NODE, &no_bgp_maxpaths_ibgp_arg_cmd);
>>
>>    /* "timers bgp" commands. */
>>    install_element (BGP_NODE, &bgp_timers_cmd);
>> diff --git a/bgpd/bgp_zebra.c b/bgpd/bgp_zebra.c
>> index 2616351..5e25da9 100644
>> --- a/bgpd/bgp_zebra.c
>> +++ b/bgpd/bgp_zebra.c
>> @@ -45,6 +45,7 @@ struct in_addr router_id_zebra;
>>
>>  /* Growable buffer for nexthops sent to zebra */
>>  struct stream *bgp_nexthop_buf = NULL;
>> +struct stream *bgp_ifindices_buf = NULL;
>>
>>  /* Router-id update message from zebra. */
>>  static int
>> @@ -674,6 +675,7 @@ bgp_zebra_announce (struct prefix *p, struct bgp_info *info, struct bgp *bgp, sa
>>    struct peer *peer;
>>    struct bgp_info *mpinfo;
>>    size_t oldsize, newsize;
>> +  u_int32_t nhcount;
>>
>>    if (zclient->sock < 0)
>>      return;
>> @@ -694,26 +696,27 @@ bgp_zebra_announce (struct prefix *p, struct bgp_info *info, struct bgp *bgp, sa
>>        || CHECK_FLAG (peer->flags, PEER_FLAG_DISABLE_CONNECTED_CHECK))
>>      SET_FLAG (flags, ZEBRA_FLAG_INTERNAL);
>>
>> -  /* resize nexthop buffer size if necessary */
>> -  if ((oldsize = stream_get_size (bgp_nexthop_buf)) <
>> -      (sizeof (struct in_addr *) * (bgp_info_mpath_count (info) + 1)))
>> -    {
>> -      newsize = (sizeof (struct in_addr *) * (bgp_info_mpath_count (info) + 1));
>> -      newsize = stream_resize (bgp_nexthop_buf, newsize);
>> -      if (newsize == oldsize)
>> -        {
>> -          zlog_err ("can't resize nexthop buffer");
>> -          return;
>> -        }
>> -    }
>> -
>> -  stream_reset (bgp_nexthop_buf);
>> +  nhcount = 1 + bgp_info_mpath_count (info);
>>
>>    if (p->family == AF_INET)
>>      {
>>        struct zapi_ipv4 api;
>>        struct in_addr *nexthop;
>>
>> +      /* resize nexthop buffer size if necessary */
>> +      if ((oldsize = stream_get_size (bgp_nexthop_buf)) <
>> +          (sizeof (struct in_addr *) * nhcount))
>> +        {
>> +          newsize = (sizeof (struct in_addr *) * nhcount);
>> +          newsize = stream_resize (bgp_nexthop_buf, newsize);
>> +          if (newsize == oldsize)
>> +            {
>> +              zlog_err ("can't resize nexthop buffer");
>> +              return;
>> +            }
>> +        }
>> +      stream_reset (bgp_nexthop_buf);
>> +
>>        api.vrf_id = VRF_DEFAULT;
>>        api.flags = flags;
>>        nexthop = &info->attr->nexthop;
>> @@ -729,7 +732,7 @@ bgp_zebra_announce (struct prefix *p, struct bgp_info *info, struct bgp *bgp, sa
>>        api.message = 0;
>>        api.safi = safi;
>>        SET_FLAG (api.message, ZAPI_MESSAGE_NEXTHOP);
>> -      api.nexthop_num = 1 + bgp_info_mpath_count (info);
>> +      api.nexthop_num = nhcount;
>>        api.nexthop = (struct in_addr **)STREAM_DATA (bgp_nexthop_buf);
>>        api.ifindex_num = 0;
>>        SET_FLAG (api.message, ZAPI_MESSAGE_METRIC);
>> @@ -763,16 +766,46 @@ bgp_zebra_announce (struct prefix *p, struct bgp_info *info, struct bgp *bgp, sa
>>                             (struct prefix_ipv4 *) p, &api);
>>      }
>>  #ifdef HAVE_IPV6
>> +
>>    /* We have to think about a IPv6 link-local address curse. */
>>    if (p->family == AF_INET6)
>>      {
>>        unsigned int ifindex;
>>        struct in6_addr *nexthop;
>>        struct zapi_ipv6 api;
>> +      int valid_nh_count = 0;
>> +
>> +      /* resize nexthop buffer size if necessary */
>> +      if ((oldsize = stream_get_size (bgp_nexthop_buf)) <
>> +          (sizeof (struct in6_addr *) * nhcount))
>> +        {
>> +          newsize = (sizeof (struct in6_addr *) * nhcount);
>> +          newsize = stream_resize (bgp_nexthop_buf, newsize);
>> +          if (newsize == oldsize)
>> +            {
>> +              zlog_err ("can't resize nexthop buffer");
>> +              return;
>> +            }
>> +        }
>> +      stream_reset (bgp_nexthop_buf);
>> +
>> +      /* resize ifindices buffer size if necessary */
>> +      if ((oldsize = stream_get_size (bgp_ifindices_buf)) <
>> +          (sizeof (unsigned int) * nhcount))
>> +        {
>> +          newsize = (sizeof (unsigned int) * nhcount);
>> +          newsize = stream_resize (bgp_ifindices_buf, newsize);
>> +          if (newsize == oldsize)
>> +            {
>> +              zlog_err ("can't resize nexthop buffer");
>> +              return;
>> +            }
>> +        }
>> +      stream_reset (bgp_ifindices_buf);
>>
>>        ifindex = 0;
>>        nexthop = NULL;
>> -
>> +
>>        assert (info->attr->extra);
>>
>>        /* Only global address nexthop exists. */
>> @@ -803,6 +836,62 @@ bgp_zebra_announce (struct prefix *p, struct bgp_info *info, struct bgp *bgp, sa
>>            else if (info->peer->nexthop.ifp)
>>              ifindex = info->peer->nexthop.ifp->ifindex;
>>          }
>> +      stream_put (bgp_nexthop_buf, &nexthop, sizeof (struct in6_addr *));
>> +      stream_put (bgp_ifindices_buf, &ifindex, sizeof (unsigned int));
>> +      valid_nh_count++;
>> +
>> +      for (mpinfo = bgp_info_mpath_first (info); mpinfo;
>> +           mpinfo = bgp_info_mpath_next (mpinfo))
>> +        {
>> +          /* Only global address nexthop exists. */
>> +          if (mpinfo->attr->extra->mp_nexthop_len == 16)
>> +            {
>> +              nexthop = &mpinfo->attr->extra->mp_nexthop_global;
>> +            }
>> +          /* If both global and link-local address present. */
>> +          if (mpinfo->attr->extra->mp_nexthop_len == 32)
>> +            {
>> +              /* Workaround for Cisco's nexthop bug. */
>> +              if (IN6_IS_ADDR_UNSPECIFIED (&mpinfo->attr->extra->mp_nexthop_global)
>> +                  && mpinfo->peer->su_remote->sa.sa_family == AF_INET6)
>> +                {
>> +                  nexthop = &mpinfo->peer->su_remote->sin6.sin6_addr;
>> +                }
>> +              else
>> +                {
>> +                  nexthop = &mpinfo->attr->extra->mp_nexthop_local;
>> +                }
>> +
>> +              if (mpinfo->peer->nexthop.ifp)
>> +                {
>> +                  ifindex = mpinfo->peer->nexthop.ifp->ifindex;
>> +                }
>> +            }
>> +          if (nexthop == NULL)
>> +            {
>> +              continue;
>> +            }
>> +
>> +          if (IN6_IS_ADDR_LINKLOCAL (nexthop) && ! ifindex)
>> +            {
>> +              if (mpinfo->peer->ifname)
>> +                {
>> +                  ifindex = if_nametoindex (mpinfo->peer->ifname);
>> +                }
>> +              else if (mpinfo->peer->nexthop.ifp)
>> +                {
>> +                  ifindex = mpinfo->peer->nexthop.ifp->ifindex;
>> +                }
>> +            }
>> +          if (ifindex == 0)
>> +            {
>> +              continue;
>> +            }
>> +
>> +          stream_put (bgp_nexthop_buf, &nexthop, sizeof (struct in6_addr *));
>> +          stream_put (bgp_ifindices_buf, &ifindex, sizeof (unsigned int));
>> +          valid_nh_count++;
>> +        }
>>
>>        /* Make Zebra API structure. */
>>        api.vrf_id = VRF_DEFAULT;
>> @@ -811,11 +900,11 @@ bgp_zebra_announce (struct prefix *p, struct bgp_info *info, struct bgp *bgp, sa
>>        api.message = 0;
>>        api.safi = safi;
>>        SET_FLAG (api.message, ZAPI_MESSAGE_NEXTHOP);
>> -      api.nexthop_num = 1;
>> -      api.nexthop = &nexthop;
>> +      api.nexthop_num = valid_nh_count;
>> +      api.nexthop = (struct in6_addr **)STREAM_DATA (bgp_nexthop_buf);
>>        SET_FLAG (api.message, ZAPI_MESSAGE_IFINDEX);
>> -      api.ifindex_num = 1;
>> -      api.ifindex = &ifindex;
>> +      api.ifindex_num = valid_nh_count;
>> +      api.ifindex = (unsigned int *)STREAM_DATA (bgp_ifindices_buf);
>>        SET_FLAG (api.message, ZAPI_MESSAGE_METRIC);
>>        api.metric = info->attr->med;
>>
>> @@ -1115,4 +1204,5 @@ bgp_zebra_init (void)
>>  #endif /* HAVE_IPV6 */
>>
>>    bgp_nexthop_buf = stream_new(BGP_NEXTHOP_BUF_SIZE);
>> +  bgp_ifindices_buf = stream_new(BGP_IFINDICES_BUF_SIZE);
>>  }
>> diff --git a/bgpd/bgp_zebra.h b/bgpd/bgp_zebra.h
>> index 8099193..466758e 100644
>> --- a/bgpd/bgp_zebra.h
>> +++ b/bgpd/bgp_zebra.h
>> @@ -22,8 +22,10 @@ Boston, MA 02111-1307, USA. */
>>  #define _QUAGGA_BGP_ZEBRA_H
>>
>>  #define BGP_NEXTHOP_BUF_SIZE (8 * sizeof (struct in_addr *))
>> +#define BGP_IFINDICES_BUF_SIZE (8 * sizeof (unsigned int))
>>
>>  extern struct stream *bgp_nexthop_buf;
>> +extern struct stream *bgp_ifindices_buf;
>>
>>  extern void bgp_zebra_init (void);
>>  extern int bgp_if_update_all (void);
>> --
>> 1.7.10.4
>>
>>
>> _______________________________________________
>> Quagga-dev mailing list
>> [email protected]
>> https://lists.quagga.net/mailman/listinfo/quagga-dev
>>
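
For anyone following the mpath loop in the quoted patch, here is a minimal standalone sketch of the same idea: walk the paths, pick one IPv6 nexthop per path, and emit parallel nexthop/ifindex arrays whose shared count is what would go into nexthop_num and ifindex_num. It is not Quagga code. `struct path_sketch` and `collect_nexthops` are hypothetical names made up for illustration, and the selection rule is simplified (prefer the link-local address when present, and drop only link-local nexthops that have no interface index). One deliberate difference: the sketch resets the nexthop and ifindex for every path, which the quoted loop does not appear to do.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>

/* Hypothetical stand-in for one BGP path; this is not a Quagga type. */
struct path_sketch
{
  struct in6_addr global;      /* global nexthop, :: when absent */
  struct in6_addr link_local;  /* link-local nexthop, :: when absent */
  unsigned int ifindex;        /* 0 when the interface is unknown */
};

/* Fill the parallel arrays nexthops[] and ifindices[] (each with room for
   npaths entries) and return how many usable nexthops were stored.  */
size_t
collect_nexthops (const struct path_sketch *paths, size_t npaths,
                  const struct in6_addr **nexthops, unsigned int *ifindices)
{
  size_t valid = 0;

  assert (paths != NULL && nexthops != NULL && ifindices != NULL);

  for (size_t i = 0; i < npaths; i++)
    {
      /* Start every path from a clean state so nothing leaks over
         from the previous iteration.  */
      const struct in6_addr *nh = NULL;
      unsigned int ifindex = 0;

      if (!IN6_IS_ADDR_UNSPECIFIED (&paths[i].link_local))
        {
          nh = &paths[i].link_local;
          ifindex = paths[i].ifindex;
        }
      else if (!IN6_IS_ADDR_UNSPECIFIED (&paths[i].global))
        {
          nh = &paths[i].global;
        }

      /* No usable address at all: skip this path.  */
      if (nh == NULL)
        continue;

      /* A link-local nexthop is ambiguous without an interface.  */
      if (IN6_IS_ADDR_LINKLOCAL (nh) && ifindex == 0)
        continue;

      nexthops[valid] = nh;
      ifindices[valid] = ifindex;
      valid++;
    }

  return valid;
}
```

With three paths where the second is link-local but has no known interface, the function stores two entries and returns 2, so both arrays stay the same length and stay index-aligned.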
