On 11/12/07, Claudio Jeker <[EMAIL PROTECTED]> wrote:
>
> On Tue, Nov 06, 2007 at 06:26:47PM +0100, Tony Sarendal wrote:
> > New version. Less duplication and a nice feature as bonus.
> > With "softreconfig in" enabled, the looped prefixes are accepted
> > into the Adj-RIB-In.
> >
> > This means that I can tell if my neighbor AS is using
> > a path via myself. Either I'm tired or that is cool.
> >
> > router-02# bgpctl show rib 192.168.0.0
> > flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> > origin: i = IGP, e = EGP, ? = Incomplete
> >
> > flags destination gateway lpref med aspath origin
> > *> 192.168.0.0/16 192.168.100.5 100 0 65100 i
> > * 192.168.0.0/16 172.17.1.1 100 0 65200 65100 i
> > * 192.168.0.0/16 172.17.1.5 100 0 65200 65200 65200 65200 65100 i
> > router-02#
> >
> > I now kill the peering that 65200 has to 65100, removing their
> > direct path to 192.168.0.0/16.
> >
> > router-02# bgpctl show rib 192.168.0.0
> > flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> > origin: i = IGP, e = EGP, ? = Incomplete
> >
> > flags destination gateway lpref med aspath origin
> > *> 192.168.0.0/16 192.168.100.5 100 0 65100 i
> > router-02#
> >
> > Sweet, the looping issue is gone.
> > Here is the bonus:
> >
> > router-02# bgpctl show rib neigh 172.17.1.5 in | grep 65300
> > * 172.17.0.2/32 172.17.1.5 100 0 65200 65300 i
> > * 192.168.0.0/16 172.17.1.5 100 0 65200 65300 65100 i
> > * 192.168.100.4/30 172.17.1.5 100 0 65200 65300 i
> > router-02#
> >
> > I now see the paths that the peer uses my network to access.
> > Note that this depends a bit on remote implementation.
> > I think this works against a Cisco router.
> >
> > /Tony
> >
> >
> > Index: rde.c
> > ===================================================================
> > RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
> > retrieving revision 1.228
> > diff -u -r1.228 rde.c
> > --- rde.c 16 Sep 2007 15:20:50 -0000 1.228
> > +++ rde.c 6 Nov 2007 17:08:50 -0000
> > @@ -919,12 +919,6 @@
> > /* shift to NLRI information */
> > p += 2 + attrpath_len;
> >
> > - /* aspath needs to be loop free nota bene this is not a hard error */
> > - if (peer->conf.ebgp && !aspath_loopfree(asp->aspath, conf->as)) {
> > - error = 0;
> > - goto done;
> > - }
> > -
> > /* parse nlri prefix */
> > while (nlri_len > 0) {
> > if ((pos = rde_update_get_prefix(p, nlri_len, &prefix,
> > @@ -977,10 +971,18 @@
> > if (fasp == NULL)
> > fasp = asp;
> >
> > - rde_update_log("update", peer, &fasp->nexthop->exit_nexthop,
> > - &prefix, prefixlen);
> > - path_update(peer, fasp, &prefix, prefixlen, F_LOCAL);
> > -
> > + rde_update_log("update", peer,
> > + &fasp->nexthop->exit_nexthop,&prefix,
> > + prefixlen);
> > + /* handle an update with loop as a withdraw */
> > + if (peer->conf.ebgp && !aspath_loopfree(asp->aspath,
> > + conf->as))
> > + prefix_remove(peer, &prefix, prefixlen,
> > + F_LOCAL);
> > + else
> > + path_update(peer, fasp, &prefix, prefixlen,
> > + F_LOCAL);
> > +
> > /* free modified aspath */
> > if (fasp != asp)
> > path_put(fasp);
> > @@ -1075,9 +1077,15 @@
> >
> > rde_update_log("update", peer,
> > &asp->nexthop->exit_nexthop,
> > - &prefix, prefixlen);
> > - path_update(peer, fasp, &prefix, prefixlen,
> > - F_LOCAL);
> > + &prefix, prefixlen);
> > + /* handle an update with loop as a withdraw */
> > + if (peer->conf.ebgp &&
> > + !aspath_loopfree(asp->aspath,conf->as))
> > + prefix_remove(peer, &prefix,
> > + prefixlen,F_LOCAL);
> > + else
> > + path_update(peer, fasp, &prefix,
> > + prefixlen,F_LOCAL);
> >
> > /* free modified aspath */
> > if (fasp != asp)
>
> I looked a bit closer at this problem, and the RFC mentions that paths
> with loops need to be inserted into the RIB and will be ignored in phase 2
> of the decision process.
>
> So this diff does just about that. It does not remove any prefix if there
> is a loop, but instead ignores looped paths during the route decision
> process. This seems to work for me, but I'm currently unable to do larger
> tests.
>
> --
> :wq Claudio
>
> Index: rde.c
> ===================================================================
> RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
> retrieving revision 1.228
> diff -u -p -r1.228 rde.c
> --- rde.c 16 Sep 2007 15:20:50 -0000 1.228
> +++ rde.c 6 Nov 2007 18:27:42 -0000
> @@ -920,10 +920,8 @@ rde_update_dispatch(struct imsg *imsg)
> p += 2 + attrpath_len;
>
> /* aspath needs to be loop free nota bene this is not a hard error */
> - if (peer->conf.ebgp && !aspath_loopfree(asp->aspath, conf->as)) {
> - error = 0;
> - goto done;
> - }
> + if (peer->conf.ebgp && !aspath_loopfree(asp->aspath, conf->as))
> + asp->flags |= F_ATTR_ASLOOP;
>
> /* parse nlri prefix */
> while (nlri_len > 0) {
> Index: rde.h
> ===================================================================
> RCS file: /cvs/src/usr.sbin/bgpd/rde.h,v
> retrieving revision 1.100
> diff -u -p -r1.100 rde.h
> --- rde.h 1 Jun 2007 04:17:30 -0000 1.100
> +++ rde.h 6 Nov 2007 19:17:56 -0000
> @@ -154,6 +154,7 @@ LIST_HEAD(prefix_head, prefix);
> #define F_ATTR_MP_REACH 0x00040
> #define F_ATTR_MP_UNREACH 0x00080
> #define F_ATTR_AS4BYTE_NEW 0x00100 /* NEW_ASPATH or NEW_AGGREGATOR */
> +#define F_ATTR_ASLOOP 0x00200
> #define F_PREFIX_ANNOUNCED 0x01000
> #define F_NEXTHOP_REJECT 0x02000
> #define F_NEXTHOP_BLACKHOLE 0x04000
> Index: rde_decide.c
> ===================================================================
> RCS file: /cvs/src/usr.sbin/bgpd/rde_decide.c,v
> retrieving revision 1.48
> diff -u -p -r1.48 rde_decide.c
> --- rde_decide.c 11 May 2007 11:27:59 -0000 1.48
> +++ rde_decide.c 12 Nov 2007 05:43:20 -0000
> @@ -120,6 +120,12 @@ prefix_cmp(struct prefix *p1, struct pre
> return (-1);
> if (!(p2->flags & F_LOCAL))
> return (1);
> +
> + /* only loop free pathes are eligible */
> + if (p1->flags & F_ATTR_ASLOOP)
> + return (-1);
> + if (p2->flags & F_ATTR_ASLOOP)
> + return (1);
>
> asp1 = p1->aspath;
> asp2 = p2->aspath;
> @@ -239,8 +245,8 @@ prefix_evaluate(struct prefix *p, struct
>
> xp = LIST_FIRST(&pte->prefix_h);
> if (xp == NULL || !(xp->flags & F_LOCAL) ||
> - (xp->aspath->nexthop != NULL && xp->aspath->nexthop->state !=
> - NEXTHOP_REACH))
> + (xp->flags & F_ATTR_ASLOOP) || (xp->aspath->nexthop != NULL &&
> + xp->aspath->nexthop->state != NEXTHOP_REACH))
> /* xp is ineligible */
> xp = NULL;
>
>
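For readers following the diff above, here is a minimal standalone sketch
of the idea: keep a looped path in the RIB, flag it, and make it lose
during selection instead of dropping it on input. This is not bgpd code.
The struct, the flat AS path array, the SKETCH_F_ASLOOP flag and all
sketch_* function names are assumptions made up for illustration, and the
"shorter path wins" tie-break only stands in for the real decision process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag, playing the role F_ATTR_ASLOOP plays in the diff. */
#define SKETCH_F_ASLOOP	0x01

/* Hypothetical, simplified route entry: a flat AS path plus flags. */
struct sketch_route {
	const uint32_t	*aspath;	/* AS numbers, neighbor AS first */
	size_t		 aspath_len;	/* number of entries in aspath */
	unsigned int	 flags;
};

/* Return 1 if local_as does not occur anywhere in the path, else 0. */
int
sketch_loopfree(const uint32_t *aspath, size_t len, uint32_t local_as)
{
	size_t i;

	assert(aspath != NULL || len == 0);
	for (i = 0; i < len; i++)
		if (aspath[i] == local_as)
			return (0);
	return (1);
}

/* Keep the route, but remember that its path loops through local_as. */
void
sketch_mark(struct sketch_route *r, uint32_t local_as)
{
	if (!sketch_loopfree(r->aspath, r->aspath_len, local_as))
		r->flags |= SKETCH_F_ASLOOP;
}

/*
 * Compare two routes: > 0 if a is preferred, < 0 if b is preferred.
 * A looped route always loses against a loop-free one. Otherwise the
 * shorter AS path wins (a stand-in for the full decision process).
 */
int
sketch_cmp(const struct sketch_route *a, const struct sketch_route *b)
{
	int aloop = (a->flags & SKETCH_F_ASLOOP) != 0;
	int bloop = (b->flags & SKETCH_F_ASLOOP) != 0;

	if (aloop != bloop)
		return (aloop ? -1 : 1);
	if (a->aspath_len != b->aspath_len)
		return (a->aspath_len < b->aspath_len ? 1 : -1);
	return (0);
}

/* A looped route must never be selected, even if it is the only one. */
int
sketch_eligible(const struct sketch_route *r)
{
	return ((r->flags & SKETCH_F_ASLOOP) == 0);
}
```

With a local AS of 65300 (inferred from the grep in the first mail), the
path "65200 65300 65100" would be stored and flagged, while "65100" stays
eligible and wins the comparison. The eligibility check matters
separately from the comparison: when the looped path is the only entry,
sorting alone would still leave it at the head of the list.
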
as4 advertises 172.19.0.0/16 to as2.
as1, as2 and as3 are configured in a triangle, with a primary/standby
peering between as2 and as3.
See below how router as2 is out of sync with as3.
as2# date ; bgpctl show rib 172.19.0.0/16
Mon Nov 12 08:51:59 GMT 2007
flags: * = Valid, > = Selected, I = via IBGP, A = Announced
origin: i = IGP, e = EGP, ? = Incomplete
flags destination gateway lpref med aspath origin
*> 172.19.0.0/16 172.17.1.10 100 0 4 i
* 172.19.0.0/16 172.17.1.6 100 0 3 3 3 2 4 i
as2#
as3# date ; bgpctl show rib 172.19.0.0/16
Mon Nov 12 08:52:13 GMT 2007
flags: * = Valid, > = Selected, I = via IBGP, A = Announced
origin: i = IGP, e = EGP, ? = Incomplete
flags destination gateway lpref med aspath origin
*> 172.19.0.0/16 172.17.1.1 100 0 2 4 i
* 172.19.0.0/16 192.168.1.5 100 0 1 2 4 i
* 172.19.0.0/16 172.17.1.5 100 0 2 2 2 4 i
as3#
I shut down the as2-as4 peering:
as2# date ; bgpctl show rib 172.19.0.0/16
Mon Nov 12 08:53:07 GMT 2007
flags: * = Valid, > = Selected, I = via IBGP, A = Announced
origin: i = IGP, e = EGP, ? = Incomplete
flags destination gateway lpref med aspath origin
*> 172.19.0.0/16 172.17.1.6 100 0 3 3 3 2 1 3 2 2 2 1 3
2 1 3 2 2 2 3 1 2 3 2 2 2 3 1 2 3 2 2 2 3 1 2 3 1 2 3 1 2 3 3 3 1 2 3 1 2 3
3 3 1 2 3 2 2 2 3 1 2 3 3 3 2 3 3 3 1 2 3 1 2 3 3 3 1 2 3 2 2 2 1 3 2 1 3 2
1 3 2 1 3 2 2 2 1 3 2 3 3 3 2 1 3 2 1 3 2 3 3 3 2 3 3 3 2 1 3 2 2 2 3 2 2 2
3 2 2 2 3 2 2 2 1 3 2 1 3 2 2 2 1 3 2 1 3 2 1 3 2 1 3 2 3 3 3 1 2 3 1 2 4 i
as2# date ; bgpctl show rib 172.19.0.0/16
Mon Nov 12 08:53:09 GMT 2007
flags: * = Valid, > = Selected, I = via IBGP, A = Announced
origin: i = IGP, e = EGP, ? = Incomplete
flags destination gateway lpref med aspath origin
*> 172.19.0.0/16 192.168.1.1 100 0 1 3 2 2 2 1 3 2 2 2 1
3 2 3 3 3 2 3 3 3 1 2 3 1 2 3 1 2 3 2 2 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1
2 3 2 2 2 1 3 2 1 3 2 3 3 3 2 3 3 3 2 1 3 2 1 3 2 3 3 3 2 3 3 3 2 3 3 3 2 1
3 2 2 2 1 3 2 1 3 2 2 2 3 1 2 3 2 2 2 3 1 2 3 2 2 2 3 1 2 3 1 2 3 1 2 3 3 3
1 2 3 1 2 3 3 3 1 2 3 2 2 2 3 1 2 3 3 3 2 3 3 3 1 2 3 1 2 3 3 3 1 2 3 2 2 2
1 3 2 1 3 2 1 3 2 1 3 2 2 2 1 3 2 3 3 3 2 1 3 2 1 3 2 3 3 3 2 3 3 3 2 1 3 2
2 2 3 2 2 2 3 2 2 2 3 2 2 2 1 3 2 1 3 2 2 2 1 3 2 1 3 2 1 3 2 1 3 2 3 3 3 1
2 3 1 2 4 i
as2# date ; bgpctl show rib 172.19.0.0/16
Mon Nov 12 08:53:09 GMT 2007
flags: * = Valid, > = Selected, I = via IBGP, A = Announced
origin: i = IGP, e = EGP, ? = Incomplete
flags destination gateway lpref med aspath origin
as2#
bgpd now crashes:
Nov 12 08:53:13 as3 bgpd[24367]: fatal in RDE: aspath_count: would overflow
Nov 12 08:53:13 as3 bgpd[27761]: Lost child: route decision engine exited
Nov 12 08:53:13 as3 bgpd[14219]: fatal in SE: session_dispatch_imsg: pipe closed: Connection refused
Nov 12 08:53:13 as3 bgpd[27761]: can't remove connected route from interface with index 0: not found
The crash does not happen every time; sometimes the network handles this
fine after the initial bursts of updates and loops. The last test I did,
flapping as2-as4, crashed every bgpd in the network, except for the
stranded as4 of course.
I will look closer at this later, hopefully later today.
I do have a more detailed report with timestamps and matching tcpdumps
if you want it, otherwise I'll dig more on my side and get back to you.
/Tony