On 11/4/07, Tony Sarendal <[EMAIL PROTECTED]> wrote:
>
> On 11/4/07, Tony Sarendal <[EMAIL PROTECTED]> wrote:
>
> >
> > In a BGP-only design, bgpd does not re-route correctly when I shut down
> > a transit, which black-holes some prefixes.
> >
> > router-01 and router-02 are in the same AS and peer with the same
> > transit provider.
> > router-01 and router-02 have two ibgp peerings, primary and standby
> > path.
> > router-01 sets localpref 60 on all transit prefixes, router-02 sets
> > local-pref 50.
> > When I take down the transit on router-01 I see this on router-02:
> >
> > router-02# bgpctl show rib | head -n 10
> > flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> > origin: i = IGP, e = EGP, ? = Incomplete
> >
> > flags destination gateway lpref med aspath origin
> > I*> 26.0.128.0/17 172.17.1.1 60 11100 65100 i
> > * 26.0.128.0/17 192.168.100.5 50 10100 65100 i
> > I*> 26.0.144.0/22 172.17.1.1 60 11100 65100 i
> > * 26.0.144.0/22 192.168.100.5 50 10100 65100 i
> > I*> 26.1.77.0/24 172.17.1.1 60 11100 65100 i
> > * 26.1.77.0/24 192.168.100.5 50 10100 65100 i
> > router-02#
> >
> > router-02 still selects the paths with local-pref 60 pointing at
> > router-01. router-01 does not have its transit peering up, and thus
> > itself has no prefixes with local-pref 60.
> >
> > router-01# bgpctl show rib | head -n 10
> >
> > flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> > origin: i = IGP, e = EGP, ? = Incomplete
> >
> > flags destination gateway lpref med aspath origin
> > I*> 26.0.128.0/17 172.17.1.6 50 21100 65100 i
> > I*> 26.0.144.0/22 172.17.1.6 50 21100 65100 i
> > I*> 26.1.77.0/24 172.17.1.6 50 21100 65100 i
> > I*> 26.2.172.0/22 172.17.1.6 50 21100 65100 i
> > I*> 26.3.241.0/24 172.17.1.6 50 21100 65100 i
> > I*> 26.6.126.0/24 172.17.1.6 50 21100 65100 i
> > router-01# bgpctl show rib 26.0.128.0/17 all
> > flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> > origin: i = IGP, e = EGP, ? = Incomplete
> >
> > flags destination gateway lpref med aspath origin
> > I*> 26.0.128.0/17 172.17.1.6 50 21100 65100 i
> > I*> 26.0.144.0/22 172.17.1.6 50 21100 65100 i
> > router-01#
> >
> > I saw this when I tested bgpd around a year ago, so it isn't a new bug.
> > This is with 4.2-RELEASE, no patches.
> >
> > This info is from a lab I set up to replicate a live environment.
> >
> >
> > /Tony
> >
> >
> > router-01# cat /etc/bgpd.conf
> > # $OpenBSD: bgpd.conf,v 1.8 2007/03/29 13:37:35 claudio Exp $
> > # sample bgpd configuration file
> > # see bgpd.conf(5)
> >
> > #macros
> > loopback="172.17.0.1"
> >
> > # global configuration
> > AS 65200
> > router-id $loopback
> >
> > network $loopback/32 set {localpref 120, med 10}
> > network 172.17.0.0/16 set {localpref 120, med 10}
> > network connected set {localpref 120, med 10}
> > network static set {localpref 120, med 10}
> >
> > group "TRANSIT" {
> > remote-as 65100
> > announce all
> > set nexthop self
> > set med 10100
> > set localpref 60
> > neighbor 192.168.100.1 {
> > descr "TRANSIT"
> > }
> > }
> >
> > group "IBGP" {
> > remote-as 65200
> > route-reflector
> > set nexthop self
> > set med +1000
> > neighbor 172.17.1.2 {
> > local-address 172.17.1.1
> > descr "router-02 primary"
> > }
> > neighbor 172.17.1.6 {
> > local-address 172.17.1.5
> > descr "router-02 standby"
> > set med +10000
> > }
> > }
> >
> >
> > # filter
> > deny from any
> > deny to any
> >
> > allow quick to group "IBGP"
> > allow quick from group "IBGP"
> >
> > allow quick to group "TRANSIT" prefix 172.17.0.0/16
> > allow quick from group "TRANSIT"
> >
> > router-01# ifconfig
> >
> > lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33208
> > groups: lo
> > inet 127.0.0.1 netmask 0xff000000
> > inet6 ::1 prefixlen 128
> > inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
> > ne3: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> > lladdr 52:54:00:12:02:01
> > description: transit
> > media: Ethernet 10baseT full-duplex
> > inet6 fe80::5054:ff:fe12:201%ne3 prefixlen 64 scopeid 0x1
> > inet 192.168.100.2 netmask 0xfffffffc broadcast 192.168.100.3
> > ne4: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> > lladdr 52:54:00:12:02:02
> > description: router-01 primary path
> > media: Ethernet 10baseT full-duplex
> > inet6 fe80::5054:ff:fe12:202%ne4 prefixlen 64 scopeid 0x2
> > inet 172.17.1.1 netmask 0xfffffffc broadcast 172.17.1.3
> > ne5: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> > lladdr 52:54:00:12:02:03
> > description: route-02 standby path
> > media: Ethernet 10baseT full-duplex
> > inet6 fe80::5054:ff:fe12:203%ne5 prefixlen 64 scopeid 0x3
> > inet 172.17.1.5 netmask 0xfffffffc broadcast 172.17.1.7
> > enc0: flags=0<> mtu 1536
> > lo1: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33208
> > description: ROUTING LOOPBACK
> > groups: lo
> > inet 172.17.0.1 netmask 0xffffffff
> > router-01#
> >
> >
> >
> >
> >
> > router-02# cat /etc/bgpd.conf
> > # $OpenBSD: bgpd.conf,v 1.8 2007/03/29 13:37:35 claudio Exp $
> > # sample bgpd configuration file
> > # see bgpd.conf(5)
> >
> > #macros
> > loopback="172.17.0.2"
> >
> > # global configuration
> > AS 65200
> > router-id $loopback
> >
> > network $loopback/32 set {localpref 120, med 10}
> > network 172.17.0.0/16 set {localpref 120, med 10}
> > network connected set {localpref 120, med 10}
> > network static set {localpref 120, med 10}
> >
> > group "TRANSIT" {
> > remote-as 65100
> > announce all
> > set nexthop self
> > set med 10100
> > set localpref 50
> > neighbor 192.168.100.5 {
> > descr "TRANSIT"
> > }
> > }
> >
> > group "IBGP" {
> > remote-as 65200
> > route-reflector
> > set nexthop self
> > set med +1000
> > neighbor 172.17.1.1 {
> > local-address 172.17.1.2
> > descr "router-01 primary"
> > }
> > neighbor 172.17.1.5 {
> > local-address 172.17.1.6
> > descr "router-01 standby"
> > set med +10000
> > }
> > }
> >
> >
> > # filter
> > deny from any
> > deny to any
> >
> > allow quick to group "IBGP"
> > allow quick from group "IBGP"
> >
> > allow quick to group "TRANSIT" prefix 172.17.0.0/16
> > allow quick from group "TRANSIT"
> >
> > router-02# ifconfig
> >
> > lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33208
> > groups: lo
> > inet 127.0.0.1 netmask 0xff000000
> > inet6 ::1 prefixlen 128
> > inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
> > ne3: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> > lladdr 52:54:00:12:03:01
> > description: transit
> > media: Ethernet 10baseT full-duplex
> > inet6 fe80::5054:ff:fe12:301%ne3 prefixlen 64 scopeid 0x1
> > inet 192.168.100.6 netmask 0xfffffffc broadcast 192.168.100.7
> > ne4: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> > lladdr 52:54:00:12:03:02
> > description: router-02 primary path
> > media: Ethernet 10baseT full-duplex
> > inet6 fe80::5054:ff:fe12:302%ne4 prefixlen 64 scopeid 0x2
> > inet 172.17.1.2 netmask 0xfffffffc broadcast 172.17.1.3
> > ne5: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> > lladdr 52:54:00:12:03:03
> > description: router-02 standby path
> > media: Ethernet 10baseT full-duplex
> > inet6 fe80::5054:ff:fe12:303%ne5 prefixlen 64 scopeid 0x3
> > inet 172.17.1.6 netmask 0xfffffffc broadcast 172.17.1.7
> > enc0: flags=0<> mtu 1536
> > lo1: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33208
> > groups: lo
> > inet 172.17.0.2 netmask 0xffffffff
> > router-02#
>
>
>
> Same behaviour in the latest snapshot.
>
> This black-hole happens when router-01, instead of sending a WITHDRAW to
> router-02 on the primary peering, immediately sends an UPDATE for the same
> prefix. The new path in that UPDATE goes via the standby peering to
> router-02, so router-02 drops the UPDATE because the ORIGINATOR_ID on the
> prefix is router-02 itself.
>
> router-02 has now missed the fact that router-01 changed path for the
> prefix, and a black-hole is in place. This does not happen every time:
> sometimes router-01 withdraws on both the primary and standby peerings and
> the network converges with connectivity intact.
>
> Time to read some RFCs.
>
I'm too tired for this...
Simplified, and more correct: in a route-reflecting loop a router may miss
a topology change when a prefix is replaced with a single UPDATE instead of
WITHDRAW+UPDATE, and the new UPDATE carries an ORIGINATOR_ID or
CLUSTER_LIST that makes the receiver drop it.
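
The failure mode can be sketched with a toy model. This is only an
illustration, not bgpd's code: the Route and AdjRibIn classes, the
implicit_withdraw_on_drop switch and the simulate() helper are all made up
for this example, and best-path selection is reduced to "highest localpref
wins". It models what router-02 keeps from router-01 for 26.0.128.0/17.
Since a new UPDATE for a prefix replaces the earlier route from the same
peer, dropping the looped UPDATE without also forgetting the old route
leaves the stale localpref 60 path selected.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Route:
    prefix: str
    nexthop: str
    localpref: int
    originator_id: Optional[str] = None  # set when the route was reflected


class AdjRibIn:
    """Routes learned from a single peer, keyed by prefix (toy model)."""

    def __init__(self, router_id: str, implicit_withdraw_on_drop: bool):
        self.router_id = router_id
        # Hypothetical switch: forget the old route when an UPDATE is dropped.
        self.implicit_withdraw_on_drop = implicit_withdraw_on_drop
        self.routes: Dict[str, Route] = {}

    def update(self, route: Route) -> bool:
        """Process an UPDATE; return True if the route was accepted."""
        if route.originator_id == self.router_id:
            # Reflection loop: the route originated here, so ignore it.
            if self.implicit_withdraw_on_drop:
                # The peer replaced its old path, so the old path is gone too.
                self.routes.pop(route.prefix, None)
            return False
        self.routes[route.prefix] = route
        return True


def best(*candidates: Optional[Route]) -> Optional[Route]:
    """Simplified decision process: highest localpref wins."""
    valid = [r for r in candidates if r is not None]
    return max(valid, key=lambda r: r.localpref, default=None)


def simulate(implicit_withdraw_on_drop: bool) -> Optional[Route]:
    """Return router-02's selected route after router-01 loses its transit."""
    prefix = "26.0.128.0/17"
    router02_id = "172.17.0.2"
    from_router01 = AdjRibIn(router02_id, implicit_withdraw_on_drop)
    own_transit = Route(prefix, "192.168.100.5", 50)

    # Before the failure: router-01 advertises its transit path.
    from_router01.update(Route(prefix, "172.17.1.1", 60))
    # After the failure: router-01 sends no WITHDRAW, only an UPDATE for the
    # path it learned from router-02, reflected with router-02 as originator.
    from_router01.update(Route(prefix, "172.17.1.1", 50, router02_id))

    return best(from_router01.routes.get(prefix), own_transit)


if __name__ == "__main__":
    print(simulate(False))  # stale path via 172.17.1.1 with localpref 60
    print(simulate(True))   # router-02's own transit path via 192.168.100.5
```

With the switch off, router-02 keeps pointing at router-01 although
router-01 now points back at router-02. With it on, the dropped UPDATE is
treated like a WITHDRAW and router-02 falls back to its own transit.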
/Tony