On 11/4/07, Tony Sarendal <[EMAIL PROTECTED]> wrote:
>
> On 11/4/07, Tony Sarendal <[EMAIL PROTECTED]> wrote:
> >
> > bgpd does not re-route correctly when I shut down a transit when I
> > use a bgp-only design, causing black-holes for some prefixes.
> >
> > router-01 and router-02 are in the same AS and peer with the same
> > transit provider.
> > router-01 and router-02 have two ibgp peerings, primary and standby
> > path.
> > router-01 sets localpref 60 on all transit prefixes, router-02 sets
> > local-pref 50.
> > When I take down the transit on router-01 I see this on router-02:
> >
> > router-02# bgpctl show rib | head -n 10
> > flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> > origin: i = IGP, e = EGP, ? = Incomplete
> >
> > flags destination        gateway          lpref   med aspath origin
> > I*>   26.0.128.0/17      172.17.1.1          60 11100 65100 i
> > *     26.0.128.0/17      192.168.100.5       50 10100 65100 i
> > I*>   26.0.144.0/22      172.17.1.1          60 11100 65100 i
> > *     26.0.144.0/22      192.168.100.5       50 10100 65100 i
> > I*>   26.1.77.0/24       172.17.1.1          60 11100 65100 i
> > *     26.1.77.0/24       192.168.100.5       50 10100 65100 i
> > router-02#
> >
> > router-02 still selects the prefixes with local-pref 60 pointing at
> > router-01.
> > router-01 does not have its transit peering up, and thus itself has no
> > prefixes with local-pref 60.
> >
> > router-01# bgpctl show rib | head -n 10
> >
> > flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> > origin: i = IGP, e = EGP, ? = Incomplete
> >
> > flags destination        gateway          lpref   med aspath origin
> > I*>   26.0.128.0/17      172.17.1.6          50 21100 65100 i
> > I*>   26.0.144.0/22      172.17.1.6          50 21100 65100 i
> > I*>   26.1.77.0/24       172.17.1.6          50 21100 65100 i
> > I*>   26.2.172.0/22      172.17.1.6          50 21100 65100 i
> > I*>   26.3.241.0/24      172.17.1.6          50 21100 65100 i
> > I*>   26.6.126.0/24      172.17.1.6          50 21100 65100 i
> > router-01# bgpctl show rib 26.0.128.0/17 all
> > flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> > origin: i = IGP, e = EGP, ? = Incomplete
> >
> > flags destination        gateway          lpref   med aspath origin
> > I*>   26.0.128.0/17      172.17.1.6          50 21100 65100 i
> > I*>   26.0.144.0/22      172.17.1.6          50 21100 65100 i
> > router-01#
> >
> > I saw this before when I tested bgpd around a year ago, so it isn't a
> > new bug.
> > This is with 4.2-RELEASE, no patches.
> >
> > This info is from a lab I set up to replicate a live environment.
> >
> > /Tony
> >
> >
> > router-01# cat /etc/bgpd.conf
> > # $OpenBSD: bgpd.conf,v 1.8 2007/03/29 13:37:35 claudio Exp $
> > # sample bgpd configuration file
> > # see bgpd.conf(5)
> >
> > #macros
> > loopback="172.17.0.1"
> >
> > # global configuration
> > AS 65200
> > router-id $loopback
> >
> > network $loopback/32 set {localpref 120, med 10}
> > network 172.17.0.0/16 set {localpref 120, med 10}
> > network connected set {localpref 120, med 10}
> > network static set {localpref 120, med 10}
> >
> > group "TRANSIT" {
> >         remote-as 65100
> >         announce all
> >         set nexthop self
> >         set med 10100
> >         set localpref 60
> >         neighbor 192.168.100.1 {
> >                 descr "TRANSIT"
> >         }
> > }
> >
> > group "IBGP" {
> >         remote-as 65200
> >         route-reflector
> >         set nexthop self
> >         set med +1000
> >         neighbor 172.17.1.2 {
> >                 local-address 172.17.1.1
> >                 descr "router-02 primary"
> >         }
> >         neighbor 172.17.1.6 {
> >                 local-address 172.17.1.5
> >                 descr "router-02 standby"
> >                 set med +10000
> >         }
> > }
> >
> >
> > # filter
> > deny from any
> > deny to any
> >
> > allow quick to group "IBGP"
> > allow quick from group "IBGP"
> >
> > allow quick to group "TRANSIT" prefix 172.17.0.0/16
> > allow quick from group "TRANSIT"
> >
> > router-01# ifconfig
> >
> > lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33208
> >         groups: lo
> >         inet 127.0.0.1 netmask 0xff000000
> >         inet6 ::1 prefixlen 128
> >         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
> > ne3: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> >         lladdr 52:54:00:12:02:01
> >         description: transit
> >         media: Ethernet 10baseT full-duplex
> >         inet6 fe80::5054:ff:fe12:201%ne3 prefixlen 64 scopeid 0x1
> >         inet 192.168.100.2 netmask 0xfffffffc broadcast 192.168.100.3
> > ne4: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> >         lladdr 52:54:00:12:02:02
> >         description: router-01 primary path
> >         media: Ethernet 10baseT full-duplex
> >         inet6 fe80::5054:ff:fe12:202%ne4 prefixlen 64 scopeid 0x2
> >         inet 172.17.1.1 netmask 0xfffffffc broadcast 172.17.1.3
> > ne5: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> >         lladdr 52:54:00:12:02:03
> >         description: route-02 standby path
> >         media: Ethernet 10baseT full-duplex
> >         inet6 fe80::5054:ff:fe12:203%ne5 prefixlen 64 scopeid 0x3
> >         inet 172.17.1.5 netmask 0xfffffffc broadcast 172.17.1.7
> > enc0: flags=0<> mtu 1536
> > lo1: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33208
> >         description: ROUTING LOOPBACK
> >         groups: lo
> >         inet 172.17.0.1 netmask 0xffffffff
> > router-01#
> >
> >
> > router-02# cat /etc/bgpd.conf
> > # $OpenBSD: bgpd.conf,v 1.8 2007/03/29 13:37:35 claudio Exp $
> > # sample bgpd configuration file
> > # see bgpd.conf(5)
> >
> > #macros
> > loopback="172.17.0.2"
> >
> > # global configuration
> > AS 65200
> > router-id $loopback
> >
> > network $loopback/32 set {localpref 120, med 10}
> > network 172.17.0.0/16 set {localpref 120, med 10}
> > network connected set {localpref 120, med 10}
> > network static set {localpref 120, med 10}
> >
> > group "TRANSIT" {
> >         remote-as 65100
> >         announce all
> >         set nexthop self
> >         set med 10100
> >         set localpref 50
> >         neighbor 192.168.100.5 {
> >                 descr "TRANSIT"
> >         }
> > }
> >
> > group "IBGP" {
> >         remote-as 65200
> >         route-reflector
> >         set nexthop self
> >         set med +1000
> >         neighbor 172.17.1.1 {
> >                 local-address 172.17.1.2
> >                 descr "router-01 primary"
> >         }
> >         neighbor 172.17.1.5 {
> >                 local-address 172.17.1.6
> >                 descr "router-01 standby"
> >                 set med +10000
> >         }
> > }
> >
> >
> > # filter
> > deny from any
> > deny to any
> >
> > allow quick to group "IBGP"
> > allow quick from group "IBGP"
> >
> > allow quick to group "TRANSIT" prefix 172.17.0.0/16
> > allow quick from group "TRANSIT"
> >
> > router-02# ifconfig
> >
> > lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33208
> >         groups: lo
> >         inet 127.0.0.1 netmask 0xff000000
> >         inet6 ::1 prefixlen 128
> >         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
> > ne3: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> >         lladdr 52:54:00:12:03:01
> >         description: transit
> >         media: Ethernet 10baseT full-duplex
> >         inet6 fe80::5054:ff:fe12:301%ne3 prefixlen 64 scopeid 0x1
> >         inet 192.168.100.6 netmask 0xfffffffc broadcast 192.168.100.7
> > ne4: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> >         lladdr 52:54:00:12:03:02
> >         description: router-02 primary path
> >         media: Ethernet 10baseT full-duplex
> >         inet6 fe80::5054:ff:fe12:302%ne4 prefixlen 64 scopeid 0x2
> >         inet 172.17.1.2 netmask 0xfffffffc broadcast 172.17.1.3
> > ne5: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> >         lladdr 52:54:00:12:03:03
> >         description: router-02 standby path
> >         media: Ethernet 10baseT full-duplex
> >         inet6 fe80::5054:ff:fe12:303%ne5 prefixlen 64 scopeid 0x3
> >         inet 172.17.1.6 netmask 0xfffffffc broadcast 172.17.1.7
> > enc0: flags=0<> mtu 1536
> > lo1: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33208
> >         groups: lo
> >         inet 172.17.0.2 netmask 0xffffffff
> > router-02#
> >
>
> Same behaviour in the latest snapshot.
>
> This black-hole happens when router-01, instead of sending a WITHDRAW to
> router-02 on the primary peering, immediately sends an UPDATE for the same
> prefix to router-02. Since the new path in the UPDATE is via the backup
> peering to router-02, router-02 will drop this update because the
> ORIGINATOR_ID in the prefix is itself.
>
> router-02 has now missed the fact that router-01 changed path for the
> prefix, and a black-hole is in place. This does not happen every time;
> sometimes router-01 withdraws on both the primary and standby peering and
> the network converges with connectivity intact.
>
> Time to read some RFCs.
>
I'm too tired for this... A simplified, and more correct, description:

In a route-reflection loop a router may miss a topology change when a
prefix is replaced with a single UPDATE instead of a WITHDRAW followed by
an UPDATE, and that new UPDATE carries an ORIGINATOR_ID or CLUSTER_LIST
that makes the receiver drop it. The receiver then keeps the old path,
which the sender no longer has.

/Tony
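
To make the sequence above easier to follow, here is a minimal Python
sketch of it. It is a toy model, not bgpd code: the AdjRibIn class, the
loop_means_withdraw switch and the simulate() helper are all made up for
illustration, and using the router-id as cluster-id is an assumption. The
addresses are the ones from the lab. The second mode (letting a looped
UPDATE remove the path it would have replaced) is only one possible way to
handle it, not a claim about what bgpd does or should do.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Path:
    """The few path attributes that matter for this scenario."""
    nexthop: str
    localpref: int
    originator_id: Optional[str] = None
    cluster_list: Tuple[str, ...] = ()


class AdjRibIn:
    """Toy table of paths learned from ONE ibgp peer (not bgpd's real structure)."""

    def __init__(self, router_id: str, cluster_id: str,
                 loop_means_withdraw: bool) -> None:
        self.router_id = router_id
        self.cluster_id = cluster_id
        # Hypothetical switch: False = ignore a looped UPDATE (the behaviour
        # described in the mail), True = let it remove the path it replaces.
        self.loop_means_withdraw = loop_means_withdraw
        self.paths: Dict[str, Path] = {}

    def is_looped(self, path: Path) -> bool:
        """Route-reflection loop check on ORIGINATOR_ID and CLUSTER_LIST."""
        return (path.originator_id == self.router_id
                or self.cluster_id in path.cluster_list)

    def update(self, prefix: str, path: Path) -> None:
        if self.is_looped(path):
            if self.loop_means_withdraw:
                self.paths.pop(prefix, None)
            return  # the looped path itself is never installed
        self.paths[prefix] = path  # normal case: new path replaces the old one

    def withdraw(self, prefix: str) -> None:
        self.paths.pop(prefix, None)


def simulate(loop_means_withdraw: bool) -> Optional[Path]:
    """Return what router-02 holds from router-01 (primary) after the transit loss."""
    prefix = "26.0.128.0/17"
    # router-02's loopback; reusing it as cluster-id is an assumption.
    rib = AdjRibIn("172.17.0.2", "172.17.0.2", loop_means_withdraw)

    # Before the failure: router-01 advertises its transit path, localpref 60.
    rib.update(prefix, Path(nexthop="172.17.1.1", localpref=60))

    # Transit on router-01 goes down. router-01 now selects the path it
    # learned from router-02 and reflects it back in a plain UPDATE, with no
    # WITHDRAW first. The cluster-list value is assumed for illustration.
    rib.update(prefix, Path(nexthop="172.17.1.1", localpref=50,
                            originator_id="172.17.0.2",
                            cluster_list=("172.17.0.1",)))
    return rib.paths.get(prefix)


if __name__ == "__main__":
    print("ignore looped UPDATE:   ", simulate(False))
    print("looped UPDATE withdraws:", simulate(True))
```

With loop_means_withdraw set to False the localpref 60 path towards
router-01 stays in the table, which is the black-hole seen in the lab. With
it set to True the entry is gone and router-02 would fall back to its own
transit path.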