On 11/4/07, Tony Sarendal <[EMAIL PROTECTED]> wrote:
>
> On 11/4/07, Tony Sarendal <[EMAIL PROTECTED]> wrote:
> >
> > On 11/4/07, Tony Sarendal <[EMAIL PROTECTED]> wrote:
> > >
> > > bgpd does not re-route correctly when I shut down a transit when I
> > > use a bgp-only design, causing black-holes for some prefixes.
> > >
> > > router-01 and router-02 are in the same AS and peer with the same
> > > transit provider.
> > > router-01 and router-02 have two ibgp peerings, primary and standby
> > > path.
> > > router-01 sets localpref 60 on all transit prefixes, router-02 sets
> > > local-pref 50.
> > > When I take down the transit on router-01 I see this on router-02:
> > >
> > > router-02# bgpctl show rib | head -n 10
> > > flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> > > origin: i = IGP, e = EGP, ? = Incomplete
> > >
> > > flags destination         gateway          lpref   med aspath origin
> > > I*>   26.0.128.0/17       172.17.1.1          60 11100 65100 i
> > > *     26.0.128.0/17       192.168.100.5       50 10100 65100 i
> > > I*>   26.0.144.0/22       172.17.1.1          60 11100 65100 i
> > > *     26.0.144.0/22       192.168.100.5       50 10100 65100 i
> > > I*>   26.1.77.0/24        172.17.1.1          60 11100 65100 i
> > > *     26.1.77.0/24        192.168.100.5       50 10100 65100 i
> > > router-02#
> > >
> > > prefixes with local-pref 60 pointing at router-01.
> > > router-01 does not have its transit peering up, and thus itself has
> > > no prefixes with local-pref 60.
> > >
> > > router-01# bgpctl show rib | head -n 10
> > > flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> > > origin: i = IGP, e = EGP, ? = Incomplete
> > >
> > > flags destination         gateway          lpref   med aspath origin
> > > I*>   26.0.128.0/17       172.17.1.6          50 21100 65100 i
> > > I*>   26.0.144.0/22       172.17.1.6          50 21100 65100 i
> > > I*>   26.1.77.0/24        172.17.1.6          50 21100 65100 i
> > > I*>   26.2.172.0/22       172.17.1.6          50 21100 65100 i
> > > I*>   26.3.241.0/24       172.17.1.6          50 21100 65100 i
> > > I*>   26.6.126.0/24       172.17.1.6          50 21100 65100 i
> > > router-01# bgpctl show rib 26.0.128.0/17 all
> > > flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> > > origin: i = IGP, e = EGP, ? = Incomplete
> > >
> > > flags destination         gateway          lpref   med aspath origin
> > > I*>   26.0.128.0/17       172.17.1.6          50 21100 65100 i
> > > I*>   26.0.144.0/22       172.17.1.6          50 21100 65100 i
> > > router-01#
> > >
> > > I saw this before when I tested bgpd around a year ago. So it isn't a
> > > new bug.
> > > This is with 4.2-RELEASE, no patches.
> > >
> > > This info is from a lab I set up to replicate a live environment.
> > >
> > > /Tony
> > >
> > > router-01# cat /etc/bgpd.conf
> > > # $OpenBSD: bgpd.conf,v 1.8 2007/03/29 13:37:35 claudio Exp $
> > > # sample bgpd configuration file
> > > # see bgpd.conf(5)
> > >
> > > #macros
> > > loopback="172.17.0.1"
> > >
> > > # global configuration
> > > AS 65200
> > > router-id $loopback
> > >
> > > network $loopback/32 set {localpref 120, med 10}
> > > network 172.17.0.0/16 set {localpref 120, med 10}
> > > network connected set {localpref 120, med 10}
> > > network static set {localpref 120, med 10}
> > >
> > > group "TRANSIT" {
> > >         remote-as 65100
> > >         announce all
> > >         set nexthop self
> > >         set med 10100
> > >         set localpref 60
> > >         neighbor 192.168.100.1 {
> > >                 descr "TRANSIT"
> > >         }
> > > }
> > >
> > > group "IBGP" {
> > >         remote-as 65200
> > >         route-reflector
> > >         set nexthop self
> > >         set med +1000
> > >         neighbor 172.17.1.2 {
> > >                 local-address 172.17.1.1
> > >                 descr "router-02 primary"
> > >         }
> > >         neighbor 172.17.1.6 {
> > >                 local-address 172.17.1.5
> > >                 descr "router-02 standby"
> > >                 set med +10000
> > >         }
> > > }
> > >
> > > # filter
> > > deny from any
> > > deny to any
> > >
> > > allow quick to group "IBGP"
> > > allow quick from group "IBGP"
> > >
> > > allow quick to group "TRANSIT" prefix 172.17.0.0/16
> > > allow quick from group "TRANSIT"
> > >
> > > router-01# ifconfig
> > > lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33208
> > >         groups: lo
> > >         inet 127.0.0.1 netmask 0xff000000
> > >         inet6 ::1 prefixlen 128
> > >         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
> > > ne3: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> > >         lladdr 52:54:00:12:02:01
> > >         description: transit
> > >         media: Ethernet 10baseT full-duplex
> > >         inet6 fe80::5054:ff:fe12:201%ne3 prefixlen 64 scopeid 0x1
> > >         inet 192.168.100.2 netmask 0xfffffffc broadcast 192.168.100.3
> > > ne4: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> > >         lladdr 52:54:00:12:02:02
> > >         description: router-01 primary path
> > >         media: Ethernet 10baseT full-duplex
> > >         inet6 fe80::5054:ff:fe12:202%ne4 prefixlen 64 scopeid 0x2
> > >         inet 172.17.1.1 netmask 0xfffffffc broadcast 172.17.1.3
> > > ne5: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> > >         lladdr 52:54:00:12:02:03
> > >         description: route-02 standby path
> > >         media: Ethernet 10baseT full-duplex
> > >         inet6 fe80::5054:ff:fe12:203%ne5 prefixlen 64 scopeid 0x3
> > >         inet 172.17.1.5 netmask 0xfffffffc broadcast 172.17.1.7
> > > enc0: flags=0<> mtu 1536
> > > lo1: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33208
> > >         description: ROUTING LOOPBACK
> > >         groups: lo
> > >         inet 172.17.0.1 netmask 0xffffffff
> > > router-01#
> > >
> > > router-02# cat /etc/bgpd.conf
> > > # $OpenBSD: bgpd.conf,v 1.8 2007/03/29 13:37:35 claudio Exp $
> > > # sample bgpd configuration file
> > > # see bgpd.conf(5)
> > >
> > > #macros
> > > loopback="172.17.0.2"
> > >
> > > # global configuration
> > > AS 65200
> > > router-id $loopback
> > >
> > > network $loopback/32 set {localpref 120, med 10}
> > > network 172.17.0.0/16 set {localpref 120, med 10}
> > > network connected set {localpref 120, med 10}
> > > network static set {localpref 120, med 10}
> > >
> > > group "TRANSIT" {
> > >         remote-as 65100
> > >         announce all
> > >         set nexthop self
> > >         set med 10100
> > >         set localpref 50
> > >         neighbor 192.168.100.5 {
> > >                 descr "TRANSIT"
> > >         }
> > > }
> > >
> > > group "IBGP" {
> > >         remote-as 65200
> > >         route-reflector
> > >         set nexthop self
> > >         set med +1000
> > >         neighbor 172.17.1.1 {
> > >                 local-address 172.17.1.2
> > >                 descr "router-01 primary"
> > >         }
> > >         neighbor 172.17.1.5 {
> > >                 local-address 172.17.1.6
> > >                 descr "router-01 standby"
> > >                 set med +10000
> > >         }
> > > }
> > >
> > > # filter
> > > deny from any
> > > deny to any
> > >
> > > allow quick to group "IBGP"
> > > allow quick from group "IBGP"
> > >
> > > allow quick to group "TRANSIT" prefix 172.17.0.0/16
> > > allow quick from group "TRANSIT"
> > >
> > > router-02# ifconfig
> > > lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33208
> > >         groups: lo
> > >         inet 127.0.0.1 netmask 0xff000000
> > >         inet6 ::1 prefixlen 128
> > >         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
> > > ne3: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> > >         lladdr 52:54:00:12:03:01
> > >         description: transit
> > >         media: Ethernet 10baseT full-duplex
> > >         inet6 fe80::5054:ff:fe12:301%ne3 prefixlen 64 scopeid 0x1
> > >         inet 192.168.100.6 netmask 0xfffffffc broadcast 192.168.100.7
> > > ne4: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> > >         lladdr 52:54:00:12:03:02
> > >         description: router-02 primary path
> > >         media: Ethernet 10baseT full-duplex
> > >         inet6 fe80::5054:ff:fe12:302%ne4 prefixlen 64 scopeid 0x2
> > >         inet 172.17.1.2 netmask 0xfffffffc broadcast 172.17.1.3
> > > ne5: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> > >         lladdr 52:54:00:12:03:03
> > >         description: router-02 standby path
> > >         media: Ethernet 10baseT full-duplex
> > >         inet6 fe80::5054:ff:fe12:303%ne5 prefixlen 64 scopeid 0x3
> > >         inet 172.17.1.6 netmask 0xfffffffc broadcast 172.17.1.7
> > > enc0: flags=0<> mtu 1536
> > > lo1: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33208
> > >         groups: lo
> > >         inet 172.17.0.2 netmask 0xffffffff
> > > router-02#
> >
> > Same behaviour in the latest snapshot.
> >
> > This black-hole happens when router-01 instead of sending a WITHDRAW to
> > router-02 on the primary peering immediately sends an UPDATE for the same
> > prefix to router-02.
> > Since the new path in the UPDATE is via the backup peering to router-02
> > router-02 will drop this update because the ORIGINATOR_ID in the prefix
> > is itself.
> >
> > router-02 now missed the fact that router-01 changed path for the prefix
> > and a black-hole is in place. This does not happen every time, sometimes
> > router-01 withdraws on both primary and standby peering and the network
> > converges with connectivity intact.
> >
> > Time to read some RFCs.
>
> I'm too tired for this...
>
> Simplified, and more correct. In a route-reflecting loop a router may miss
> a topology change because a prefix is replaced with UPDATE instead of
> WITHDRAW+UPDATE at the same time as the new update has info
> in ORIGINATOR_ID or CLUSTER_LIST that makes the receiver drop the UPDATE.
>
> /Tony
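The quoted failure mode (an UPDATE that replaces a path but is dropped because ORIGINATOR_ID is the receiver itself) can be sketched as a toy model. This is not bgpd code. The names `Route`, `Receiver`, `treat_loop_as_withdraw` and `simulate` are made up for illustration, best-path selection is reduced to highest local-pref, and the second policy is only an assumption about how a receiver could avoid keeping the stale path.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    nexthop: str
    localpref: int
    originator_id: Optional[str] = None  # None for routes learned via eBGP


@dataclass
class Receiver:
    router_id: str
    treat_loop_as_withdraw: bool  # hypothetical knob, not a bgpd option
    # Adj-RIB-In keyed by (peer name, prefix)
    rib_in: Dict[Tuple[str, str], Route] = field(default_factory=dict)

    def update(self, peer: str, route: Route) -> None:
        if route.originator_id == self.router_id:
            # The UPDATE carries our own router-id: never install it.
            if self.treat_loop_as_withdraw:
                # Assumed remedy: also forget what this peer sent before.
                self.rib_in.pop((peer, route.prefix), None)
            return
        self.rib_in[(peer, route.prefix)] = route

    def best(self, prefix: str) -> Optional[Route]:
        candidates = [r for (_, p), r in self.rib_in.items() if p == prefix]
        return max(candidates, key=lambda r: r.localpref, default=None)


def simulate(treat_loop_as_withdraw: bool) -> str:
    """Replay the lab sequence on router-02 and return the selected nexthop."""
    prefix = "26.0.128.0/17"
    r2 = Receiver("172.17.0.2", treat_loop_as_withdraw)
    # router-02's own transit path, local-pref 50
    r2.update("transit", Route(prefix, "192.168.100.5", 50))
    # path via router-01 on the primary peering, local-pref 60
    r2.update("router-01 primary", Route(prefix, "172.17.1.1", 60, "172.17.0.1"))
    # router-01 loses its transit and, instead of a WITHDRAW, sends an UPDATE
    # for the path it learned from router-02 (ORIGINATOR_ID = router-02)
    r2.update("router-01 primary", Route(prefix, "172.17.1.6", 50, "172.17.0.2"))
    return r2.best(prefix).nexthop


print(simulate(False))  # 172.17.1.1: stale path via router-01 is kept
print(simulate(True))   # 192.168.100.5: falls back to router-02's own transit
```

With the UPDATE silently dropped, router-02 keeps pointing at router-01 while router-01 points back at router-02, which matches the `bgpctl show rib` output quoted above.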
Right, last one for the night.

I modified the setup to make it look more familiar.
Three AS all peer with each other, all provide transit to each other.
AS65100, AS65200 and AS65300.
AS65200 and AS65300 have two links between them, one primary and one
backup where prefixes are prepended incoming.

In this setup I also see this behaviour.
This is what happens when I shut down all peerings with AS65100.

router-01# bgpctl -n show
Neighbor                   AS    MsgRcvd    MsgSent  OutQ Up/Down  State/PrfRcvd
172.17.1.6              65300         13         13     0 00:03:59      5
172.17.1.2              65300         12         14     0 00:03:59      6
192.168.100.1           65100          7          7     0 00:02:42 Idle
router-01# bgpctl show rib 192.168.0.0/16
flags: * = Valid, > = Selected, I = via IBGP, A = Announced
origin: i = IGP, e = EGP, ? = Incomplete

flags destination         gateway          lpref   med aspath origin
*>    192.168.0.0/16      172.17.1.2         100     0 65300 65100 i
router-01#

router-02# bgpctl -n show
Neighbor                   AS    MsgRcvd    MsgSent  OutQ Up/Down  State/PrfRcvd
172.17.1.5              65200        319        316     0 00:05:00      6
172.17.1.1              65200        320        316     0 00:05:00      5
192.168.100.5           65100        311        304     0 00:02:12 Idle
router-02# bgpctl show rib 192.168.0.0/16
flags: * = Valid, > = Selected, I = via IBGP, A = Announced
origin: i = IGP, e = EGP, ? = Incomplete

flags destination         gateway          lpref   med aspath origin
*>    192.168.0.0/16      172.17.1.5         100     0 65200 65200 65200 65200 65100 i
router-02#

Both AS65200 and AS65300 carry the AS65100 prefix, and they point at
each other.

With a local-pref in AS65300 to prefer AS65200 I again get a black-hole
if I kill the AS65100-AS65200 peering.

So far I actually can't see bgpd breaking any RFC behaviour, although
the behaviour is interesting.
/Tony

router-01# cat /etc/bgpd.conf
# $OpenBSD: bgpd.conf,v 1.8 2007/03/29 13:37:35 claudio Exp $
# sample bgpd configuration file
# see bgpd.conf(5)

#macros
loopback="172.17.0.1"

# global configuration
AS 65200
router-id $loopback

network $loopback/32 set {localpref 120, med 10}
network 172.17.0.0/16 set {localpref 120, med 10}
network connected set {localpref 120, med 10}
network static set {localpref 120, med 10}

group "EBGP-PEER" {
        announce all
        neighbor 192.168.100.1 {
                remote-as 65100
                descr "AS65100"
        }
        neighbor 172.17.1.2 {
                remote-as 65300
                descr "AS65200-PRI"
        }
        neighbor 172.17.1.6 {
                remote-as 65300
                descr "AS65200-SB"
                set prepend-neighbor 3
        }
}

allow from any
allow to any
router-01#

router-02# cat /etc/bgpd.conf
# $OpenBSD: bgpd.conf,v 1.8 2007/03/29 13:37:35 claudio Exp $
# sample bgpd configuration file
# see bgpd.conf(5)

#macros
loopback="172.17.0.2"

# global configuration
AS 65300
router-id $loopback

network $loopback/32 set {localpref 120, med 10}
network 172.17.0.0/16 set {localpref 120, med 10}
network connected set {localpref 120, med 10}
network static set {localpref 120, med 10}

group "EBGP-PEER" {
        announce all
        neighbor 192.168.100.5 {
                remote-as 65100
                descr "AS65100"
        }
        neighbor 172.17.1.1 {
                remote-as 65200
                descr "AS65200-PRI"
        }
        neighbor 172.17.1.5 {
                remote-as 65200
                descr "AS65200-SB"
                set prepend-neighbor 3
        }
}

allow from any
allow to any
router-02#
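
For the three-AS setup, the AS paths seen on router-02 can be reproduced with a few lines. This is a sketch only: `prepend_neighbor` and `has_as_loop` are hypothetical helper names, not bgpd functions, and the "replacement" path is my assumption of what router-01 (AS65200) would advertise once it has re-routed via AS65300.

```python
from typing import List


def prepend_neighbor(as_path: List[int], neighbor_as: int, count: int) -> List[int]:
    """Model of 'set prepend-neighbor N': prepend the neighbor's AS N times."""
    return [neighbor_as] * count + list(as_path)


def has_as_loop(as_path: List[int], local_as: int) -> bool:
    """A path containing the receiver's own AS is not usable by the receiver."""
    return local_as in as_path


ROUTER_02_AS = 65300

# What router-02 stored for 192.168.0.0/16 on the standby peering while
# AS65200 still had its own link to AS65100.
stored = prepend_neighbor([65200, 65100], neighbor_as=65200, count=3)

# Assumed replacement after AS65200 loses AS65100 and re-routes via AS65300.
replacement = [65200, 65300, 65100]

print(stored)                                  # [65200, 65200, 65200, 65200, 65100]
print(has_as_loop(stored, ROUTER_02_AS))       # False: old path is accepted
print(has_as_loop(replacement, ROUTER_02_AS))  # True: new path is rejected
```

If the rejected replacement does not also remove the stored path, router-02 keeps the five-hop path via 172.17.1.5 shown in the output above.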