Re: bgpd acting up, dropping connected/static network statements
On 24/05/2019 12.25, open...@kene.nu wrote: Hello, I finally got to testing this and the bug seems to be fixed. What is the recommended way of implementing this fix into a critical production environment? Should we wait for a syspatch (will one be made available for this bug)? It is possible to deploy it via hacks in automation tools but it would be nice to know if there was an "official" way of doing it or plans for one. On Linux distros you can have additional repos that override the official packages if the ones in there are newer and trusted. Is it possible to have multiple repos like that? Also not sure how it would work to update base since packages in there are not really distributed via the normal repo. Since everything is signed I don't see how we could do it in a nice way. Unless there is support to add additional trusted keys. /T
Re: bgpd acting up, dropping connected/static network statements
Hello, I finally got to testing this and the bug seems to be fixed. What is the recommended way of implementing this fix into a critical production environment? Should we wait for a syspatch (will one be made available for this bug)? On Fri, May 3, 2019 at 12:12 PM Claudio Jeker wrote: > > On Fri, May 03, 2019 at 11:52:07AM +0200, open...@kene.nu wrote: > > Much appreciated, will test. Did this also affect previous versions > > (specifically thinking about 6.3 and 6.4)? > > No. This code was changed after 6.4 > > > On Fri, May 3, 2019 at 11:43 AM Claudio Jeker > > wrote: > > > > > > On Fri, May 03, 2019 at 09:59:40AM +0200, open...@kene.nu wrote: > > > > Hello, > > > > > > > > I am seeing strange behaviour of bgpd in 6.5. > > > > > > > > Not sure what causes the networks in bgpd to disappear but they do > > > > disappear and performing a netstart pick the network back up again in > > > > bgpd. I cannot see this in either 6.4 or 6.3. One triggering factor > > > > seems to be restarting the bgpd process. > > > > > > > > Excerpt form the daemon logs (bgpd restart or reload): > > > > May 3 07:44:25 host bgpd[94972]: Rib Loc-RIB: neighbor 172.30.198.4 > > > > (LOCAL) AS64712: announce 10.1.150.0/24 > > > > May 3 07:44:25 host bgpd[94972]: Rib Loc-RIB: neighbor 172.30.198.4 > > > > (LOCAL) AS64712: withdraw announce 10.1.150.0/24 > > > > > > > > If one performs a netstart, of relevant vlan interfaces, the > > > > announcements seem to survive a bgpd reload. Static routes never > > > > survive a restart or reload. > > > > > > > > Some additional commands to show behaviour: > > > > # uname -a > > > > OpenBSD host 6.5 GENERIC.MP#3 amd64 > > > > # ifconfig vlan190 > > > > vlan190: flags=8943 mtu > > > > 1500 > > > > lladdr > > > > index 33 priority 0 llprio 3 > > > > encap: vnetid 190 parent em0 txprio packet > > > > groups: vlan > > > > media: Ethernet autoselect (1000baseT full-duplex,master) > > > > status: active > > > > inet 10.1.150.2 netmask 0xff00 broadcast 10.1.150.255 > > > > # grep connected /etc/bgpd.conf > > > > network inet connected set community 65000:64712 > > > > # bgpctl sh ip bgp 10.1.150.0/24 > > > > flags: * = Valid, > = Selected, I = via IBGP, A = Announced, > > > >S = Stale, E = Error > > > > origin validation state: N = not-found, V = valid, ! = invalid > > > > origin: i = IGP, e = EGP, ? = Incomplete > > > > > > > > flags ovs destination gateway lpref med aspath > > > > origin > > > > # sh /etc/netstart vlan150 > > > > # bgpctl sh ip bgp 10.1.150.0/24 > > > > flags: * = Valid, > = Selected, I = via IBGP, A = Announced, > > > >S = Stale, E = Error > > > > origin validation state: N = not-found, V = valid, ! = invalid > > > > origin: i = IGP, e = EGP, ? = Incomplete > > > > > > > > flags ovs destination gateway lpref med aspath > > > > origin > > > > AI*>N 10.1.150.0/240.0.0.0100 0 i > > > > > > > > > > > > My bgpd.conf: > > > > # GLOBALS > > > > AS 64712 > > > > router-id 172.30.198.4 > > > > holdtime 9 > > > > log updates > > > > > > > > prefix-set internal { 10.0.0.0/8 prefixlen >= 16, 10.60.0.0/15, > > > > 172.20.0.0/16 prefixlen <= 32, 172.29.0.0/16 prefixlen >= 24, > > > > 172.29.248.10/31 prefixlen = 32, 172.30.0.0/16 prefixlen >= 24 } > > > > > > > > # DEFAULT FILTERING > > > > deny from any > > > > deny to any > > > > > > > > # NETWORK STATEMENTS > > > > network 172.30.198.4/32 set community 65000:64712 > > > > network inet connected set community 65000:64712 > > > > network inet static set community 65000:64712 > > > > > > > > # NEIGHBORS > > > > group "vpn" { > > > > announce IPv6 none > > > > route-reflector > > > > remote-as 64712 > > > > > > > > neighbor 10.1.230.9 { > > > > descr "vpn1" > > > > } > > > > neighbor 10.1.230.10 { > > > > descr "vpn2" > > > > } > > > > } > > > > > > > > # SOURCE FILTERING > > > > allow to group "vpn" prefix-set internal community 65000:64712 > > > > # DEST FILTERING > > > > allow from group "vpn" prefix-set internal > > > > # TRAFFIC ENGINEERING > > > > match to group "vpn" set nexthop 10.1.230.12 > > > > match to any prefix 172.30.198.4/32 set nexthop self > > > > > > > > > > Thanks for the detailed report. I quick workaround is to reload the config > > > twice. Then the networks are added again. The proper fix is attached. > > > > > > The problem was that when already present networks were readded the > > > function kr_net_redist_add() returned 0 which was propegated to > > > kr_net_match() which then caused kr_redistribute() to actually remove the > > > prefix. > > > > > > I changed the code to only return 0 when there is actually the case that > > > the network being added is shadowed by another one and therefor this > > > prefix should be removed. While there I also fixed a memory leak ;) > > > > > > Please test. > > > -- > > > :wq Claudio > > > > > > Index: kroute.c > > >
Re: bgpd acting up, dropping connected/static network statements
On Fri, May 03, 2019 at 11:52:07AM +0200, open...@kene.nu wrote: > Much appreciated, will test. Did this also affect previous versions > (specifically thinking about 6.3 and 6.4)? No. This code was changed after 6.4 > On Fri, May 3, 2019 at 11:43 AM Claudio Jeker > wrote: > > > > On Fri, May 03, 2019 at 09:59:40AM +0200, open...@kene.nu wrote: > > > Hello, > > > > > > I am seeing strange behaviour of bgpd in 6.5. > > > > > > Not sure what causes the networks in bgpd to disappear but they do > > > disappear and performing a netstart pick the network back up again in > > > bgpd. I cannot see this in either 6.4 or 6.3. One triggering factor > > > seems to be restarting the bgpd process. > > > > > > Excerpt form the daemon logs (bgpd restart or reload): > > > May 3 07:44:25 host bgpd[94972]: Rib Loc-RIB: neighbor 172.30.198.4 > > > (LOCAL) AS64712: announce 10.1.150.0/24 > > > May 3 07:44:25 host bgpd[94972]: Rib Loc-RIB: neighbor 172.30.198.4 > > > (LOCAL) AS64712: withdraw announce 10.1.150.0/24 > > > > > > If one performs a netstart, of relevant vlan interfaces, the > > > announcements seem to survive a bgpd reload. Static routes never > > > survive a restart or reload. > > > > > > Some additional commands to show behaviour: > > > # uname -a > > > OpenBSD host 6.5 GENERIC.MP#3 amd64 > > > # ifconfig vlan190 > > > vlan190: flags=8943 mtu > > > 1500 > > > lladdr > > > index 33 priority 0 llprio 3 > > > encap: vnetid 190 parent em0 txprio packet > > > groups: vlan > > > media: Ethernet autoselect (1000baseT full-duplex,master) > > > status: active > > > inet 10.1.150.2 netmask 0xff00 broadcast 10.1.150.255 > > > # grep connected /etc/bgpd.conf > > > network inet connected set community 65000:64712 > > > # bgpctl sh ip bgp 10.1.150.0/24 > > > flags: * = Valid, > = Selected, I = via IBGP, A = Announced, > > >S = Stale, E = Error > > > origin validation state: N = not-found, V = valid, ! = invalid > > > origin: i = IGP, e = EGP, ? = Incomplete > > > > > > flags ovs destination gateway lpref med aspath origin > > > # sh /etc/netstart vlan150 > > > # bgpctl sh ip bgp 10.1.150.0/24 > > > flags: * = Valid, > = Selected, I = via IBGP, A = Announced, > > >S = Stale, E = Error > > > origin validation state: N = not-found, V = valid, ! = invalid > > > origin: i = IGP, e = EGP, ? = Incomplete > > > > > > flags ovs destination gateway lpref med aspath origin > > > AI*>N 10.1.150.0/240.0.0.0100 0 i > > > > > > > > > My bgpd.conf: > > > # GLOBALS > > > AS 64712 > > > router-id 172.30.198.4 > > > holdtime 9 > > > log updates > > > > > > prefix-set internal { 10.0.0.0/8 prefixlen >= 16, 10.60.0.0/15, > > > 172.20.0.0/16 prefixlen <= 32, 172.29.0.0/16 prefixlen >= 24, > > > 172.29.248.10/31 prefixlen = 32, 172.30.0.0/16 prefixlen >= 24 } > > > > > > # DEFAULT FILTERING > > > deny from any > > > deny to any > > > > > > # NETWORK STATEMENTS > > > network 172.30.198.4/32 set community 65000:64712 > > > network inet connected set community 65000:64712 > > > network inet static set community 65000:64712 > > > > > > # NEIGHBORS > > > group "vpn" { > > > announce IPv6 none > > > route-reflector > > > remote-as 64712 > > > > > > neighbor 10.1.230.9 { > > > descr "vpn1" > > > } > > > neighbor 10.1.230.10 { > > > descr "vpn2" > > > } > > > } > > > > > > # SOURCE FILTERING > > > allow to group "vpn" prefix-set internal community 65000:64712 > > > # DEST FILTERING > > > allow from group "vpn" prefix-set internal > > > # TRAFFIC ENGINEERING > > > match to group "vpn" set nexthop 10.1.230.12 > > > match to any prefix 172.30.198.4/32 set nexthop self > > > > > > > Thanks for the detailed report. I quick workaround is to reload the config > > twice. Then the networks are added again. The proper fix is attached. > > > > The problem was that when already present networks were readded the > > function kr_net_redist_add() returned 0 which was propegated to > > kr_net_match() which then caused kr_redistribute() to actually remove the > > prefix. > > > > I changed the code to only return 0 when there is actually the case that > > the network being added is shadowed by another one and therefor this > > prefix should be removed. While there I also fixed a memory leak ;) > > > > Please test. > > -- > > :wq Claudio > > > > Index: kroute.c > > === > > RCS file: /cvs/src/usr.sbin/bgpd/kroute.c,v > > retrieving revision 1.235 > > diff -u -p -r1.235 kroute.c > > --- kroute.c7 Mar 2019 07:42:36 - 1.235 > > +++ kroute.c3 May 2019 09:32:10 - > > @@ -1230,19 +1230,19 @@ kr_net_redist_add(struct ktable *kt, str > > > > xr = RB_INSERT(kredist_tree, &kt->kredist, r); > > if (xr != NULL) { > > - if (dynamic == xr->dynamic || dynamic) { > > + free(r); > > + > > +
Re: bgpd acting up, dropping connected/static network statements
Much appreciated, will test. Did this also affect previous versions (specifically thinking about 6.3 and 6.4)? On Fri, May 3, 2019 at 11:43 AM Claudio Jeker wrote: > > On Fri, May 03, 2019 at 09:59:40AM +0200, open...@kene.nu wrote: > > Hello, > > > > I am seeing strange behaviour of bgpd in 6.5. > > > > Not sure what causes the networks in bgpd to disappear but they do > > disappear and performing a netstart pick the network back up again in > > bgpd. I cannot see this in either 6.4 or 6.3. One triggering factor > > seems to be restarting the bgpd process. > > > > Excerpt form the daemon logs (bgpd restart or reload): > > May 3 07:44:25 host bgpd[94972]: Rib Loc-RIB: neighbor 172.30.198.4 > > (LOCAL) AS64712: announce 10.1.150.0/24 > > May 3 07:44:25 host bgpd[94972]: Rib Loc-RIB: neighbor 172.30.198.4 > > (LOCAL) AS64712: withdraw announce 10.1.150.0/24 > > > > If one performs a netstart, of relevant vlan interfaces, the > > announcements seem to survive a bgpd reload. Static routes never > > survive a restart or reload. > > > > Some additional commands to show behaviour: > > # uname -a > > OpenBSD host 6.5 GENERIC.MP#3 amd64 > > # ifconfig vlan190 > > vlan190: flags=8943 mtu 1500 > > lladdr > > index 33 priority 0 llprio 3 > > encap: vnetid 190 parent em0 txprio packet > > groups: vlan > > media: Ethernet autoselect (1000baseT full-duplex,master) > > status: active > > inet 10.1.150.2 netmask 0xff00 broadcast 10.1.150.255 > > # grep connected /etc/bgpd.conf > > network inet connected set community 65000:64712 > > # bgpctl sh ip bgp 10.1.150.0/24 > > flags: * = Valid, > = Selected, I = via IBGP, A = Announced, > >S = Stale, E = Error > > origin validation state: N = not-found, V = valid, ! = invalid > > origin: i = IGP, e = EGP, ? = Incomplete > > > > flags ovs destination gateway lpref med aspath origin > > # sh /etc/netstart vlan150 > > # bgpctl sh ip bgp 10.1.150.0/24 > > flags: * = Valid, > = Selected, I = via IBGP, A = Announced, > >S = Stale, E = Error > > origin validation state: N = not-found, V = valid, ! = invalid > > origin: i = IGP, e = EGP, ? = Incomplete > > > > flags ovs destination gateway lpref med aspath origin > > AI*>N 10.1.150.0/240.0.0.0100 0 i > > > > > > My bgpd.conf: > > # GLOBALS > > AS 64712 > > router-id 172.30.198.4 > > holdtime 9 > > log updates > > > > prefix-set internal { 10.0.0.0/8 prefixlen >= 16, 10.60.0.0/15, > > 172.20.0.0/16 prefixlen <= 32, 172.29.0.0/16 prefixlen >= 24, > > 172.29.248.10/31 prefixlen = 32, 172.30.0.0/16 prefixlen >= 24 } > > > > # DEFAULT FILTERING > > deny from any > > deny to any > > > > # NETWORK STATEMENTS > > network 172.30.198.4/32 set community 65000:64712 > > network inet connected set community 65000:64712 > > network inet static set community 65000:64712 > > > > # NEIGHBORS > > group "vpn" { > > announce IPv6 none > > route-reflector > > remote-as 64712 > > > > neighbor 10.1.230.9 { > > descr "vpn1" > > } > > neighbor 10.1.230.10 { > > descr "vpn2" > > } > > } > > > > # SOURCE FILTERING > > allow to group "vpn" prefix-set internal community 65000:64712 > > # DEST FILTERING > > allow from group "vpn" prefix-set internal > > # TRAFFIC ENGINEERING > > match to group "vpn" set nexthop 10.1.230.12 > > match to any prefix 172.30.198.4/32 set nexthop self > > > > Thanks for the detailed report. I quick workaround is to reload the config > twice. Then the networks are added again. The proper fix is attached. > > The problem was that when already present networks were readded the > function kr_net_redist_add() returned 0 which was propegated to > kr_net_match() which then caused kr_redistribute() to actually remove the > prefix. > > I changed the code to only return 0 when there is actually the case that > the network being added is shadowed by another one and therefor this > prefix should be removed. While there I also fixed a memory leak ;) > > Please test. > -- > :wq Claudio > > Index: kroute.c > === > RCS file: /cvs/src/usr.sbin/bgpd/kroute.c,v > retrieving revision 1.235 > diff -u -p -r1.235 kroute.c > --- kroute.c7 Mar 2019 07:42:36 - 1.235 > +++ kroute.c3 May 2019 09:32:10 - > @@ -1230,19 +1230,19 @@ kr_net_redist_add(struct ktable *kt, str > > xr = RB_INSERT(kredist_tree, &kt->kredist, r); > if (xr != NULL) { > - if (dynamic == xr->dynamic || dynamic) { > + free(r); > + > + if (dynamic != xr->dynamic && dynamic) { > /* > -* ignore update, equal announcement already present, > -* or a non-dynamic announcement is already present > -* which has preference. > +* ignore update a non-dynamic announcement is > +
Re: bgpd acting up, dropping connected/static network statements
On Fri, May 03, 2019 at 09:59:40AM +0200, open...@kene.nu wrote: > Hello, > > I am seeing strange behaviour of bgpd in 6.5. > > Not sure what causes the networks in bgpd to disappear but they do > disappear and performing a netstart pick the network back up again in > bgpd. I cannot see this in either 6.4 or 6.3. One triggering factor > seems to be restarting the bgpd process. > > Excerpt form the daemon logs (bgpd restart or reload): > May 3 07:44:25 host bgpd[94972]: Rib Loc-RIB: neighbor 172.30.198.4 > (LOCAL) AS64712: announce 10.1.150.0/24 > May 3 07:44:25 host bgpd[94972]: Rib Loc-RIB: neighbor 172.30.198.4 > (LOCAL) AS64712: withdraw announce 10.1.150.0/24 > > If one performs a netstart, of relevant vlan interfaces, the > announcements seem to survive a bgpd reload. Static routes never > survive a restart or reload. > > Some additional commands to show behaviour: > # uname -a > OpenBSD host 6.5 GENERIC.MP#3 amd64 > # ifconfig vlan190 > vlan190: flags=8943 mtu 1500 > lladdr > index 33 priority 0 llprio 3 > encap: vnetid 190 parent em0 txprio packet > groups: vlan > media: Ethernet autoselect (1000baseT full-duplex,master) > status: active > inet 10.1.150.2 netmask 0xff00 broadcast 10.1.150.255 > # grep connected /etc/bgpd.conf > network inet connected set community 65000:64712 > # bgpctl sh ip bgp 10.1.150.0/24 > flags: * = Valid, > = Selected, I = via IBGP, A = Announced, >S = Stale, E = Error > origin validation state: N = not-found, V = valid, ! = invalid > origin: i = IGP, e = EGP, ? = Incomplete > > flags ovs destination gateway lpref med aspath origin > # sh /etc/netstart vlan150 > # bgpctl sh ip bgp 10.1.150.0/24 > flags: * = Valid, > = Selected, I = via IBGP, A = Announced, >S = Stale, E = Error > origin validation state: N = not-found, V = valid, ! = invalid > origin: i = IGP, e = EGP, ? = Incomplete > > flags ovs destination gateway lpref med aspath origin > AI*>N 10.1.150.0/240.0.0.0100 0 i > > > My bgpd.conf: > # GLOBALS > AS 64712 > router-id 172.30.198.4 > holdtime 9 > log updates > > prefix-set internal { 10.0.0.0/8 prefixlen >= 16, 10.60.0.0/15, > 172.20.0.0/16 prefixlen <= 32, 172.29.0.0/16 prefixlen >= 24, > 172.29.248.10/31 prefixlen = 32, 172.30.0.0/16 prefixlen >= 24 } > > # DEFAULT FILTERING > deny from any > deny to any > > # NETWORK STATEMENTS > network 172.30.198.4/32 set community 65000:64712 > network inet connected set community 65000:64712 > network inet static set community 65000:64712 > > # NEIGHBORS > group "vpn" { > announce IPv6 none > route-reflector > remote-as 64712 > > neighbor 10.1.230.9 { > descr "vpn1" > } > neighbor 10.1.230.10 { > descr "vpn2" > } > } > > # SOURCE FILTERING > allow to group "vpn" prefix-set internal community 65000:64712 > # DEST FILTERING > allow from group "vpn" prefix-set internal > # TRAFFIC ENGINEERING > match to group "vpn" set nexthop 10.1.230.12 > match to any prefix 172.30.198.4/32 set nexthop self > Thanks for the detailed report. I quick workaround is to reload the config twice. Then the networks are added again. The proper fix is attached. The problem was that when already present networks were readded the function kr_net_redist_add() returned 0 which was propegated to kr_net_match() which then caused kr_redistribute() to actually remove the prefix. I changed the code to only return 0 when there is actually the case that the network being added is shadowed by another one and therefor this prefix should be removed. While there I also fixed a memory leak ;) Please test. -- :wq Claudio Index: kroute.c === RCS file: /cvs/src/usr.sbin/bgpd/kroute.c,v retrieving revision 1.235 diff -u -p -r1.235 kroute.c --- kroute.c7 Mar 2019 07:42:36 - 1.235 +++ kroute.c3 May 2019 09:32:10 - @@ -1230,19 +1230,19 @@ kr_net_redist_add(struct ktable *kt, str xr = RB_INSERT(kredist_tree, &kt->kredist, r); if (xr != NULL) { - if (dynamic == xr->dynamic || dynamic) { + free(r); + + if (dynamic != xr->dynamic && dynamic) { /* -* ignore update, equal announcement already present, -* or a non-dynamic announcement is already present -* which has preference. +* ignore update a non-dynamic announcement is +* already present which has preference. */ - free(r); return 0; } /* -* only the case where xr->dynamic == 1 and dynamic == 0 -* ends up here and in this case non-dynamic announcments -* are preferred. Override dynamic flag. +* on
bgpd acting up, dropping connected/static network statements
Hello, I am seeing strange behaviour of bgpd in 6.5. Not sure what causes the networks in bgpd to disappear but they do disappear and performing a netstart pick the network back up again in bgpd. I cannot see this in either 6.4 or 6.3. One triggering factor seems to be restarting the bgpd process. Excerpt form the daemon logs (bgpd restart or reload): May 3 07:44:25 host bgpd[94972]: Rib Loc-RIB: neighbor 172.30.198.4 (LOCAL) AS64712: announce 10.1.150.0/24 May 3 07:44:25 host bgpd[94972]: Rib Loc-RIB: neighbor 172.30.198.4 (LOCAL) AS64712: withdraw announce 10.1.150.0/24 If one performs a netstart, of relevant vlan interfaces, the announcements seem to survive a bgpd reload. Static routes never survive a restart or reload. Some additional commands to show behaviour: # uname -a OpenBSD host 6.5 GENERIC.MP#3 amd64 # ifconfig vlan190 vlan190: flags=8943 mtu 1500 lladdr index 33 priority 0 llprio 3 encap: vnetid 190 parent em0 txprio packet groups: vlan media: Ethernet autoselect (1000baseT full-duplex,master) status: active inet 10.1.150.2 netmask 0xff00 broadcast 10.1.150.255 # grep connected /etc/bgpd.conf network inet connected set community 65000:64712 # bgpctl sh ip bgp 10.1.150.0/24 flags: * = Valid, > = Selected, I = via IBGP, A = Announced, S = Stale, E = Error origin validation state: N = not-found, V = valid, ! = invalid origin: i = IGP, e = EGP, ? = Incomplete flags ovs destination gateway lpref med aspath origin # sh /etc/netstart vlan150 # bgpctl sh ip bgp 10.1.150.0/24 flags: * = Valid, > = Selected, I = via IBGP, A = Announced, S = Stale, E = Error origin validation state: N = not-found, V = valid, ! = invalid origin: i = IGP, e = EGP, ? = Incomplete flags ovs destination gateway lpref med aspath origin AI*>N 10.1.150.0/240.0.0.0100 0 i My bgpd.conf: # GLOBALS AS 64712 router-id 172.30.198.4 holdtime 9 log updates prefix-set internal { 10.0.0.0/8 prefixlen >= 16, 10.60.0.0/15, 172.20.0.0/16 prefixlen <= 32, 172.29.0.0/16 prefixlen >= 24, 172.29.248.10/31 prefixlen = 32, 172.30.0.0/16 prefixlen >= 24 } # DEFAULT FILTERING deny from any deny to any # NETWORK STATEMENTS network 172.30.198.4/32 set community 65000:64712 network inet connected set community 65000:64712 network inet static set community 65000:64712 # NEIGHBORS group "vpn" { announce IPv6 none route-reflector remote-as 64712 neighbor 10.1.230.9 { descr "vpn1" } neighbor 10.1.230.10 { descr "vpn2" } } # SOURCE FILTERING allow to group "vpn" prefix-set internal community 65000:64712 # DEST FILTERING allow from group "vpn" prefix-set internal # TRAFFIC ENGINEERING match to group "vpn" set nexthop 10.1.230.12 match to any prefix 172.30.198.4/32 set nexthop self