Re: bgpd causing black-holes with bgp-only setup

2007-11-09 Thread Insan Praja SW

On Mon, 05 Nov 2007 07:08:56 +0700, Claudio Jeker
[EMAIL PROTECTED] wrote:
Hi,
I'm currently setting up a redundant BGP router. In your presentation (maybe
around 2004-2006) you discouraged using carp for fail-over/load balancing
since it will lose the session. Since I'm using 4.2-current, I wonder
whether using a carp interface is already doable, i.e. it won't lose the
session, etc.?
Thanks,


Insan

> On Sun, Nov 04, 2007 at 11:30:20PM +, Tony Sarendal wrote:
>
>> On 11/4/07, Tony Sarendal [EMAIL PROTECTED] wrote:
>
> Thanks for all the info. I will have a look at this as well. Currently I
> think it is possible that route-reflector is not bug free in cases where
> you have route-reflector rings or other very complex setups. I only tested
> the easy setups till now. Why you get routing loops and black-holes in
> your 3 AS setups is not clear (at least for me) but I guess it may be an
> issue with a failed update. I have the feeling that when we get an update
> with a routing loop in it we should actually issue a withdraw for the
> prefix carried in it so the following code in rde.c is looking
> suspicious:
>     /* aspath needs to be loop free nota bene this is not a hard error */
>     if (peer->conf.ebgp && !aspath_loopfree(asp->aspath, conf->as)) {
>         error = 0;
>         goto done;
>     }
>
> I'm mostly offline in the next days so maybe you beat me in finding a fix
> for this.




--
Insan Praja SW



Re: bgpd causing black-holes with bgp-only setup

2007-11-09 Thread Henning Brauer
* Insan Praja SW [EMAIL PROTECTED] [2007-11-09 16:37]:
> On Mon, 05 Nov 2007 07:08:56 +0700, Claudio Jeker
> [EMAIL PROTECTED] wrote:
> Hi,
> I'm currently setting up a redundant BGP router. In your presentation (maybe
> around 2004-2006) you discouraged using carp for fail-over/load balancing
> since it will lose the session. Since I'm using 4.2-current, I wonder
> whether using a carp interface is already doable, i.e. it won't lose the
> session, etc.?

using carp interfaces for failover is perfectly fine, you just have to 
understand what it does and what not. sessions get lost and 
re-established of course.

-- 
Henning Brauer, [EMAIL PROTECTED], [EMAIL PROTECTED]
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting - Hamburg & Amsterdam



Re: bgpd causing black-holes with bgp-only setup

2007-11-04 Thread Tony Sarendal
On 11/4/07, Tony Sarendal [EMAIL PROTECTED] wrote:


> bgpd does not re-route correctly when I shut down a transit link in a
> bgp-only design, causing black-holes for some prefixes.
>
> router-01 and router-02 are in the same AS and peer with the same transit
> provider.
> router-01 and router-02 have two ibgp peerings, a primary and a standby path.
> router-01 sets localpref 60 on all transit prefixes, router-02 sets
> localpref 50.
> When I take down the transit on router-01 I see this on router-02:

> router-02# bgpctl show rib | head -n 10
> flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> origin: i = IGP, e = EGP, ? = Incomplete
>
> flags destination         gateway          lpref   med aspath origin
> I*>   26.0.128.0/17       172.17.1.1          60 11100 65100 i
> *     26.0.128.0/17       192.168.100.5       50 10100 65100 i
> I*>   26.0.144.0/22       172.17.1.1          60 11100 65100 i
> *     26.0.144.0/22       192.168.100.5       50 10100 65100 i
> I*>   26.1.77.0/24        172.17.1.1          60 11100 65100 i
> *     26.1.77.0/24        192.168.100.5       50 10100 65100 i
> router-02#

> router-02 still has the prefixes with localpref 60, pointing at router-01.
> router-01 does not have its transit peering up, and thus itself has no
> prefixes with localpref 60.

> router-01# bgpctl show rib | head -n 10
> flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> origin: i = IGP, e = EGP, ? = Incomplete
>
> flags destination         gateway          lpref   med aspath origin
> I*>   26.0.128.0/17       172.17.1.6          50 21100 65100 i
> I*>   26.0.144.0/22       172.17.1.6          50 21100 65100 i
> I*>   26.1.77.0/24        172.17.1.6          50 21100 65100 i
> I*>   26.2.172.0/22       172.17.1.6          50 21100 65100 i
> I*>   26.3.241.0/24       172.17.1.6          50 21100 65100 i
> I*>   26.6.126.0/24       172.17.1.6          50 21100 65100 i
> router-01# bgpctl show rib 26.0.128.0/17 all
> flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> origin: i = IGP, e = EGP, ? = Incomplete
>
> flags destination         gateway          lpref   med aspath origin
> I*>   26.0.128.0/17       172.17.1.6          50 21100 65100 i
> I*>   26.0.144.0/22       172.17.1.6          50 21100 65100 i
> router-01#

> I saw this before when I tested bgpd around a year ago, so it isn't a new
> bug.
> This is with 4.2-RELEASE, no patches.
>
> This info is from a lab I set up to replicate a live environment.
>
>
> /Tony


> router-01# cat /etc/bgpd.conf
> # $OpenBSD: bgpd.conf,v 1.8 2007/03/29 13:37:35 claudio Exp $
> # sample bgpd configuration file
> # see bgpd.conf(5)
>
> #macros
> loopback=172.17.0.1
>
> # global configuration
> AS 65200
> router-id $loopback
>
> network $loopback/32 set {localpref 120, med 10}
> network 172.17.0.0/16 set {localpref 120, med 10}
> network connected set {localpref 120, med 10}
> network static set {localpref 120, med 10}
>
> group TRANSIT {
>         remote-as 65100
>         announce all
>         set nexthop self
>         set med 10100
>         set localpref 60
>         neighbor 192.168.100.1 {
>                 descr TRANSIT
>         }
> }
>
> group IBGP {
>         remote-as 65200
>         route-reflector
>         set nexthop self
>         set med +1000
>         neighbor 172.17.1.2 {
>                 local-address 172.17.1.1
>                 descr router-02 primary
>         }
>         neighbor 172.17.1.6 {
>                 local-address 172.17.1.5
>                 descr router-02 standby
>                 set med +1
>         }
> }
>
>
> # filter
> deny from any
> deny to any
>
> allow quick to group IBGP
> allow quick from group IBGP
>
> allow quick to group TRANSIT prefix 172.17.0.0/16
> allow quick from group TRANSIT

> router-01# ifconfig
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33208
>         groups: lo
>         inet 127.0.0.1 netmask 0xff000000
>         inet6 ::1 prefixlen 128
>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
> ne3: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>         lladdr 52:54:00:12:02:01
>         description: transit
>         media: Ethernet 10baseT full-duplex
>         inet6 fe80::5054:ff:fe12:201%ne3 prefixlen 64 scopeid 0x1
>         inet 192.168.100.2 netmask 0xfffffffc broadcast 192.168.100.3
> ne4: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>         lladdr 52:54:00:12:02:02
>         description: router-01 primary path
>         media: Ethernet 10baseT full-duplex
>         inet6 fe80::5054:ff:fe12:202%ne4 prefixlen 64 scopeid 0x2
>         inet 172.17.1.1 netmask 0xfffffffc broadcast 172.17.1.3
> ne5: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>         lladdr 52:54:00:12:02:03
>         description: route-02 standby path
>         media: Ethernet 10baseT full-duplex
>         inet6 fe80::5054:ff:fe12:203%ne5 prefixlen 64 scopeid 0x3
>         inet 172.17.1.5 netmask 0xfffffffc broadcast 172.17.1.7
> enc0: flags=0<> mtu 1536
> lo1: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33208
>         description: ROUTING LOOPBACK
>         groups: lo

Re: bgpd causing black-holes with bgp-only setup

2007-11-04 Thread Claudio Jeker
On Sun, Nov 04, 2007 at 11:30:20PM +, Tony Sarendal wrote:
> On 11/4/07, Tony Sarendal [EMAIL PROTECTED] wrote:
>

Thanks for all the info. I will have a look at this as well. Currently I
think it is possible that route-reflector is not bug free in cases where
you have route-reflector rings or other very complex setups. I only tested
the easy setups till now. Why you get routing loops and black-holes in
your 3 AS setups is not clear (at least for me) but I guess it may be an
issue with a failed update. I have the feeling that when we get an update
with a routing loop in it we should actually issue a withdraw for the
prefix carried in it so the following code in rde.c is looking suspicious:
    /* aspath needs to be loop free nota bene this is not a hard error */
    if (peer->conf.ebgp && !aspath_loopfree(asp->aspath, conf->as)) {
        error = 0;
        goto done;
    }

I'm mostly offline in the next days so maybe you beat me in finding a fix
for this.
-- 
:wq Claudio



Re: bgpd causing black-holes with bgp-only setup

2007-11-04 Thread Tony Sarendal
On 11/5/07, Claudio Jeker [EMAIL PROTECTED] wrote:

> On Sun, Nov 04, 2007 at 11:30:20PM +, Tony Sarendal wrote:
>> On 11/4/07, Tony Sarendal [EMAIL PROTECTED] wrote:
>>
>
> Thanks for all the info. I will have a look at this as well. Currently I
> think it is possible that route-reflector is not bug free in cases where
> you have route-reflector rings or other very complex setups. I only tested
> the easy setups till now. Why you get routing loops and black-holes in
> your 3 AS setups is not clear (at least for me) but I guess it may be an
> issue with a failed update. I have the feeling that when we get an update
> with a routing loop in it we should actually issue a withdraw for the
> prefix carried in it so the following code in rde.c is looking suspicious:
>     /* aspath needs to be loop free nota bene this is not a hard error */
>     if (peer->conf.ebgp && !aspath_loopfree(asp->aspath, conf->as)) {
>         error = 0;
>         goto done;
>     }
>
> I'm mostly offline in the next days so maybe you beat me in finding a fix
> for this.
> --



RFC4271:

   Changing the attribute(s) of a route is accomplished by advertising a
   replacement route.  The replacement route carries new (changed)
   attributes and has the same address prefix as the original route.

That is the reason.

When AS65200 loses direct connectivity with AS65100 in my tests, it sees
AS65300 as a viable path.
It sends a WITHDRAW for the AS65100 prefix to AS65300 via the primary
peering.
On the standby peering no WITHDRAW is sent; instead AS65200 sends an UPDATE
with its new path. Since this update has AS65300 in the AS-PATH, AS65300
discards the update and so misses the fact that AS65200 doesn't have
connectivity to AS65100.

Handling an incoming UPDATE with a loop as a WITHDRAW, be it as-path,
cluster-list or originator-id, sounds pretty good to me right now. I'll
sleep on it and see how it feels tomorrow.
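
To make the difference concrete, here is a minimal sketch in C of the two
ways a receiver could handle a looped UPDATE. This is not bgpd's code: the
names rib_in_entry, loop_policy, path_loopfree and apply_update are
hypothetical, and the model is reduced to one prefix learned from one peer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PATH_LEN 8

/* Hypothetical model: the route one peer last advertised for one prefix. */
struct rib_in_entry {
	int      valid;                  /* nonzero while the route is usable */
	size_t   path_len;               /* number of ASes in as_path */
	uint32_t as_path[MAX_PATH_LEN];
};

/* What to do with an UPDATE whose AS path contains our own AS. */
enum loop_policy {
	LOOP_DISCARD,    /* drop the UPDATE, keep whatever was stored before */
	LOOP_WITHDRAW    /* treat the UPDATE as a withdraw of the old route */
};

/* Return 1 if local_as does not appear in the path, 0 if it does. */
int
path_loopfree(const uint32_t *path, size_t len, uint32_t local_as)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (path[i] == local_as)
			return 0;
	return 1;
}

/*
 * Apply an UPDATE for the entry's prefix.
 * Returns 1 if the new path replaced the old one, 0 if it was looped.
 */
int
apply_update(struct rib_in_entry *e, const uint32_t *path, size_t len,
    uint32_t local_as, enum loop_policy policy)
{
	assert(len <= MAX_PATH_LEN);

	if (!path_loopfree(path, len, local_as)) {
		/* The sender no longer uses the route we stored earlier. */
		if (policy == LOOP_WITHDRAW)
			e->valid = 0;
		return 0;
	}

	/* A loop-free UPDATE is a replacement route for the same prefix. */
	memcpy(e->as_path, path, len * sizeof(path[0]));
	e->path_len = len;
	e->valid = 1;
	return 1;
}
```

Seen from AS65300 on the standby peering: AS65200 first advertises the path
"65200 65100", then re-advertises the prefix with "65200 65300 65100". With
LOOP_DISCARD the entry stays valid with the old two-hop path, which is the
stale route described above. With LOOP_WITHDRAW the entry is marked invalid.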

As I said, I don't see anything here that violates the RFCs, but I have
never seen this before either. I will try to find the time to check how IOS
and IOS XR handle this. No point in re-inventing the wheel if they happen to
have a round one.

/Tony