Re: [c-nsp] delay eBGP sessions on startup?
On Tue, Nov 24, 2009 at 2:25 AM, Gert Doering g...@greenie.muc.de wrote:
> Hi,
>
> On Mon, Nov 23, 2009 at 07:03:07PM -0500, Bill Desjardins wrote:
>> the idea is that the border routers peer with the ibgp RR's and use a bgp conditional statement to advertise your aggregate upstream only upon matching a 'trigger route' received from the ibgp RR. I am no bgp
>
> This would work, but won't do the job in our network - this network is really, really small (but has BIG requirements, as always :-) ) - so there are no other BGP routers.
>
> Given that there are only these two boxes, and neither can rely on the other one (it could be down...), waiting for a certain route to show up might be fatal - depending on what else is available, the route might just not show up ever.
>
> (I'm not sure that I would be able to convince the customer that adding two more BGP boxes and increasing the complexity of the overall configuration is a good thing...)
>
> What I'm currently leaning toward is put all internal routes into OSPF and to hell with best practices... much less complexity, problem still solved. The estimate is that we'll see something like 50-100 internal routes at maximum, and OSPF will quite happily handle this.

if the only ibgp is between these 2 borders, then best ibgp practice would seem to be a bit far off anyway. ospf and be done with it. if IGP routes grow to an uncomfortable level, you can revisit the design then. simplicity and reliability would be my first choice.

> gert

Bill

___
cisco-nsp mailing list cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/
Re: [c-nsp] delay eBGP sessions on startup?
On 24/11/2009, at 5:19 PM, Gert Doering wrote:
> Well, the two routers mentioned above are the core and the border routers. There *is* only these two :-)

Well, in that case the only thing I can think of is conditional advertisement based on the visibility of an iBGP prefix that you receive from the other router, as someone mentioned before. Again, this wouldn't be deterministic and you could quite possibly still blackhole traffic, but hopefully for a much shorter time. At least you'd know that the iBGP session had been established and prefixes were flowing even if things hadn't totally reconverged.

David
...
Re: [c-nsp] delay eBGP sessions on startup?
Hi,

On Wed, Nov 25, 2009 at 05:50:45AM +1000, David Hughes wrote:
> On 24/11/2009, at 5:19 PM, Gert Doering wrote:
>> Well, the two routers mentioned above are the core and the border routers. There *is* only these two :-)
>
> Well, in that case the only thing I can think of is conditional advertisement based on the visibility of an iBGP prefix that you receive from the other router as someone mentioned before.

Sounds like a plan - Router A down -> prefix missing on Router B, remove external announcement there as well. How to build a redundant network that falls off the 'net if *either* router dies :-))

> Again, this wouldn't be deterministic and you could quite possibly still blackhole traffic but hopefully for a much shorter time. At least you'd know that the iBGP session had been established and prefixes were flowing even if things hadn't totally reconverged.

internal routes in OSPF :-)

gert
--
USENET is *not* the non-clickable part of WWW!
//www.muc.de/~gert/
Gert Doering - Munich, Germany g...@greenie.muc.de
fax: +49-89-35655025 g...@net.informatik.tu-muenchen.de
Re: [c-nsp] delay eBGP sessions on startup?
On 25/11/2009, at 6:46 AM, Gert Doering wrote:
> Sounds like a plan - Router A down -> prefix missing on Router B, remove external announcement there as well. How to build a redundant network that falls off the 'net if *either* router dies :-))

LOL. Didn't think that one through to its natural conclusion, did I :-)

David
...
Re: [c-nsp] delay eBGP sessions on startup?
Hi,

On Mon, Nov 23, 2009 at 08:46:56AM +0100, Gert Doering wrote:
> One possible solution would be to have a knob that tells IOS "delay bringing up eBGP sessions and/or announcement of routes on eBGP sessions for n seconds after initial BGP startup". This would make sure that iBGP has converged before eBGP starts, and no transient black-holing is seen.

Indeed there is a knob that seems to go in the right direction (thanks to Marco Eulenfeld for pointing this out to me):

  bgp update-delay n

"the bgp update-delay command is used to tune the maximum time the software will wait after the first neighbor is established until it starts calculating best paths and sending out advertisements."

Now, what does "maximum time" mean? Will it wait, or will it not?

The documentation that I found claims that the default value is 120, which would certainly not agree with the observed behaviour. OTOH, Marco claims that he has seen 0 as a default...

Will test, and report.

gert
Re: [c-nsp] delay eBGP sessions on startup?
On Mon, Nov 23, 2009 at 09:10:25AM +0100, Gert Doering wrote:
> bgp update-delay n
>
> "the bgp update-delay command is used to tune the maximum time the software will wait after the first neighbor is established until it starts calculating best paths and sending out advertisements."
>
> Now, what does "maximum time" mean? Will it wait, or will it not?
>
> The documentation that I found claims that the default value is 120, which would certainly not agree with the observed behaviour. OTOH, Marco claims that he has seen 0 as a default...

The docs make it look like more of a graceful-restart specific timer, not like advertisement-interval (intentionally delaying the propagation of new updates to try and consolidate them) or the on-startup delay behaviors available in the IGPs.

http://www.cisco.com/en/US/products/ps6550/products_white_paper09186a008016317c.shtml

"The bgp update-delay n command may be entered on the Cisco NSF-capable router. The update-delay specifies the time interval, after the first peer has reconnected, during which the restarting router expects to receive all BGP updates and the EOR marker from all of its configured peers. The default value of n is 120 seconds, and n is always measured in seconds. If the restarting router has a large number of peers, each with a large number of updates to be sent, this value may need to be increased from its default value."

--
Richard A Steenbergen r...@e-gerbil.net http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
Re: [c-nsp] delay eBGP sessions on startup?
probably Cisco needs a knob very similar to Juniper's out-delay. with it you can set a delay between when BGP and the routing table exchange route information.

http://www.juniper.net/techpubs/software/junos/junos73/swconfig73-routing/html/bgp-config58.html#1016387

Regards,
Masood

> On Mon, Nov 23, 2009 at 09:10:25AM +0100, Gert Doering wrote:
>> bgp update-delay n
>>
>> "the bgp update-delay command is used to tune the maximum time the software will wait after the first neighbor is established until it starts calculating best paths and sending out advertisements."
>>
>> Now, what does "maximum time" mean? Will it wait, or will it not?
>>
>> The documentation that I found claims that the default value is 120, which would certainly not agree with the observed behaviour. OTOH, Marco claims that he has seen 0 as a default...
>
> The docs make it look like more of a graceful-restart specific timer, not like advertisement-interval (intentionally delaying the propagation of new updates to try and consolidate them) or the on-startup delay behaviors available in the IGPs.
>
> http://www.cisco.com/en/US/products/ps6550/products_white_paper09186a008016317c.shtml
>
> "The bgp update-delay n command may be entered on the Cisco NSF-capable router. The update-delay specifies the time interval, after the first peer has reconnected, during which the restarting router expects to receive all BGP updates and the EOR marker from all of its configured peers. The default value of n is 120 seconds, and n is always measured in seconds. If the restarting router has a large number of peers, each with a large number of updates to be sent, this value may need to be increased from its default value."
Re: [c-nsp] delay eBGP sessions on startup?
Hi all,

The situation is due to the fact that the upstream solution architecture is not symmetric + the fact that BGP is not designed for millisecond convergence.

Hence here are my silly ideas, in the order they appear in memory:

1. One of the solutions would be to make the architecture symmetric - make Upstream 1 --- ISP-Router 1 send 200k routes between themselves.

2. Try to get the situation as symmetric as possible with Advanced Complicated BGP tweaking:
   a. As the default MSS for a BGP session is 536, use "ip tcp path-mtu-discovery" on the neighbors or "neighbor x.x.x.x transport path-mtu-discovery". This should get the 200k routes to the other side faster.
   b. Bind the advertising of the big 200.1.0.0/16 to an RTR tracker that tracks the availability of a certain route
   c. BGP scanner tweaking
   d. etc. etc. see Networkers presentations:
      BRKIPM-3005 - Advances in BGP
      BRKIPM-3004 - IOS-XR IGP, BGP and PIM Convergence

3. Shutdown the BGP with Upstream_1 in startup, and unshut it manually. :))

4. Shutdown the BGP with Upstream_1 in startup, and unshut it automatically with clever EEM. :))

In my opinion asking Cisco for a knob is a last resort, to be used only when all the other ideas fail.

-pavel skovajsa

On Mon, Nov 23, 2009 at 10:30 AM, mas...@nexlinx.net.pk wrote:
> probably Cisco needs a knob very similar to vendor Juniper out-delay. you can delay the time between when BGP and the routing table exchange route information.
>
> http://www.juniper.net/techpubs/software/junos/junos73/swconfig73-routing/html/bgp-config58.html#1016387
>
> Regards,
> Masood
>
>> On Mon, Nov 23, 2009 at 09:10:25AM +0100, Gert Doering wrote:
>>> bgp update-delay n
>>>
>>> "the bgp update-delay command is used to tune the maximum time the software will wait after the first neighbor is established until it starts calculating best paths and sending out advertisements."
>>>
>>> Now, what does "maximum time" mean? Will it wait, or will it not?
>>>
>>> The documentation that I found claims that the default value is 120, which would certainly not agree with the observed behaviour. OTOH, Marco claims that he has seen 0 as a default...
>>
>> The docs make it look like more of a graceful-restart specific timer, not like advertisement-interval (intentionally delaying the propagation of new updates to try and consolidate them) or the on-startup delay behaviors available in the IGPs.
>>
>> http://www.cisco.com/en/US/products/ps6550/products_white_paper09186a008016317c.shtml
>>
>> "The bgp update-delay n command may be entered on the Cisco NSF-capable router. The update-delay specifies the time interval, after the first peer has reconnected, during which the restarting router expects to receive all BGP updates and the EOR marker from all of its configured peers. The default value of n is 120 seconds, and n is always measured in seconds. If the restarting router has a large number of peers, each with a large number of updates to be sent, this value may need to be increased from its default value."
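The path MTU discovery suggestion in item 2a could look roughly like the fragment below. It is a sketch only: the AS number 64512 and the iBGP neighbor address 10.0.0.2 are hypothetical (neither appears in the thread), and whether the per-neighbor form is available depends on the IOS release.

```
! Hypothetical AS number and neighbor address, for illustration only.
! Global form: applies to TCP sessions the router opens from now on.
ip tcp path-mtu-discovery
!
! Per-neighbor form, as named in item 2a.
router bgp 64512
 neighbor 10.0.0.2 remote-as 64512
 neighbor 10.0.0.2 transport path-mtu-discovery
```

Since the TCP segment size is negotiated when a session is set up, an already-established session would presumably have to be reset before it benefits from the change.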
Re: [c-nsp] delay eBGP sessions on startup?
Hi,

On Mon, Nov 23, 2009 at 11:31:42AM +0100, Pavel Skovajsa wrote:
> The situation is due to the fact that the upstream solution architecture is not symmetric + the fact that BGP is not designed for millisecond convergence.

Indeed. But actually you don't need millisecond convergence here, if you ensure convergence in the right sequence - IGP first (with overload bit), iBGP next, clear IGP overload bit, eBGP.

The problem is that eBGP routes are announced before iBGP has converged, and as such, the routers cannot do the right thing here.

> Hence here are my silly ideas, in the order they appear in memory:
>
> 1. One of the solutions would be to make the architecture symmetric - make Upstream 1 --- ISP-Router 1 send 200k routes between themselves.

This would not help at all. Why? Because at startup, ISP Router 1 only has *one* prefix. Only after the 200k routes from ISP-R2 and upstream 1 have been received, ISP-R1 could even begin to announce them. Not that I would *want* to announce the full table to the upstream routers.

> 2. Try to get the situation as symmetric as possible with Advanced Complicated BGP tweaking:
> a. As the default MSS for a BGP session is 536, use "ip tcp path-mtu-discovery" on the neighbors or "neighbor x.x.x.x transport path-mtu-discovery". This should get the 200k routes to the other side faster.

This would improve things slightly, but won't solve the general problem.

> b. Bind the advertising of the big 200.1.0.0/16 to an RTR tracker that tracks the availability of a certain route

Won't help. If ISP-R2 is down, ISP-R1 still has to announce the /16 (there are customers directly connected to ISP-R1 that need the /16 to be in BGP).

> c. BGP scanner tweaking

Won't help. Scanner is not involved yet.

> d. etc. etc. see Networkers presentations:
> BRKIPM-3005 - Advances in BGP
> BRKIPM-3004 - IOS-XR IGP, BGP and PIM Convergence

I'll look at these (thanks).

> 3. Shutdown the BGP with Upstream_1 in startup, and unshut it manually. :))
>
> 4. Shutdown the BGP with Upstream_1 in startup, and unshut it automatically with clever EEM. :))

These two would solve this, but 3. will only help for planned reboots (we hardly ever do planned reboots; unplanned crashes and/or power problems are more frequent), and 4. introduces extra complexity that we really do not want to see there...

Are EEM applets and startup invocation visible in "show running-config"? (This is a serious question - of course the router configuration needs to be backed up, and restored easily. If extra work besides "copy tftp start" is needed to get a replacement device in place, this is bad).

> In my opinion asking Cisco for a knob is a last resort, to be used only when all the other ideas fail.

EEM is a hack that increases complexity in a non-deterministic way, and should only be used when all the more generic approaches fail.

gert
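For reference, option 4 (keep the upstream neighbor shut down in the saved configuration and release it with EEM after boot) might be sketched as below. Nothing in this fragment comes from the thread: the AS numbers, the neighbor address 192.0.2.1, the applet and timer names, and the 300-second hold are all hypothetical, and the exact EEM syntax accepted depends on the EEM version shipped with the release.

```
! Hypothetical values throughout; a sketch, not a tested configuration.
router bgp 64512
 neighbor 192.0.2.1 remote-as 64496
 neighbor 192.0.2.1 shutdown
!
! Countdown timer starts when the applet is registered, i.e. when the
! startup configuration is parsed at boot.
event manager applet DELAY-EBGP
 event timer countdown name EBGP-HOLD time 300
 action 1.0 cli command "enable"
 action 2.0 cli command "configure terminal"
 action 3.0 cli command "router bgp 64512"
 action 4.0 cli command "no neighbor 192.0.2.1 shutdown"
 action 5.0 cli command "end"
 action 6.0 syslog msg "eBGP neighbor 192.0.2.1 released after startup hold"
```

The applet is entered as ordinary configuration, so it would normally be expected to appear in "show running-config", though that is worth verifying on the release in use. One caveat with this approach: once the applet has run, the running configuration no longer contains the "shutdown" line, so saving the configuration afterwards would silently remove the startup hold.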
Re: [c-nsp] delay eBGP sessions on startup?
Hi,

On Mon, Nov 23, 2009 at 09:10:25AM +0100, Gert Doering wrote:
> bgp update-delay n
[..]
> Will test, and report.

Well, the default indeed *is* 120 (if set to 120, it won't show up in the running-config; if set to 121 or 119, it will) - and it doesn't seem to do what I had hoped for.

That is: after a reboot, the eBGP session still comes up right away, and the aggregate prefix is announced a few seconds later, causing temporary blackholing if the iBGP routes are not there yet.

There is a certain race component to it - IOS doesn't seem to bring up the eBGP sessions right away, but the exact timing depends a bit on the external neighbor behaviour - if the neighbor wants a session as soon as the link comes up, IOS will grant it (and feed the prefix), but it won't initiate the session immediately.

Doesn't really solve the problem, but makes reproducing more tricky. *Especially* since every reboot on this box takes 10 minutes...

gert
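The test described above can be reproduced with a fragment like the following. The AS number 64512 and the value 300 are hypothetical; only the command name and the default of 120 come from the thread.

```
! Hypothetical AS number and timer value, for illustration only.
router bgp 64512
 bgp update-delay 300
!
! From exec mode, check whether the value is stored in the configuration
! (per the message above, the default of 120 is not displayed):
!   show running-config | include update-delay
```

As reported in the message, changing this timer did not delay the announcement of the aggregate on the eBGP session, so it should not be read as a fix for the startup black-holing.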
Re: [c-nsp] delay eBGP sessions on startup?
Hi Gert,

On 23/11/2009, at 5:46 PM, Gert Doering wrote:
> both ISP-Routers announce the ISP's aggregate (let's call it 200.1.0.0/16) to their respective upstream providers (static route to null0, network statement). This needs to be done, to make sure that the aggregate is always visible, even if one of the routers is down.

So you are generating the aggregate at the border? That can certainly leave you black holing traffic under several scenarios (anything that isolates that router). Have you thought about generating the aggregate within your network and propagating it via iBGP? At least the border can't advertise it upstream instantaneously, as it won't know about it until iBGP is up.

So either a static to NULL0 on a pair of core boxes somewhere or even an aggregate-address statement on the border could help you here. Both should delay the advertisement of the aggregate upstream, but I don't know if the timing of the advertisement would be deterministic. You could still have the same issue, just for a shorter period.

David
...
Re: [c-nsp] delay eBGP sessions on startup?
Hi Gert,

just an idea. I have not tried this and it may also not fit your application... this is on sup2's (SXF17).

in my tiny network I have several route reflectors which handle only my customer assignments. nice and small for ibgp convergence.

the idea is that the border routers peer with the ibgp RR's and use a bgp conditional statement to advertise your aggregate upstream only upon matching a 'trigger route' received from the ibgp RR. I am no bgp expert and am unsure if the received routes are sorted or not, but if so, you could add a max IPv4 address like 254.254.254.254/32 to place it at the end of the received update. if the updates are not reliably sorted, this is probably all for naught though.

then with a bgp conditional such as:

  neighbor x.x.x.x advertise-map EBGPOUT exist-map IBGPDONE

you would only advertise out after your IBGP session is near its end, at least giving you the best chance to avoid blackholes. you should also have solid reliability to multiple RR's to keep it stable.

granted a knob would be nice, but at least this method can be centralized and uses commands meant to do this anyway. all you're doing is adding an internal 'trigger route' to signify ibgp is about done, so send out ebgp advertisements when ya get a chance :)

Bill
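Spelled out, the conditional advertisement above might look like the following sketch. The route-map names EBGPOUT and IBGPDONE and the aggregate 200.1.0.0/16 are taken from the thread; everything else is assumed for illustration: the AS numbers, the upstream neighbor address 192.0.2.1, the prefix-list names, and the trigger prefix 198.51.100.255/32 (a documentation address used here in place of the 254.254.254.254/32 suggested above, which may not be accepted as a valid route).

```
! Hypothetical AS numbers, neighbor address, prefix-list names and trigger
! prefix; a sketch under those assumptions, not a tested configuration.
ip prefix-list AGGREGATE seq 5 permit 200.1.0.0/16
ip prefix-list TRIGGER seq 5 permit 198.51.100.255/32
!
! Routes that are advertised only while the condition holds.
route-map EBGPOUT permit 10
 match ip address prefix-list AGGREGATE
!
! Condition: the trigger route must be present in the BGP table.
route-map IBGPDONE permit 10
 match ip address prefix-list TRIGGER
!
router bgp 64512
 network 200.1.0.0 mask 255.255.0.0
 neighbor 192.0.2.1 remote-as 64496
 neighbor 192.0.2.1 advertise-map EBGPOUT exist-map IBGPDONE
```

The trigger route has to be originated somewhere else in the iBGP mesh (the route reflectors in the setup described above), which is why the approach depends on there being routers other than the two borders.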
Re: [c-nsp] delay eBGP sessions on startup?
On Tuesday 24 November 2009 06:25:45 am David Hughes wrote:
> So you are generating the aggregate at the border? That can certainly leave you black holing traffic under several scenarios (anything that isolates that router). Have you thought about generating the aggregate within your network and propagating it via iBGP. At least the border can't advertise it upstream instantaneously as it won't know about it until iBGP is up.

Reading through this thread since yesterday, this is also one of the first things I'd recommend be done.

In our case, our route reflectors generate our aggregates. All other peering routers simply pass them on if they receive them from the route reflectors via iBGP. Particularly useful if you have a peering router that was generating your aggregate at an exchange point, but suddenly lost its backhaul to your core, and along with it, its iBGP session.

Not that I'm recommending it, but one of the unintended benefits we've seen of running a BGP-free core (for IPv4, that is) is that given how long core boxes take to boot, and how slow they may sometimes be in fully converging their BGP tables (while potentially blackholing traffic in the process, hence the little useful knobs in OSPF and IS-IS), not having to run BGP in the core means only edge routers are affected by a system restart.

This would limit outages to a smaller part of the network than if a core router were restarting.

Mark.
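The "little useful knobs in OSPF and IS-IS" mentioned above are presumably the on-startup max-metric and overload-bit settings. A minimal sketch, assuming a hypothetical OSPF process ID of 1 and an untagged IS-IS process:

```
! Hypothetical process ID; availability of these options depends on the release.
! OSPF: advertise maximum metric after a reload until BGP reports convergence.
router ospf 1
 max-metric router-lsa on-startup wait-for-bgp
!
! IS-IS: set the overload bit after a reload until BGP reports convergence.
router isis
 set-overload-bit on-startup wait-for-bgp
```

Both keep a freshly booted router from being used as a transit hop inside the IGP. They do not, by themselves, hold back announcements on eBGP sessions, which is the problem the rest of this thread is about.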
Re: [c-nsp] delay eBGP sessions on startup?
Hi,

On Tue, Nov 24, 2009 at 08:25:45AM +1000, David Hughes wrote:
>> both ISP-Routers announce the ISP's aggregate (let's call it 200.1.0.0/16) to their respective upstream providers (static route to null0, network statement). This needs to be done, to make sure that the aggregate is always visible, even if one of the routers is down.
>
> So you are generating the aggregate at the border?

Yes.

> That can certainly leave you black holing traffic under several scenarios (anything that isolates that router).

I'm aware of that - and in this specific network scenario, this is considered highly unlikely. Basically, the network really consists of two routers, which are directly interconnected (direct fiber to the next rack), and both of them are connected via 2 2xGE etherchannels to two L2 switches. So there are 5 different links between those routers - and if someone manages to break *all* of these at the same time, well, blackholing is the least of my worries.

(The network is a bit more complex, but the details really don't change this statement)

> Have you thought about generating the aggregate within your network and propagating it via iBGP. At least the border can't advertise it upstream instantaneously as it won't know about it until iBGP is up.

There are no other routers that are considered reliable enough in this setup - everything else is stuff like firewalls or 3640s used as console server.

> So either a static to NULL0 on a pair of core box somewhere or even an aggregate address statement on the border could help you here.

Well, the two routers mentioned above are the core and the border routers. There *is* only these two :-)

gert
Re: [c-nsp] delay eBGP sessions on startup?
Hi,

On Mon, Nov 23, 2009 at 07:03:07PM -0500, Bill Desjardins wrote:
> just an idea. I have not tried this and it may also not fit your application... this is on sup2's (SXF17)
>
> in my tiny network I have several route reflectors which handle only my customer assignments. nice and small for ibgp convergence.
>
> the idea is that the border routers peer with the ibgp RR's and use a bgp conditional statement to advertise your aggregate upstream only upon matching a 'trigger route' received from the ibgp RR. I am no bgp

This would work, but won't do the job in our network - this network is really, really small (but has BIG requirements, as always :-) ) - so there are no other BGP routers.

Given that there are only these two boxes, and neither can rely on the other one (it could be down...), waiting for a certain route to show up might be fatal - depending on what else is available, the route might just not show up ever.

(I'm not sure that I would be able to convince the customer that adding two more BGP boxes and increasing the complexity of the overall configuration is a good thing...)

What I'm currently leaning toward is put all internal routes into OSPF and to hell with best practices... much less complexity, problem still solved. The estimate is that we'll see something like 50-100 internal routes at maximum, and OSPF will quite happily handle this.

gert
[c-nsp] delay eBGP sessions on startup?
Hi,

so I'm now following the design that everybody claims is best (loopbacks in OSPF, everything else in BGP), and I've found a few corner cases that are seriously worse than "customer routes in OSPF".

Number one - consider the following (simplified) network:

  Upstream 1 --- ISP-Router 1 --- ISP-Router 2 --- Upstream 2
                                       |
                                   Customer X

both ISP-Routers announce the ISP's aggregate (let's call it 200.1.0.0/16) to their respective upstream providers (static route to null0, network statement). This needs to be done, to make sure that the aggregate is always visible, even if one of the routers is down.

Customer X uses addresses from 200.1.0.0/16, let's give him 200.1.1.1/32.

So, when ISP-Router 1 boots, the following happens, more or less in this order:

1. bootup complete
2. OSPF neighbor establishes with ISP-Router 2
3. eBGP-Session to Upstream 1 establishes, 200.1.0.0/16 is announced (only a single prefix is announced outbound)
4. iBGP-Session to ISP-Router 2 establishes, 200k prefixes start propagating ISP-R2 -> ISP-R1 (full table at ISP-R2)
5. Traffic starts flowing from Upstream 1 to ISP-Router 1 (because the Upstream router is installing the 200.1.0.0/16 route right away)
6. 20-60 seconds delay
7. ISP-R1 has processed all the BGP prefixes from ISP-R2, has built a FIB, and programmed everything in its hardware forwarding engines.
8. Traffic from Upstream 1 to Customer X can be forwarded properly

the crucial element here is: between the items 5 and 8, packets coming from Upstream 1 to Customer X are *dropped*, because ISP-R1 has no full internal reachability information yet, but is still announcing reachability for the aggregate to Upstream 1.

The 20-60 seconds delay comes from the fact that even if the eBGP and iBGP sessions are established at roughly the same time, the eBGP session only has to announce one single prefix (instantaneous), while the iBGP session will see ~200k prefixes, Customer X being just one of them, fairly far down at the end (200.1.1.1/32).

So - now I'm wondering if it's only me? Shouldn't this problem bite other folks as well?

The other design (customer routes in IGP) doesn't suffer from it, as IGP is usually done converging before BGP starts. But we don't want that.

One possible solution would be to have a knob that tells IOS "delay bringing up eBGP sessions and/or announcement of routes on eBGP sessions for n seconds after initial BGP startup". This would make sure that iBGP has converged before eBGP starts, and no transient black-holing is seen.

Is that possible? I have googled and stared at the command-line help for a while, but couldn't find anything useful.

Routers in question are 6500s with SXI2a.

gert
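The origination described above (static route to null0 plus a network statement) would look roughly like this on ISP-Router 1. Only the aggregate 200.1.0.0/16 is from the message; the AS numbers, both neighbor addresses and the use of Loopback0 as the iBGP update source are assumptions made for the sake of the sketch.

```
! Hypothetical AS numbers and neighbor addresses, for illustration only.
! The static route exists as soon as the configuration is loaded ...
ip route 200.1.0.0 255.255.0.0 Null0
!
router bgp 64512
 ! ... so this network statement is satisfied immediately after boot.
 network 200.1.0.0 mask 255.255.0.0
 ! eBGP to Upstream 1: a single prefix to send.
 neighbor 192.0.2.1 remote-as 64496
 ! iBGP to ISP-Router 2: ~200k prefixes to receive.
 neighbor 10.0.0.2 remote-as 64512
 neighbor 10.0.0.2 update-source Loopback0
```

Because the aggregate depends only on local configuration, it can be announced upstream as soon as the eBGP session is up, regardless of how far the iBGP transfer has progressed. That is the window between steps 5 and 8 in the list above.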