On 23 May 2009, at 0:58, Zaid Ali wrote:
From experience I found that you need to keep all the timers in
sync with all your peers. Something like this for every peer in
your bgp config.
neighbor xxx.xx.xx.x timers 30 60
30 60 isn't a good choice: the first keepalive comes in after 30.1
seconds, and the session then expires at 60.0 seconds, just before the
second one would arrive at 60.1 seconds.
The other side will typically use hold timer / 3 for their keepalive
interval. If you set it to something not divisible by 3 then you get
all 3 of those within the hold timer.
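To make the arithmetic concrete, here is a rough Python sketch of how many keepalives fit inside one hold-time window. It is a simplified model, not how any particular BGP implementation schedules its timers: it ignores jitter, assumes the peer rounds hold time / 3 down to whole seconds, and the function names are made up for illustration.

```python
def peer_keepalive(hold: int) -> int:
    """Keepalive interval a peer would typically pick (assumed: hold // 3)."""
    return hold // 3


def keepalives_before_expiry(keepalive: int, hold: int) -> int:
    """Count keepalives sent strictly before the hold timer expires.

    Keepalives go out at keepalive, 2*keepalive, ... seconds after the last
    message; one that lands exactly on the expiry time counts as too late.
    """
    if keepalive <= 0 or hold <= 0:
        raise ValueError("keepalive and hold must be positive")
    return (hold - 1) // keepalive


for hold in (60, 16):
    ka = peer_keepalive(hold)
    # Prints: hold time, assumed keepalive interval, keepalives that fit.
    print(hold, ka, keepalives_before_expiry(ka, hold))
```

Under these assumptions a hold time of 60 leaves room for only two keepalives (the third lands exactly on the expiry), while 16 leaves room for all three.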
I often recommended 5 16 in the past, but that's a bit on the short
side: some less robust BGP implementations are single-threaded and
may not be able to get a keepalive out within the 16-second hold time
when they're very busy.
The minimum possible hold time is 3 seconds.
If you only change the setting at your end, you can raise it yourself
when bad stuff happens. If the other end also sets it, you'll have to
change it at both ends, because the hold time is negotiated and the
lowest value is used.
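The negotiation rule can be sketched in a few lines of Python. This is only an illustration of "lowest value wins" plus the 3-second minimum (a hold time of 0, which disables keepalives, is also allowed by the BGP spec); the function name is hypothetical and no real router API is implied.

```python
def negotiated_hold(local: int, remote: int) -> int:
    """Return the hold time both sides end up using: the lower of the two.

    Each side's value must be 0 (no keepalives) or at least 3 seconds.
    """
    for value in (local, remote):
        if value != 0 and value < 3:
            raise ValueError("hold time must be 0 or at least 3 seconds")
    return min(local, remote)


# Both ends configured to 16: raising only the local side changes nothing.
print(negotiated_hold(16, 16))
print(negotiated_hold(180, 16))
# Only once the peer raises its value too does the session use more.
print(negotiated_hold(180, 180))
```

So if the short timer is configured on one side only, that side alone can back it off later; if both sides configured it, both have to change.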
If you really want fast failover, terminate the fiber in the BGP router
and make sure "bgp fast-external-fallover" is on (I think it's the
default).
For manual failover, simply shut down the BGP sessions on the router
that you don't want to handle traffic at that time. If you have
peergroups you can do "neighbor peergroup shutdown" for the fastest
results. Shutting down interfaces is not such a good idea, because then
the routing protocols have to time out.
Make sure that this is communicated to your peer as well so that
their timer settings match yours.
Zaid
----- Original Message -----
From: "Steve Bertrand" <[email protected]>
To: "nanog list" <[email protected]>
Sent: Friday, May 22, 2009 3:45:20 PM GMT -08:00 US/Canada Pacific
Subject: Multi-homed clients and BGP timers
Hi all,
I've got numerous single-site 100Mb fibre clients who have backup SDSL
links to my PoP. The two services terminate on separate
distribution/access routers.
The CPE that peers to my fibre router sets a community, and my end sets
the pref to 150 based on it. The CPE also sets a higher pref for
prefixes from the fibre router. The SDSL router to CPE leaves the
default preference in place. Both of my PE routers send
default-originate to the CPE. There is (generally) no traffic that
should ever be on the SDSL link while the fibre is up.
Both of the PE routers then advertise the learnt client route up into
the core:
*>i208.70.107.128/28
                    172.16.104.22            0    150      0 64762 i
* i                 172.16.104.23            0    100      0 64762 i
My problem is the noticeable delay for switchover when the fibre
happens to go down (God forbid).
I would like to know if BGP timer adjustment is the way to adjust this,
or if there is a better/different way. It's fair to say that the fibre
doesn't 'flap'. Based on operational experience, if there is a problem
with the fibre network, it's down for the count.
While I'm at it, I've got another couple of questions:
- whatever technique you might recommend to reduce the convergence
throughout the network, can the same principles be applied to iBGP
as well?
- if I need to down core2, what is the quickest and easiest way to
ensure that all gear connected to the cores will *quickly* switch to
preferring core1?
Steve