Hi,

[I'm not subscribed to grow@ietf.org so apologies for breaking threading
and possibly bringing up an issue that has been raised before. Please
Cc me on replies.]

While I wholeheartedly agree with the recommendations in the draft and
really wish my providers and peers would all incorporate them into
their maintenance routines, I have one comment on the following text:

   In network topologies where BGP speaking routers are directly
   attached to each other, or use fault detection mechanisms such as BFD
   [RFC5880], detecting and acting upon a link down event (for example
   when someone yanks the physical connector) in a timely fashion is
   straightforward.

Not necessarily. Speaking from direct [painful] experience, a router
with a slow control plane (such as the Juniper MX80) can easily need 15
minutes or so to fully converge after link down has been detected on an
interface to which it had a large number of active routes. The prime
example would be a connection to an IP-transit provider advertising
full DFZ tables.

So in my opinion, BGP Session Culling really ought to be the BCP in all
topologies, not just the ones «where upper layer fast fault detection
mechanisms are unavailable and the lower layer topology is hidden from
the BGP speakers».

Another logical consequence of this is that the rather imprecise «a few
minutes» should ideally be expanded on, taking slow routers such as the
MX80 into account. While five minutes of culling would be helpful, it
would not be enough to avoid all disruption.
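
To put rough numbers on that, here is a minimal sketch in Python. The
function name and the simple linear model are assumptions made for
illustration, not anything taken from the draft: if a router needs C
minutes to converge and the session is culled L minutes before the link
actually goes down, the window in which traffic may still be lost is
roughly max(0, C - L).

```python
def residual_disruption_minutes(convergence_min: float, cull_lead_min: float) -> float:
    """Rough estimate of how long traffic may still be disrupted.

    Assumption (hypothetical model): the router starts converging when the
    BGP session is culled, forwarding keeps working until the link really
    goes down, and disruption lasts for whatever convergence time is left
    at that point.
    """
    if convergence_min < 0 or cull_lead_min < 0:
        raise ValueError("durations must be non-negative")
    return max(0.0, float(convergence_min) - float(cull_lead_min))


if __name__ == "__main__":
    # 15 minutes is the MX80 full-table figure mentioned above.
    for lead in (0, 5, 30):
        left = residual_disruption_minutes(15, lead)
        print(f"cull {lead:>2} min ahead: ~{left:.0f} min of disruption left")
```

Under this model, culling five minutes ahead still leaves about ten
minutes of disruption on a router that needs 15 minutes to converge,
while any lead time of 15 minutes or more leaves none.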

If the operator/peer decided to cull the session, say, 30 minutes
ahead of the maintenance, that would be great for me and others in my
situation. For single-homed customers of an IP-transit provider, on the
other hand, any amount of culling only makes the outage longer than
necessary. So perhaps the draft should suggest making it possible to
opt out of the culling behaviour.

Improving convergence time during maintenance (and unscheduled
downtime) is, by the way, a major reason why we're very seriously
considering dropping most of our routes from the FIB and instead going
with a default route + a few others, so this is definitely a real
problem.

Tore

_______________________________________________
GROW mailing list
GROW@ietf.org
https://www.ietf.org/mailman/listinfo/grow
