Hi,

[I'm not subscribed to grow@ietf.org, so apologies for breaking threading and possibly bringing up an issue that has been raised before. Please Cc me on replies.]
While I wholeheartedly agree with the recommendations in the draft and really wish my providers and peers would all incorporate them into their maintenance routines, I have one comment on the following text:

> In network topologies where BGP speaking routers are directly attached to each other, or use fault detection mechanisms such as BFD [RFC5880], detecting and acting upon a link down event (for example when someone yanks the physical connector) in a timely fashion is straightforward.

Not necessarily. Speaking from direct [painful] experience, a router with a slow control plane (such as the Juniper MX80) can easily need 15 minutes or so to fully converge after link down has been detected on an interface over which it had a large number of active routes. The prime example would be a connection to an IP-transit provider advertising full DFZ tables.

So in my opinion, BGP Session Culling really ought to be the BCP in all topologies, not just the ones «where upper layer fast fault detection mechanisms are unavailable and the lower layer topology is hidden from the BGP speakers».

Another logical consequence of this is that the rather imprecise «a few minutes» should ideally be expanded on, taking slow routers such as the MX80 into account. While five minutes of culling would be helpful, it would not be enough to avoid all disruption. If the operator/peer decided to cull the session, say, 30 minutes ahead of the maintenance, that would be great for me and others in my situation.

For single-homed customers of an IP-transit provider, on the other hand, any amount of culling only makes the outage longer than necessary. So maybe the draft should suggest making it possible to opt out of the culling behaviour.
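To make the timing argument a bit more concrete, here is a rough sketch of how a culling lead time could be derived per peer. It is only an illustration: the `Peer` fields, the `cull_start` helper, the five-minute safety margin and the example date are all hypothetical and not taken from the draft.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Peer:
    name: str
    convergence_minutes: int  # hypothetical worst-case reconvergence estimate
    opted_out: bool = False   # e.g. a single-homed IP-transit customer


def cull_start(maintenance_start: datetime, peer: Peer,
               margin_minutes: int = 5) -> Optional[datetime]:
    """Return when to start culling the peer's session, or None if it opted out."""
    if peer.opted_out:
        # Culling would only lengthen the outage for this peer.
        return None
    lead = timedelta(minutes=peer.convergence_minutes + margin_minutes)
    return maintenance_start - lead


# Arbitrary example maintenance window starting at 02:00.
window = datetime(2017, 3, 1, 2, 0)
print(cull_start(window, Peer("slow-router-peer", 15)))
print(cull_start(window, Peer("single-homed", 15, opted_out=True)))
```

With a 15-minute convergence estimate and the assumed five-minute margin, the session would be culled 20 minutes before the window, while the opted-out peer is left alone until the maintenance itself.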
Improving convergence time during maintenance (and unscheduled downtime) is, by the way, a major reason why we're very seriously considering dropping most of our routes from the FIB and instead going with a default route + a few others, so this is definitely a real problem.

Tore

_______________________________________________
GROW mailing list
GROW@ietf.org
https://www.ietf.org/mailman/listinfo/grow