Re: [GROW] Comment on draft-iops-grow-bgp-session-culling
* Job Snijders

> TEXT:
>
>     In network topologies where BGP speaking routers are directly
>     attached to each other, or use fault detection mechanisms such as
>     BFD, detecting and acting upon a link
>     down event (for example when someone yanks the physical connector)
>     in a timely fashion is straightforward.
>
> So we should add something that even though detection is
> straightforward, and initiating action as a result of this event can be
> done timely, we cannot be sure of timely termination of whatever actions
> are taken because of the event, and therefore the recommendation is to
> shut down sessions before doing maintenance, even though networks are
> directly connected to each other.
>
> The above matches my operational experience and aligns with how we
> perform router maintenance.
>
> There are a number of considerations:
>
> - an operator may not know whether they are directly connected
> - even if directly connected, the remote side might not be able to
>   converge in a timely fashion
>
> Perhaps the paragraph should just be removed?

Yes. Here's a quick suggestion or starting point for discussion
(intended to replace section 1 in its entirety):

BGP Session Culling is the practice of ensuring BGP sessions are
forcefully torn down before maintenance activities on a lower layer
network commence, which otherwise would affect the flow of data between
the BGP speakers.

BGP Session Culling ensures that network maintenance activities cause
the minimum possible amount of disruption, by giving BGP speakers
advance notice of an impending outage, so they may preemptively react to
it by gracefully converging onto alternate paths while the forwarding
plane is still fully operational.

The grace period required for a successful implementation of BGP Session
Culling is the sum of the time needed to detect the BGP session loss
plus the time required for the BGP speaker to converge on alternate
paths. The first value is in the worst case governed by the BGP Hold
Timer (section 6.5 of [RFC4271]). The second value is implementation
specific, but could be as much as 15 minutes or more in the case of
sessions where a router with a slow control plane is receiving a full
set of Internet routes. Operators implementing BGP Session Culling are
in any case encouraged to avoid using a fixed grace period, but instead
monitor forwarding plane activity while the culling is taking place and
consider it complete once traffic levels have dropped to a minimum
[Section 2.3].

Tore

___
GROW mailing list
GROW@ietf.org
https://www.ietf.org/mailman/listinfo/grow
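As a rough illustration of the grace-period arithmetic in the proposed text above (detection time plus convergence time), here is a minimal Python sketch. The function name is hypothetical. The 180-second Hold Time is an assumption taken from the "3m bgp dead time" mentioned elsewhere in this thread, and the 900 seconds corresponds to the 15-minute slow-router case. Real values depend on the negotiated Hold Time and on the peer's hardware.

```python
def worst_case_grace_seconds(hold_time_s: int, convergence_s: int) -> int:
    """Worst-case grace period: time to detect the session loss
    (bounded by the BGP Hold Timer) plus time to converge on alternate paths."""
    if hold_time_s < 0 or convergence_s < 0:
        raise ValueError("durations must be non-negative")
    return hold_time_s + convergence_s


# Assumed example values, not measurements:
# 180 s Hold Time, 15 minutes (900 s) convergence on a slow control plane.
print(worst_case_grace_seconds(180, 900))  # 1080
```

This only bounds the worst case. As the proposed text says, a fixed grace period is best avoided in favour of watching forwarding plane activity.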
Re: [GROW] Comment on draft-iops-grow-bgp-session-culling
On Tue, Mar 14, 2017 at 01:05:08PM +0100, Tore Anderson wrote:
> * Nick Hilliard
>
> > Tore Anderson wrote:
> > > In other words: in my opinion, BGP session culling should be
> > > considered a BCP even in situations where link state signaling
> > > and/or BFD is used. IP-transit providers should perform culling
> > > towards their customers ahead of maintenance works. Direct peers,
> > > likewise.
> >
> > probably not much need if bfd is used because that would operate
> > router-to-router.
>
> Quite the contrary, there is very much a need in this case too. If there
> are many active routes that will become invalid, converging on
> alternate paths (reprogramming the FIB) can take significantly longer
> than actually detecting the outage (even if it's detected only using
> BGP timers).
>
> > > IXPs aren't at all special regarding the fundamental need for session
> > > culling, only in the method by which it is accomplished (i.e., using
> > > layer-2 ACLs).
> >
> > Correct, but for direct peers over PNIs, etc, the operator will usually
> > have control over the bgp session. What we're talking about here is a
> > situation where there is an intermediate operator which has no direct
> > admin control over bgp sessions.
>
> The draft is most definitely also talking about the situations where
> the operator does have admin control over the BGP session (section 2.1).

TEXT:

    In network topologies where BGP speaking routers are directly
    attached to each other, or use fault detection mechanisms such as
    BFD, detecting and acting upon a link down event (for example when
    someone yanks the physical connector) in a timely fashion is
    straightforward.

So we should add something that even though detection is
straightforward, and initiating action as a result of this event can be
done timely, we cannot be sure of timely termination of whatever actions
are taken because of the event, and therefore the recommendation is to
shut down sessions before doing maintenance, even though networks are
directly connected to each other.

The above matches my operational experience and aligns with how we
perform router maintenance.

There are a number of considerations:

- an operator may not know whether they are directly connected
- even if directly connected, the remote side might not be able to
  converge in a timely fashion

Perhaps the paragraph should just be removed?

Kind regards,

Job
Re: [GROW] Comment on draft-iops-grow-bgp-session-culling
* Nick Hilliard

> Tore Anderson wrote:
> > In other words: in my opinion, BGP session culling should be
> > considered a BCP even in situations where link state signaling
> > and/or BFD is used. IP-transit providers should perform culling
> > towards their customers ahead of maintenance works. Direct peers,
> > likewise.
>
> probably not much need if bfd is used because that would operate
> router-to-router.

Quite the contrary, there is very much a need in this case too. If there
are many active routes that will become invalid, converging on
alternate paths (reprogramming the FIB) can take significantly longer
than actually detecting the outage (even if it's detected only using
BGP timers).

> > IXPs aren't at all special regarding the fundamental need for session
> > culling, only in the method by which it is accomplished (i.e., using
> > layer-2 ACLs).
>
> Correct, but for direct peers over PNIs, etc, the operator will usually
> have control over the bgp session. What we're talking about here is a
> situation where there is an intermediate operator which has no direct
> admin control over bgp sessions.

The draft is most definitely also talking about the situations where
the operator does have admin control over the BGP session (section 2.1).

Tore
Re: [GROW] Comment on draft-iops-grow-bgp-session-culling
Tore Anderson wrote:
> In other words: in my opinion, BGP session culling should be considered
> a BCP even in situations where link state signaling and/or BFD is used.
> IP-transit providers should perform culling towards their customers
> ahead of maintenance works. Direct peers, likewise.

probably not much need if bfd is used because that would operate
router-to-router. Link state signaling is problematic because it's not
necessarily transferred to all the devices that need to see the link
state changes.

> IXPs aren't at all special regarding the fundamental need for session
> culling, only in the method by which it is accomplished (i.e., using
> layer-2 ACLs).

Correct, but for direct peers over PNIs, etc, the operator will usually
have control over the bgp session. What we're talking about here is a
situation where there is an intermediate operator which has no direct
admin control over bgp sessions.

Nick
Re: [GROW] Comment on draft-iops-grow-bgp-session-culling
* Nick Hilliard

> Tore Anderson wrote:
> > My point here was that if the IXP is doing maintenance, it could shut
> > all ports to all members simultaneously, and thus get the exact same
> > effect as the «when someone yanks the physical connector» scenario
> > described in the draft.
>
> this doesn't work because 1. some ixp participants connect their
> routers via intermediate switches and if all ports are yanked
> simultaneously, they will blackhole traffic on their side and 2. any
> ixp with more than one switch in their peering fabric needs to be
> able to perform maintenance on part of their ixp without
> affecting the rest.

Maybe we're talking past each other. I fully agree with you that this
does not work sufficiently well, which by extension means that the
draft is wrong in suggesting that BGP session culling is only needed
«in topologies where upper layer fast fault detection mechanisms are
unavailable and the lower layer topology is hidden».

In other words: in my opinion, BGP session culling should be considered
a BCP even in situations where link state signaling and/or BFD is used.
IP-transit providers should perform culling towards their customers
ahead of maintenance works. Direct peers, likewise.

IXPs aren't at all special regarding the fundamental need for session
culling, only in the method by which it is accomplished (i.e., using
layer-2 ACLs).

Tore
Re: [GROW] Comment on draft-iops-grow-bgp-session-culling
Tore Anderson wrote:
> By the way, as an IXP operator, you also have the possibility to simply
> shut down your members' interfaces prior to performing maintenance,
> instead of doing culling. Doing so would be completely analogous to the
> directly connected BGP speakers scenario discussed in section 1 where the
> draft says «detecting and acting upon a link down event (for example
> when someone yanks the physical connector) in a timely fashion is
> straightforward».

- if the ixp port is connected to an intermediate switch on the member
  side, the ixp member router won't see the carrier state transition and
  will blackhole traffic on the member side

- all other ixp members who peer with that router also won't see the
  carrier state and will leave bgp up, causing traffic to be blackholed
  on the remote side.

Nick
Re: [GROW] Comment on draft-iops-grow-bgp-session-culling
Will Hargrave wrote:
> With a 30 minute cull window, there is substantial concern that an
> operator will begin to debug the ‘problem’, discover ICMP PING works but
> TCP/179 doesn’t work, and get very annoyed at this strange behaviour. I
> think we should operate on a principle of least surprise here.

We have found that re-mailing technical contacts with a "Maintenance is
about to start in 1 hour" reminder helps quite a lot in this sort of
situation. Low tech is sometimes good.

Otherwise, 30 minutes for traffic drain is unnecessary - we keep our
interface counters on 30 second load average so can see when the
majority of traffic is gone; this always happens within a couple of
minutes of the 3m bgp dead time drop.

Usually this sort of thing happens at omg o'clock in the morning, and
we've found that reducing the time load helps quite a bit with both
accuracy and staff morale, as all our staff usually work regular office
hours. Again, this sort of rationale would be outside the scope of IETF
BCPs, but they are real enough concerns that I'd personally be
uncomfortable with recommending any delays.

Nick
Re: [GROW] Comment on draft-iops-grow-bgp-session-culling
Hello, Tore and GROW, I am new here.

On 13 Mar 2017, at 11:11, Tore Anderson wrote:

> Another logical consequence of this is that the rather imprecise «a few
> minutes» should ideally be expanded on, taking slow routers such as the
> MX80 into account. While five minutes of culling would be helpful, it
> would not be enough to avoid all disruption.

One point of the technique is that the ‘lower layer caretaker’ looks at
their interface traffic counters to ensure traffic has dropped to
near-zero before commencing the ‘destructive’ part of the maintenance.
As a result no traffic is affected. If a tree falls in the forest and
there is no-one for it to land on, does it matter? :-)

Our operational experience at LONAP shows this does usually happen
within 5 minutes.

> If the operator/peer would decide to cull the session, say, 30 minutes
> ahead of the maintenance, that would be great for me and others in my
> situation.

I suspect this may be unnecessary, but do not have extensive data to
back this up.

With a 30 minute cull window, there is substantial concern that an
operator will begin to debug the ‘problem’, discover ICMP PING works but
TCP/179 doesn’t work, and get very annoyed at this strange behaviour. I
think we should operate on a principle of least surprise here.
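The counter-watching approach described above (wait until traffic has dropped to near-zero before starting the destructive part of the maintenance) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the 1% threshold and the "three consecutive samples" rule are all hypothetical, and the samples are assumed to be averaged interface rates such as the 30-second load averages mentioned elsewhere in the thread.

```python
from typing import Sequence


def traffic_drained(samples_bps: Sequence[float],
                    baseline_bps: float,
                    threshold_fraction: float = 0.01,
                    consecutive: int = 3) -> bool:
    """Return True once the most recent `consecutive` samples are all at or
    below `threshold_fraction` of the pre-cull baseline rate."""
    if baseline_bps <= 0:
        raise ValueError("baseline_bps must be positive")
    if consecutive < 1:
        raise ValueError("consecutive must be at least 1")
    if len(samples_bps) < consecutive:
        return False  # not enough data yet to decide
    limit = baseline_bps * threshold_fraction
    return all(sample <= limit for sample in samples_bps[-consecutive:])


# Made-up sample values (bits per second), oldest first.
samples = [4.0e9, 3.9e9, 1.2e9, 2.0e7, 1.5e7, 9.0e6]
print(traffic_drained(samples[:3], baseline_bps=4.0e9))  # False
print(traffic_drained(samples, baseline_bps=4.0e9))      # True
```

Requiring several consecutive low samples is a design choice to avoid acting on a single momentary dip. An operator would tune both numbers to their own traffic profile.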
Re: [GROW] Comment on draft-iops-grow-bgp-session-culling
Hi Tore,

On Mon, Mar 13, 2017 at 12:11:34PM +0100, Tore Anderson wrote:
> While I wholeheartedly agree with the recommendations in the draft and
> really wish my providers and peers will all incorporate them into their
> maintenance routines, I have one comment on the following text:
>
>    In network topologies where BGP speaking routers are directly
>    attached to each other, or use fault detection mechanisms such as BFD
>    [RFC5880], detecting and acting upon a link down event (for example
>    when someone yanks the physical connector) in a timely fashion is
>    straightforward.
>
> Not necessarily. Speaking from direct [painful] experiences, a router
> with a slow control plane (such as the Juniper MX80) can easily need 15
> minutes or so to fully converge after link down has been detected on an
> interface to which it had a large number of active routes. The prime
> example would be a connection to an IP-transit provider advertising
> full DFZ tables.
>
> So in my opinion, BGP Session Culling really ought to be the BCP in all
> topologies, not just the ones «where upper layer fast fault detection
> mechanisms are unavailable and the lower layer topology is hidden from
> the BGP speakers».
>
> Another logical consequence of this is that the rather imprecise «a few
> minutes» should ideally be expanded on, taking slow routers such as the
> MX80 into account. While five minutes of culling would be helpful, it
> would not be enough to avoid all disruption.

I think you make a valid point. Would you be willing to prepare a change
proposal to address this? The document's authoritative source is on
github: https://github.com/bgpculling/draft-bgp-session-culling

> If the operator/peer would decide to cull the session, say, 30 minutes
> ahead of the maintenance, that would be great for me and others in my
> situation. For single-homed customers of an IP-transit provider, on the
> other hand, any amount of culling only contributes to making the outage
> longer than necessary. So maybe it should be a suggestion to make it
> possible to opt out of the culling behaviour.

The myriad of issues resulting from a network single-homing on a single
connection behind a single router might be out of scope for this
document.

Kind regards,

Job