Hi Jeff,

> That said, Robert, there's room for you to work on that if you want to
> kick off a draft on the topic.

Thx for the hint, but I do not think this extension should be done in BFD,
for three reasons:

Reason 1 - BFD works well to quickly detect failures. Loading more
functionality onto it compromises that. Moreover, other vendors already
ship tools that can detect issues caused by changes in path MTU, for
example:

https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/ipsla/configuration/xe-16-6/sla-xe-16-6-book/sla-icmp-pathecho.html
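To illustrate that such detection needs nothing from the far end, here is a
rough Python sketch of a unilateral probe (Linux-only socket options; the
peer address, probe port and MTU value are placeholders, and this is of
course not the IP SLA feature itself):

import errno
import socket

# Linux-specific socket options that force the DF bit on a UDP socket.
IP_MTU_DISCOVER = 10
IP_PMTUDISC_DO = 2

def path_carries_mtu(peer: str, port: int, mtu: int) -> bool:
    """Send a full-size, DF-marked UDP datagram toward the peer.

    EMSGSIZE from the kernel means the cached path MTU has dropped below
    'mtu', i.e. some hop stopped carrying packets of that size.
    """
    payload = b"\x00" * (mtu - 28)  # 20-byte IPv4 header + 8-byte UDP header
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    try:
        s.sendto(payload, (peer, port))
        return True
    except OSError as e:
        if e.errno == errno.EMSGSIZE:
            return False
        raise
    finally:
        s.close()

print(path_carries_mtu("192.0.2.1", 33434, 1500))  # placeholder peer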
Reason 2 - The draft states that the idea is to automate the use of this
extension by client protocols. I do not agree with such a deployment model
for this enhancement. At most, if the frequency of MTU probing were
100-1000 times lower than that of up/down link detection, it would serve
its purpose - yet there is no word in the draft about such an option.
Essentially, instead of replacing the current tiny BFD packets, one could
run bfd-large as a different session with completely different timers,
maybe even end to end instead of link by link.
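What I have in mind is roughly the following (a sketch only - the
intervals, the 200x ratio and the fake payloads are all made up for
illustration; this is scheduling logic, not a BFD implementation):

import asyncio

LIVENESS_INTERVAL = 0.05   # 50 ms: tiny packets, fast up/down detection
MTU_PROBE_INTERVAL = 10.0  # 200x less frequent: padded, full-size probes

async def liveness_session(send):
    while True:
        send(b"\x20\x40\x03\x18")      # tiny control packet (fake bytes)
        await asyncio.sleep(LIVENESS_INTERVAL)

async def mtu_probe_session(send, mtu=1500):
    while True:
        send(b"\x00" * (mtu - 28))     # padded probe sized to the full MTU
        await asyncio.sleep(MTU_PROBE_INTERVAL)

async def main(send):
    # Two independent sessions with completely different timers; losing the
    # large-packet session need not tear down fast liveness detection.
    await asyncio.gather(liveness_session(send),
                         mtu_probe_session(send))

# asyncio.run(main(print))  # runs forever; illustration only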
Reason 3 - As we know, BFD is very often used between ASes. How do you
signal over a p2p link the willingness to now encapsulate BFD in padded
UDP? Email? Phone? Text? Note that with the mentioned ICMP pathecho I can
seamlessly detect an MTU issue on the link to my peer without telling
anyone or asking for support from the other side.

Thx,
Robert.

On Thu, Oct 3, 2019 at 9:34 PM Jeffrey Haas <[email protected]> wrote:
> On Tue, Oct 01, 2019 at 11:11:13PM -0000, Albert Fu (BLOOMBERG/ 120 PARK)
> wrote:
> > There are well known cases, including those you mentioned, where BFD
> > has limitations in deterministically detecting data plane issues, and
> > these are not specific to the BFD Large Packet draft. I am a novice to
> > the IETF process, and not sure if we need to mention them here, but
> > shall discuss with Jeff whether it is worth highlighting them.
>
> It's reasonable to make note of issues where common operational scenarios
> will complicate the solution. But it's not up to a draft carried on top
> of an RFC with that core issue to try to solve the issue in that core
> RFC.
>
> So, trying to solve "BFD doesn't work perfectly in the presence of LAGs"
> in bfd-large is the wrong place to do it. :-)
>
> That said, Robert, there's room for you to work on that if you want to
> kick off a draft on the topic.
>
> > > We won't have control over how the Provider maps our traffic
> > > (BFD/data).
> >
> > Well of course you do :) Just imagine if your BFD packets (in a set
> > equal to the configured multiplier) would start using random UDP
> > source ports, which would then be mapped to different ECMP buckets
> > along the way in the provider's underlay?
>
> And that's an example of the possible solution space for such a draft on
> the underlying issue.
>
> That said, LAG fan-out issues are a massive operational pain. While it's
> likely that varying L3 PDU fields for entropy to distribute traffic
> across the LAG may work (and we have any number of customers who rely on
> this for UDP especially), it starts getting very problematic when you
> have multiple LAGs in a path. I have a vague memory that someone had
> started some discussions with IEEE to try to figure out what OAM
> mechanisms would look like for such scenarios, but that's very much out
> of normal BFD scope.
>
> -- Jeff
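P.S. To make the "random UDP source port" idea quoted above a bit more
concrete, a rough Python sketch (the peer address is a placeholder and the
payload is obviously not a real BFD PDU; a real implementation would live
in the router's BFD engine, not in a host script):

import random
import socket

PEER = ("192.0.2.1", 3784)  # placeholder peer; 3784 = BFD single-hop port
DETECT_MULT = 3             # one probe per configured multiplier

def send_probe_set(payload: bytes) -> None:
    # Each probe gets a different source port from the range RFC 5881
    # reserves for BFD (49152-65535), so a 5-tuple hash in the provider
    # underlay can map each one to a different ECMP/LAG bucket.
    for port in random.sample(range(49152, 65536), DETECT_MULT):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            s.bind(("", port))  # may fail if the port happens to be in use
            s.sendto(payload, PEER)
        finally:
            s.close()

send_probe_set(b"not-a-real-bfd-pdu")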