Hi Jeff,

I've looked through several active BFD-related drafts, and the only one that might be relevant to the discussion is draft-mirsky-bfd-mpls-demand <https://datatracker.ietf.org/doc/draft-mirsky-bfd-mpls-demand/>. However, updating bfd.LocalDiag is not explicitly discussed there.

Regards,
Greg

On Wed, Feb 22, 2023 at 2:30 PM Jeffrey Haas <[email protected]> wrote:

> "Hold for update" was the expected outcome for my filing of the errata.
>
> At best, we're telling new implementors that there's an issue here of note in the protocol. The mail discussion will note that there are multiple existing implementations that have historically set the value to 0 when transitioning to Up.
>
> It'll also log some of what your thinking was at the time. Since RFC 5880 already talks about cases where the value is of interest (concatenated path, etc.) implementors already know to pay attention to the value even when state transitions aren't happening.
>
> Part of my own motivation to have the behavior clear has been other proposals we've seen come and go trying to use Diag as a much more rigorous mechanism to trigger behaviors. I had thought I could find a draft I saw during the mpls session at IETF 115 along these lines, but I appear to be mistaken. In any case, people keep wanting to use Diag for Clever Things and it'll bite them in unpleasant places if they do so.
>
> Greg may have recollection of the proposal I'm thinking about.
>
> -- Jeff
>
>
> > On Feb 22, 2023, at 2:34 PM, Dave Katz <[email protected]> wrote:
> >
> > Thanks for the background.
> >
> > I guess the fact of the matter is that, since the issue cannot affect interoperability, it’s hard to imagine getting the WG to go through a bunch of machinations to go out of its way to fix something that is entirely pedantic. In that case I think holding the erratum for update is the right choice. The erratum can describe the ambiguity and the WG can decide what to do about it if they find another reason to update the spec...
> >
> > —Dave
> >
> >
> >> On Feb 22, 2023, at 11:29 AM, John Scudder <[email protected]> wrote:
> >>
> >> Hi All,
> >>
> >> Regarding Jeff’s "Given the maturity of the feature, I'd suggest sticking to the reality on the ground”, I want to step in and remind folks about what our guidelines are for processing errata [1]. They are fairly narrow, by design. One of the high-order bits of the guidelines is, "Errata are meant to fix "bugs" in the specification and should not be used to change what the community meant when it approved the RFC.” What I take this to mean in the current context is, if the behavior specified in the RFC has been found to be ‘wrong’ (for whatever definition of ‘wrong’ you choose to apply), an erratum is definitively not the way to correct that. An erratum is to clarify or correct whatever the intent of the RFC was at time of publication. Of course, many RFCs (this one included, it seems) didn’t receive detailed scrutiny of every crevice of the spec before being declared ready for publication, and so it’s not always possible to really say that the consensus was firmly one way or the other. In such cases, I think we have to err on the side of what the words in the spec say.
> >>
> >> About the furthest I’d go in documenting that (part of) the WG now thinks the specified behavior is undesirable, is to note it in a ‘hold for document update’ erratum. That seems reasonable in this case — it can lay out the pros and cons and at least creates an artifact for future implementors to notice.
> >>
> >> The bottom line is that changes to specified behavior require WG and IETF consensus, and that means they require an RFC to update or obsolete the old behavior. This is one of the pointy bits of our process, that RFCs document the consensus at a moment in time, not the evolving consensus.
> >>
> >> Thanks,
> >>
> >> —John
> >>
> >> [1] https://www.ietf.org/about/groups/iesg/statements/processing-errata-ietf-stream/
> >>
> >>> On Feb 22, 2023, at 11:14 AM, Jeffrey Haas <[email protected]> wrote:
> >>>
> >>> Dave,
> >>>
> >>> Just as a reminder, the context for why this errata is being discussed is this inquiry:
> >>> https://mailarchive.ietf.org/arch/msg/rtg-bfd/YIeCo-nQicI_OIcVncYaJM5Zz6c/
> >>>
> >>> More below:
> >>>
> >>>> On Feb 17, 2023, at 12:04 PM, Dave Katz <dkatz=[email protected]> wrote:
> >>>>> On Feb 17, 2023, at 8:47 AM, Reshad Rahman <[email protected]> wrote:
> >>>>> Having the diag field as breadcrumb has been extremely useful indeed. But both ends can store last diag field sent/received, we don't have to keep sending the diag field after the failure has cleared. It seems odd to be sending a diag field which happened e.g. a year ago.
> >>>>
> >>>> That property helped me when debugging my implementation, as it survives the restart/reboot of the far end.
> >>>>
> >>>> There is also no timeout that would make sense; “forever, for small values of ‘forever'” is semantically consistent and the only thing that makes sense (to me, at least).
> >>>>
> >>>> Resetting it to zero only deletes information (albeit a tiny amount of it) and doesn’t help anything; you know that the session is up, so the diagnostic for its most recent transition to non-upness is disambiguated.
> >>>>
> >>>> Debugging broken things is a scramble for bits of data; leaving a breadcrumb is a polite gift.
> >>>
> >>> From my perspective, the breadcrumb is useful to note during the transitions, and not simply the transitions for the state. Examples have been given where the diag is updated as part of a state transition (governed by normative text in 5880), or transitions that may happen while the session remains up (e.g. concatenated path down, echo, etc.). The RFC isn't great about saying how you clear such things when the state is still Up; intuitively it's to return to "No Diagnostic".
> >>>
> >>> However, your own leaning, Dave, is "leave it set forever". Using the above examples for diag signaling an event while leaving the state up, I don't think you mean that.
> >>>
> >>> So, again, the interesting breadcrumbs are when Things Change. Each of these items is an edge transition of note. If I care about the event, I care about it when it happens and will remember it. I'm not going to look at diag to reflect this forever.
> >>>
> >>>>> Also the text in 6.8.1 says "The diagnostic code specifying the reason for the most recent change in the local session state.". To me that means resetting bfd.LocalDiag to 0 when the state changes to Up.
> >>>>
> >>>> Thus the language that needs fixing (I know, I wrote it...)
> >>>>
> >>>>> AFAIK IOS-XR and JunOS reset LocalDiag. It'd be good to hear from other implementations.
> >>>>
> >>>> Probably. I might have even coded it that way 20 years ago, or someone else did later, thus underscoring the largely-irrelevant nature of this discussion...
> >>>
> >>> I did confirm with Juniper's BFD developers that it's reset to 0 when we transition to Up.
> >>>
> >>> Given the maturity of the feature, I'd suggest sticking to the reality on the ground.
> >>>
> >>> -- Jeff
