Further to what I just sent, “is it possible to simply log a permanent erratum that explains the ambiguity and encourages people not to worry about it?”
Yep. That's basically what I had in mind with “note it in a ‘hold for document update’ erratum”. I don’t really expect anyone to write an Internet Draft to formally update RFC 5880, although I wouldn’t object to someone doing so if they wanted to do the work. The doc itself would probably just be a few pages (the email thread discussing it would inevitably be longer). But an erratum to at least say “some people do it one way, some do it another, that’s life” is worth having and I will gladly verify one.

--John

> On Feb 22, 2023, at 2:12 PM, Dave Katz <[email protected]> wrote:
>
> I guess the reality is that the diag field is a weak and ambiguous feature
> that I spent five minutes thinking about, and is ultimately (and happily) not
> subject to normative verification because it is write-only.
>
> The received value is edge-triggerable if you grab the value during the next
> session establishment phase (you get at least one packet from the other guy
> prior to the session coming Up), so I suppose one could change the language
> to zero the local value on the Up transition as suggested without losing the
> (mis)feature altogether. If that’s the case, I would add a sentence in the
> field description saying that the received diag value refers to the previous
> session when received in a packet bearing a state other than Up. Of course
> this means that the value received with Up is ambiguous, because it could be
> referring to the previous session or the current one, depending on how it was
> implemented, but that’s already the case.
>
> Or is it possible to simply log a permanent erratum that explains the
> ambiguity and encourages people not to worry about it? It’s already taken up
> more time than it is worth, considering that it cannot affect
> interoperability (and so I suppose never should have been normative, but we’d
> probably still be discussing it anyhow).
>
> Thanks,
>
> --Dave
>
>
>> On Feb 22, 2023, at 8:14 AM, Jeffrey Haas <[email protected]> wrote:
>>
>>
>> [External Email. Be cautious of content]
>>
>>
>> Dave,
>>
>> Just as a reminder, the context for why this errata is being discussed is
>> this inquiry:
>> https://mailarchive.ietf.org/arch/msg/rtg-bfd/YIeCo-nQicI_OIcVncYaJM5Zz6c/
>>
>> More below:
>>
>>
>>> On Feb 17, 2023, at 12:04 PM, Dave Katz
>>> <[email protected]> wrote:
>>>> On Feb 17, 2023, at 8:47 AM, Reshad Rahman <[email protected]> wrote:
>>>> Having the diag field as breadcrumb has been extremely useful indeed. But
>>>> both ends can store last diag field sent/received, we don't have to keep
>>>> sending the diag field after the failure has cleared. It seems odd to be
>>>> sending a diag field which happened e.g. a year ago.
>>>
>>> That property helped me when debugging my implementation, as it survives
>>> the restart/reboot of the far end.
>>>
>>> There is also no timeout that would make sense; “forever, for small values
>>> of ‘forever’” is semantically consistent and the only thing that makes
>>> sense (to me, at least).
>>>
>>> Resetting it to zero only deletes information (albeit a tiny amount of it)
>>> and doesn’t help anything; you know that the session is up, so the
>>> diagnostic for its most recent transition to non-upness is disambiguated.
>>>
>>> Debugging broken things is a scramble for bits of data; leaving a
>>> breadcrumb is a polite gift.
>>
>> From my perspective, the breadcrumb is useful to note during the
>> transitions, and not simply the transitions for the state. Examples have
>> been given where the diag is updated as part of a state transition (governed
>> by normative text in 5880), or transitions that may happen while the session
>> remains up (e.g. concatenated path down, echo, etc.). The RFC isn't great
>> about saying how you clear such things when the state is still Up;
>> intuitively it's to return to "No Diagnostic".
>>
>> However, your own leaning, Dave, is "leave it set forever". Using the above
>> examples for diag signaling an event while leaving the state up, I don't
>> think you mean that.
>>
>> So, again, the interesting breadcrumbs are when Things Change. Each of
>> these items is an edge transition of note. If I care about the event, I care
>> about it when it happens and will remember it. I'm not going to look at
>> diag to reflect this forever.
>>
>>>
>>>>
>>>> Also the text in 6.8.1 says "The diagnostic code specifying the reason for
>>>> the most recent change in the local session state.". To me that means
>>>> resetting bfd.LocalDiag to 0 when the state changes to Up.
>>>
>>> Thus the language that needs fixing (I know, I wrote it...)
>>>
>>>>
>>>> AFAIK IOS-XR and JunOS reset LocalDiag. It'd be good to hear from other
>>>> implementations.
>>>
>>> Probably. I might have even coded it that way 20 years ago, or someone
>>> else did later, thus underscoring the largely-irrelevant nature of this
>>> discussion...
>>
>> I did confirm with Juniper's BFD developers that it's reset to 0 when we
>> transition to Up.
>>
>> Given the maturity of the feature, I'd suggest sticking to the reality on
>> the ground.
>>
>> -- Jeff
>>
>
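
For anyone skimming the thread, the two behaviors being compared can be sketched roughly as below. This is only an illustration under stated assumptions, not text from RFC 5880 or code from any implementation: the `Session` class, the `reset_diag_on_up` knob, and `describe_received_diag` are hypothetical names invented here. The state and diagnostic numbers are the ones RFC 5880 assigns, and the receive-side rule follows Dave's suggestion that a non-zero diag received with a state other than Up refers to the previous session, while one received with Up is ambiguous.

```python
from dataclasses import dataclass
from enum import IntEnum


class State(IntEnum):
    # Session state values as assigned in RFC 5880.
    ADMIN_DOWN = 0
    DOWN = 1
    INIT = 2
    UP = 3


# Diagnostic codes as assigned in RFC 5880 (only the two used here).
NO_DIAGNOSTIC = 0
CONTROL_DETECTION_TIME_EXPIRED = 1


@dataclass
class Session:
    # Hypothetical policy knob, not part of RFC 5880:
    # True  -> zero bfd.LocalDiag on the transition to Up
    # False -> leave the last diagnostic in place as a breadcrumb
    reset_diag_on_up: bool
    state: State = State.DOWN
    local_diag: int = NO_DIAGNOSTIC

    def go_down(self, diag: int) -> None:
        # Record why the session left Up.
        self.state = State.DOWN
        self.local_diag = diag

    def go_up(self) -> None:
        self.state = State.UP
        if self.reset_diag_on_up:
            self.local_diag = NO_DIAGNOSTIC


def describe_received_diag(rx_state: State, rx_diag: int) -> str:
    # Receive-side reading of the diag field, per the suggestion in the thread.
    if rx_diag == NO_DIAGNOSTIC:
        return "no diagnostic"
    if rx_state != State.UP:
        return "previous session"
    # With Up, the sender may or may not have reset the value.
    return "ambiguous"
```

Either policy produces identical packets while the session is not Up; the two only diverge after the Up transition, which is why the receiver cannot tell them apart in that state.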
