I guess the reality is that the diag field is a weak and ambiguous feature that 
I spent five minutes thinking about, and is ultimately (and happily) not 
subject to normative verification because it is write-only.

The received value is edge-triggerable if you grab the value during the next 
session establishment phase (you get at least one packet from the other guy 
prior to the session coming Up), so I suppose one could change the language to 
zero the local value on the Up transition as suggested without losing the 
(mis)feature altogether.  If that’s the case, I would add a sentence in the 
field description saying that the received diag value refers to the previous 
session when received in a packet bearing a state other than Up.   Of course 
this means that the value received with Up is ambiguous, because it could be 
referring to the previous session or the current one, depending on how it was 
implemented, but that’s already the case.
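
To make the proposal concrete, here is a minimal sketch of that behavior, not taken from any real implementation. It assumes the local diag is zeroed only on the transition to Up and is left alone on Down-to-Init, so that the peer still sees it during establishment. The Session class, its transition() and on_receive() methods, and the prev_remote_diag field are hypothetical names used for illustration; only the state and diag code values come from RFC 5880.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class State(IntEnum):
    # Session state values as defined in RFC 5880.
    ADMIN_DOWN = 0
    DOWN = 1
    INIT = 2
    UP = 3


# Diagnostic codes as defined in RFC 5880.
NO_DIAGNOSTIC = 0
CONTROL_DETECTION_TIME_EXPIRED = 1


@dataclass
class Session:
    state: State = State.DOWN
    local_diag: int = NO_DIAGNOSTIC          # bfd.LocalDiag in RFC 5880 terms
    prev_remote_diag: Optional[int] = None   # hypothetical breadcrumb store

    def transition(self, new_state: State, diag: Optional[int] = None) -> None:
        """Change state; zero the local diag only when the session comes Up."""
        if new_state == State.UP:
            self.local_diag = NO_DIAGNOSTIC
        elif diag is not None:
            # A reason was supplied (e.g. for a transition to Down): record it.
            self.local_diag = diag
        # Otherwise (e.g. Down -> Init) the old diag is kept, so the peer
        # still receives it in packets sent before the session is Up.
        self.state = new_state

    def on_receive(self, remote_state: State, remote_diag: int) -> bool:
        """Return True if the received diag describes the previous session."""
        if remote_state != State.UP:
            # Peer is not Up yet, so its diag can only refer to the old session.
            self.prev_remote_diag = remote_diag
            return True
        # With state Up the value is ambiguous, so it is not recorded here.
        return False
```

The receiver grabs the breadcrumb from any packet whose state is not Up and ignores the ambiguous value that arrives with Up, which is the edge-triggered reading described above.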

Or is it possible to simply log a permanent erratum that explains the ambiguity 
and encourages people not to worry about it?  It’s already taken up more time 
than it is worth, considering that it cannot affect interoperability (and so I 
suppose never should have been normative, but we’d probably still be discussing 
it anyhow).

Thanks,

—Dave


On Feb 22, 2023, at 8:14 AM, Jeffrey Haas wrote:




Dave,

Just as a reminder, the context for why this erratum is being discussed is this 
inquiry:
https://mailarchive.ietf.org/arch/msg/rtg-bfd/YIeCo-nQicI_OIcVncYaJM5Zz6c/

More below:


On Feb 17, 2023, at 12:04 PM, Dave Katz wrote:
On Feb 17, 2023, at 8:47 AM, Reshad Rahman wrote:
Having the diag field as a breadcrumb has been extremely useful indeed. But both 
ends can store the last diag field sent/received; we don't have to keep sending 
the diag field after the failure has cleared. It seems odd to be sending a diag 
value for a failure that happened, e.g., a year ago.

That property helped me when debugging my implementation, as it survives the 
restart/reboot of the far end.

There is also no timeout that would make sense; “forever, for small values of 
‘forever’” is semantically consistent and the only thing that makes sense (to 
me, at least).

Resetting it to zero only deletes information (albeit a tiny amount of it) and 
doesn’t help anything;  you know that the session is up, so the diagnostic for 
its most recent transition to non-upness is disambiguated.

Debugging broken things is a scramble for bits of data;  leaving a breadcrumb 
is a polite gift.

From my perspective, the breadcrumb is useful to note during the transitions, 
and not simply the transitions for the state.  Examples have been given where 
the diag is updated as part of a state transition (governed by normative text 
in 5880), or transitions that may happen while the session remains up (e.g. 
concatenated path down, echo, etc.).  The RFC isn't great about saying how you 
clear such things when the state is still Up; intuitively it's to return to "No 
Diagnostic".

However, your own leaning, Dave, is "leave it set forever".  Using the above 
examples for diag signaling an event while leaving the state up, I don't think 
you mean that.

So, again, the interesting breadcrumbs are when Things Change.  Each of these 
items is an edge transition of note. If I care about the event, I care about it 
when it happens and will remember it.  I'm not going to look at diag to reflect 
this forever.



Also, the text in 6.8.1 says "The diagnostic code specifying the reason for the 
most recent change in the local session state." To me that means resetting 
bfd.LocalDiag to 0 when the state changes to Up.

Thus the language that needs fixing (I know, I wrote it...)


AFAIK IOS-XR and JunOS reset LocalDiag. It'd be good to hear from other 
implementations.

Probably.  I might have even coded it that way 20 years ago, or someone else 
did later, thus underscoring the largely-irrelevant nature of this discussion...

I did confirm with Juniper's BFD developers that it's reset to 0 when we 
transition to Up.

Given the maturity of the feature, I'd suggest sticking to the reality on the 
ground.

-- Jeff

