Re: Mirja Kühlewind's Discuss on draft-ietf-bfd-seamless-base-09: (with DISCUSS)

Mirja Kuehlewind (IETF) Wed, 04 May 2016 09:10:47 -0700

Hi Carlos,

> On 04.05.2016 at 17:44, Carlos Pignataro (cpignata) 
> <[email protected]> wrote:
> 
> Hi, Mirja,
> 
>> On May 4, 2016, at 11:29 AM, Mirja Kuehlewind (IETF) <[email protected]> 
>> wrote:
>> 
>> Hi Carlos,
>> 
>>> On 04.05.2016 at 17:13, Carlos Pignataro (cpignata) 
>>> <[email protected]> wrote:
>>> 
>>> Hi, Mirja,
>>> 
>>>> On May 4, 2016, at 10:41 AM, Mirja Kuehlewind (IETF) <[email protected]> 
>>>> wrote:
>>>> 
>>>> Hi Carlos,
>>>> 
>>>> below.
>>>> 
>>>>> On 04.05.2016 at 16:33, Carlos Pignataro (cpignata) 
>>>>> <[email protected]> wrote:
>>>>> 
>>>>> Thanks much for the response, Mirja!
>>>>> 
>>>>> I think we are converging, please see inline.
>>>>> 
>>>>>> On May 4, 2016, at 10:13 AM, Mirja Kuehlewind (IETF) 
>>>>>> <[email protected]> wrote:
>>>>>> 
>>>>>> Hi Carlos,
>>>>>> 
>>>>>> see below.
>>>>>> 
>>>>>>> On 03.05.2016 at 19:24, Carlos Pignataro (cpignata) 
>>>>>>> <[email protected]> wrote:
>>>>>>> 
>>>>>>> Hi, Mirja,
>>>>>>> 
>>>>>>>> On May 3, 2016, at 12:31 PM, Mirja Kuehlewind (IETF) 
>>>>>>>> <[email protected]> wrote:
>>>>>>>> 
>>>>>>>> Hi Carlos,
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On 03.05.2016 at 15:40, Carlos Pignataro (cpignata) 
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> Hi, Mirja,
>>>>>>>>> 
>>>>>>>>> What is an uncontrolled packet in an IP network, and what entity 
>>>>>>>>> controls controlled ones? :-)
>>>>>>>> 
>>>>>>>> Questions upon questions… :-)
>>>>>>>> 
>>>>>>>> See below...
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> More seriously, please see inline.
>>>>>>>>> 
>>>>>>>>>> On May 3, 2016, at 5:35 AM, Mirja Kuehlewind <[email protected]> 
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Mirja Kühlewind has entered the following ballot position for
>>>>>>>>>> draft-ietf-bfd-seamless-base-09: Discuss
>>>>>>>>>> 
>>>>>>>>>> When responding, please keep the subject line intact and reply to all
>>>>>>>>>> email addresses included in the To and CC lines. (Feel free to cut this
>>>>>>>>>> introductory paragraph, however.)
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Please refer to 
>>>>>>>>>> https://www.ietf.org/iesg/statement/discuss-criteria.html
>>>>>>>>>> for more information about IESG DISCUSS and COMMENT positions.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> The document, along with other ballot positions, can be found here:
>>>>>>>>>> https://datatracker.ietf.org/doc/draft-ietf-bfd-seamless-base/
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> ----------------------------------------------------------------------
>>>>>>>>>> DISCUSS:
>>>>>>>>>> ----------------------------------------------------------------------
>>>>>>>>>> 
>>>>>>>>>> As S-BFD no longer has an initiation process, it is not guaranteed that
>>>>>>>>>> the receiver/responder actually exists. That means that packets could
>>>>>>>>>> float (uncontrolled) in the network or even outside of the administrative
>>>>>>>>>> domain (e.g. due to configuration mistakes). From my point of view this
>>>>>>>>>> document should recommend/require two things:
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> We have called out the misconfiguration — however:
>>>>>>>>> 
>>>>>>>>>> 1) A maximum number of S-BFD packets that are allowed to be sent without
>>>>>>>>>> getting a response (maybe leading to a local error report).
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> This can result in a deadlock situation if an S-BFD Reflector is 
>>>>>>>>> enabled much later. I’m very hesitant to cap the packets sent. We 
>>>>>>>>> can say, and I think it is useful, that an implementation MAY log an 
>>>>>>>>> error after multiple timeouts.
>>>>>>>> 
>>>>>>>> Okay, I understand that a hard limit probably does not make sense. An 
>>>>>>>> error log definitely seems useful.
>>>>>>> 
>>>>>>> OK, that sounds good. See below.
>>>>>>> 
>>>>>>>> Another proposal for consideration: Currently the draft says an 
>>>>>>>> initiator should only send one packet per second if the target is in 
>>>>>>>> ADMINDOWN state. In this case the state is explicitly announced. 
>>>>>>>> However, if the other end just disappears or was never/not yet there, 
>>>>>>>> one could use an exponential back-off instead, starting with an 
>>>>>>>> interval smaller than one second and then increasing it exponentially. 
>>>>>>>> Just an idea...
>>>>>>> 
>>>>>>> Thanks for the proposal. Please keep in mind, however, that this is a 
>>>>>>> protocol for detecting liveness (and lack of it), so increasing the 
>>>>>>> interval exponentially defeats the purpose.
>>>>>>> 
>>>>>>> Further, exponential back off may not be the best choice when 
>>>>>>> interacting with routing protocols.
>>>>>>> 
>>>>>>> What we currently say is:
>>>>>>>  The criteria for declaring loss of
>>>>>>>  reachability and the action that would be triggered as a result
>>>>>>>  are outside the scope of this document.
>>>>>>> 
>>>>>>> As many of these are implementation choices.
>>>>>>> 
>>>>>>> But we can add at the end “, and MAY include logging an error.”
>>>>>> 
>>>>>> Please do so.
>>>>> 
>>>>> Done.
>>>>> 
>>>>>> 
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> 2) Egress filtering at the administrative border of the domain that uses
>>>>>>>>>> S-BFD to make sure that no S-BFD packets leave the domain.
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> This is no different than any other application that uses UDP; a 
>>>>>>>>> misconfigured DNS server will result in the same, and a traceroute is 
>>>>>>>>> also not too different. This seems too onerous of a requirement. An 
>>>>>>>>> administrative domain filters at ingress.
>>>>>>>> 
>>>>>>>> First of all, just because other protocols might have such a problem, 
>>>>>>>> that does not mean it’s okay.
>>>>>>> 
>>>>>>> I agree with this. I had a different point in mind though — trying to 
>>>>>>> specify this on every UDP application might not be the most effective 
>>>>>>> way. Perhaps there’s a UDP guideline you are uncovering.
>>>>>>> 
>>>>>>>> However, correct me if I’m wrong, but here the situation seems 
>>>>>>>> slightly different because there is no termination criterion at all, 
>>>>>>>> which means an S-BFD node would send useless data forever (… until a 
>>>>>>>> manual change of the config).
>>>>>>>> 
>>>>>>> 
>>>>>>> But as far as this doc is concerned, let me try to make some 
>>>>>>> clarifications (and a correction).
>>>>>>> 
>>>>>>> There are termination criteria — the document says:
>>>>>>> 
>>>>>>> An SBFDInitiator may be a persistent session on the initiator with a
>>>>>>> timer for S-BFD control packet transmissions (stateful
>>>>>>> SBFDInitiator).  An SBFDInitiator may also be a module, a script or a
>>>>>>> tool on the initiator that transmits one or more S-BFD control
>>>>>>> packets "when needed" (stateless SBFDInitiator).
>>>>>>> 
>>>>>>> For the case in which you have a “when needed” SBFDInitiator, the 
>>>>>>> workflow is like a “ping”.
>>>>>>> 
>>>>>>> For the case in which you have a “persistent” SBFDInitiator, in theory 
>>>>>>> this can run forever (for some value of ever). However, please don’t 
>>>>>>> lose track of why this protocol exists. Having OAM failures and doing 
>>>>>>> nothing about them defeats the purpose of having OAM. Meaning, a red 
>>>>>>> alarm will blink, a honk will horn, and the config state will be 
>>>>>>> changed (manually or by some support system).
>>>>>>> 
>>>>>>> In other words, I do not think this is such a unique case (insofar as 
>>>>>>> running ad infinitum).
>>>>>> 
>>>>>> I still believe that the case where you have a misconfiguration and the 
>>>>>> initiator sends packets (forever) but never ever gets a reply (and has 
>>>>>> never seen a reply in the past) is a different case and can be detected 
>>>>>> and handled separately.
>>>>>> 
>>>>> 
>>>>> Again, I would not qualify this as ‘forever’, but I understand what you 
>>>>> mean.
>>>>> 
>>>>>>> 
>>>>>>>> I still believe that egress filtering would be more appropriate here 
>>>>>>>> (than ingress) because the domain that is using S-BFD knows about it 
>>>>>>>> and therefore can set up the respective filters and should not spam 
>>>>>>>> others while hoping that filters are in place.
>>>>>>>> 
>>>>>>> 
>>>>>>> To me, there is a not insignificant operational complexity in requiring 
>>>>>>> the addition of filters throughout, for one particular application not 
>>>>>>> leaking (where the leak does not cause anything special), and when the 
>>>>>>> leak might happen because of a misconfiguration (or bug) but will be 
>>>>>>> detected by the operational support systems. The ROI does not seem to 
>>>>>>> add up.
>>>>>> 
>>>>>> Okay, the document should probably not require egress filtering in any 
>>>>>> case, but what about saying something like:
>>>>>> 
>>>>>> “If S-BFD is used it SHOULD be ensured that S-BFD control packets do not 
>>>>>> propagate outside of the administrative domain that uses it.”
>>>>>> 
>>>>> 
>>>>> We can add an additional explanation of the problem before a statement, 
>>>>> but I do not think that particular SHOULD is actionable. How’s something 
>>>>> like:
>>>>> 
>>>>> Explain that without handshake, a persistent initiator can send blindly, 
>>>>> to then add “In such case, operational measures SHOULD be taken to 
>>>>> identify if S-BFD packets are not responded to for an extended period of 
>>>>> time, and remediate the situation”
>>>>> 
>>>>>> This is not an uncommon thing to specify also for other protocols.
>>>>>> 
>>>>> 
>>>>> I agree. Frankly, I am happy with either statement, but I think the 
>>>>> latter might be more operationally actionable.
>>>>> 
>>>>> Preference?
>>>> 
>>>> I would still prefer something along the lines of what I proposed. I think 
>>>> there could effectively be different actions to take here, e.g. egress 
>>>> filtering or measurement to detect failure, as well as no action if for 
>>>> some other reason it can be ensured that such a misconfiguration cannot 
>>>> happen (or is at least very unlikely to happen), e.g. because things are 
>>>> automated and there are additional checks before applying a config.
>>>> 
>>> 
>>> Perhaps I can add “for an extended period of time” to the first statement 
>>> (or similar wording of your liking)?
>>> 
>>> Your main concern is the “forever”. Let’s ensure it is not “forever”. 
>>> However, I’m concerned that a single packet out (like a ping to the wrong 
>>> address) will be violating “it SHOULD be ensured that S-BFD control packets 
>>> do not propagate outside”.
>> 
>> The concern is not “forever” but putting (unnecessary) load on other networks 
>> (by accident). So I agree, one or a few packets are not a problem. So yes, 
>> adding “for an extended period of time” is fine. We could also/instead 
>> exchange the word “ensure” for something else (maybe “control”…?).
>> 
> 
> These two changes would certainly work. 
> 
> Thank you. We will post a new rev today.
> 
> [I still think that a few packets are not “(unnecessary) load” for an IP 
> device, it’s really not different than doing a traceroute and getting an 
> icmp.unreach port unreachable (or if it is critical and unwelcome load for a 
> device, those devices are protected at ingress at their border).


I do agree, but you never know how people might (mis)use things in the future...

> 
> But in any case, I do think that explaining the problem you highlight helps 
> and improves the doc, and the new text on what to do does not hurt.]

Thanks. I’ll clear my discuss now and will have a look at the new version next 
week!

Mirja


> 
> Thanks,
> 
> — Carlos.
> 
>> Mirja
>> 
>> 
>> 
>>> 
>>> Would that work?
>>> 
>>> Thanks,
>>> 
>>> — Carlos.
>>> 
>>>> The second SHOULD that you proposed is from my point of view actually an 
>>>> additional point that I would also be happy to see in the doc.
>>>> 
>>>> Mirja
>>>> 
>>>> 
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> — Carlos.
>>>>> 
>>>>>> Mirja
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> Does the explanation of the termination criteria help?
>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Seems to me the logging will alert someone/something to take action, 
>>>>>>>>> and should be enough.
>>>>>>>> 
>>>>>>>> Logging plus alerts is definitely a good thing.
>>>>>>>> 
>>>>>>> 
>>>>>>> I agree.
>>>>>>> 
>>>>>>> Will add “, and MAY include logging an error.” as per above.
>>>>>>> 
>>>>>>> Do you think we should expand on this somewhere else in the document?
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> — Carlos.
>>>>>>> 
>>>>>>>> Mirja
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thoughts?
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> 
>>>>>>>>> — Carlos.
>
