Hi, Mirja,

> On May 4, 2016, at 11:29 AM, Mirja Kuehlewind (IETF) <[email protected]>
> wrote:
>
> Hi Carlos,
>
>> On 04.05.2016, at 17:13, Carlos Pignataro (cpignata)
>> <[email protected]> wrote:
>>
>> Hi, Mirja,
>>
>>> On May 4, 2016, at 10:41 AM, Mirja Kuehlewind (IETF) <[email protected]>
>>> wrote:
>>>
>>> Hi Carlos,
>>>
>>> below.
>>>
>>>> On 04.05.2016, at 16:33, Carlos Pignataro (cpignata)
>>>> <[email protected]> wrote:
>>>>
>>>> Thanks much for the response, Mirja!
>>>>
>>>> I think we are converging, please see inline.
>>>>
>>>>> On May 4, 2016, at 10:13 AM, Mirja Kuehlewind (IETF)
>>>>> <[email protected]> wrote:
>>>>>
>>>>> Hi Carlos,
>>>>>
>>>>> see below.
>>>>>
>>>>>> On 03.05.2016, at 19:24, Carlos Pignataro (cpignata)
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> Hi, Mirja,
>>>>>>
>>>>>>> On May 3, 2016, at 12:31 PM, Mirja Kuehlewind (IETF)
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi Carlos,
>>>>>>>
>>>>>>>
>>>>>>>> On 03.05.2016, at 15:40, Carlos Pignataro (cpignata)
>>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi, Mirja,
>>>>>>>>
>>>>>>>> What is an uncontrolled packet in an IP network, and what entity
>>>>>>>> controls controlled ones? :-)
>>>>>>>
>>>>>>> Questions over questions… :-)
>>>>>>>
>>>>>>> See below...
>>>>>>>
>>>>>>>>
>>>>>>>> More seriously, please see inline.
>>>>>>>>
>>>>>>>>> On May 3, 2016, at 5:35 AM, Mirja Kuehlewind <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Mirja Kühlewind has entered the following ballot position for
>>>>>>>>> draft-ietf-bfd-seamless-base-09: Discuss
>>>>>>>>>
>>>>>>>>> When responding, please keep the subject line intact and reply to all
>>>>>>>>> email addresses included in the To and CC lines. (Feel free to cut
>>>>>>>>> this introductory paragraph, however.)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Please refer to
>>>>>>>>> https://www.ietf.org/iesg/statement/discuss-criteria.html
>>>>>>>>> for more information about IESG DISCUSS and COMMENT positions.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The document, along with other ballot positions, can be found here:
>>>>>>>>> https://datatracker.ietf.org/doc/draft-ietf-bfd-seamless-base/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ----------------------------------------------------------------------
>>>>>>>>> DISCUSS:
>>>>>>>>> ----------------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> As S-BFD has no initiation process anymore it is not guaranteed that
>>>>>>>>> the receiver/responder actually exists. That means that packets could
>>>>>>>>> float (uncontrolled) in the network or even outside of the
>>>>>>>>> administrative domain (e.g. due to configuration mistakes). From my
>>>>>>>>> point of view this document should recommend/require two things:
>>>>>>>>>
>>>>>>>>
>>>>>>>> We have called out the misconfiguration; however:
>>>>>>>>
>>>>>>>>> 1) A maximum number of S-BFD packets that are allowed to be sent
>>>>>>>>> without getting a response (maybe leading to a local error report).
>>>>>>>>>
>>>>>>>>
>>>>>>>> This can result in a deadlock situation, if an S-BFD Reflector is
>>>>>>>> enabled much later. I’m very hesitant to cap the packets sent. We can,
>>>>>>>> and I think it is useful, say that an implementation MAY log an error
>>>>>>>> after multiple timeouts.
>>>>>>>
>>>>>>> Okay, I understand that a hard limit probably does not make sense. An
>>>>>>> error log seems definitely useful.
>>>>>>
>>>>>> OK, that sounds good. See below.
>>>>>>
>>>>>>> Another proposal for consideration: Currently the draft says an
>>>>>>> initiator should only send one packet per second if the target is in
>>>>>>> ADMINDOWN state. In this case the state is explicitly announced.
>>>>>>> However, if the other end just disappears or was never/not yet there,
>>>>>>> one could use an exponential back off instead, starting with an
>>>>>>> interval smaller than one second but then increasing it exponentially.
>>>>>>> Just an idea...
>>>>>>
>>>>>> Thanks for the proposal.
>>>>>> Please keep in mind, however, that this is a
>>>>>> protocol for detecting liveness (and lack of it), so increasing
>>>>>> exponentially defeats the purpose.
>>>>>>
>>>>>> Further, exponential back off may not be the best choice when
>>>>>> interacting with routing protocols.
>>>>>>
>>>>>> What we currently say is:
>>>>>>
>>>>>>    The criteria for declaring loss of
>>>>>>    reachability and the action that would be triggered as a result
>>>>>>    are outside the scope of this document.
>>>>>>
>>>>>> Many of these are implementation choices.
>>>>>>
>>>>>> But we can add at the end “, and MAY include logging an error.”
>>>>>
>>>>> Please do so.
>>>>
>>>> Done.
>>>>
>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>> 2) Egress filtering at the administrative border of the domain that
>>>>>>>>> uses S-BFD to make sure that no S-BFD packets leave the domain.
>>>>>>>>>
>>>>>>>>
>>>>>>>> This is no different than any other application that uses UDP; a
>>>>>>>> misconfigured DNS server will result in the same, and a traceroute is
>>>>>>>> also not too different. This seems too onerous of a requirement. An
>>>>>>>> administrative domain filters at ingress.
>>>>>>>
>>>>>>> First of all, just because other protocols might have such a problem,
>>>>>>> that does not mean it’s okay.
>>>>>>
>>>>>> I agree with this. I had a different point in mind though: trying to
>>>>>> specify this on every UDP application might not be the most effective
>>>>>> way. Perhaps there’s a UDP guideline you are uncovering.
>>>>>>
>>>>>>> However, correct me if I’m wrong, but here the situation seems
>>>>>>> slightly different because there is no termination criterion at all,
>>>>>>> which means an S-BFD node would send useless data forever (… until
>>>>>>> manual change of the config).
>>>>>>>
>>>>>>
>>>>>> But as far as this doc is concerned, let me try to make some
>>>>>> clarifications (and a correction).
>>>>>>
>>>>>> There are termination criteria; the document says:
>>>>>>
>>>>>>    An SBFDInitiator may be a persistent session on the initiator with a
>>>>>>    timer for S-BFD control packet transmissions (stateful
>>>>>>    SBFDInitiator). An SBFDInitiator may also be a module, a script or a
>>>>>>    tool on the initiator that transmits one or more S-BFD control
>>>>>>    packets "when needed" (stateless SBFDInitiator).
>>>>>>
>>>>>> For the case in which you have a “when needed” SBFDInitiator, the
>>>>>> workflow is like a “ping”.
>>>>>>
>>>>>> For the case in which you have a “persistent” SBFDInitiator, in theory
>>>>>> this can run forever (for some value of ever). However, please don’t
>>>>>> lose track of why this protocol exists. Having OAM failures and doing
>>>>>> nothing about them defeats the purpose of having OAM. Meaning, a red
>>>>>> alarm will blink, a honk will horn, and the config state will be changed
>>>>>> (manually or by some support system).
>>>>>>
>>>>>> In other words, I do not think this is such a unique case (insofar as
>>>>>> running ad infinitum).
>>>>>
>>>>> I still believe that the case where you have a misconfiguration and the
>>>>> initiator sends packets (forever) but never ever gets a reply (and never
>>>>> has seen a reply in the past), is a different case and can be detected
>>>>> and handled separately.
>>>>>
>>>>
>>>> Again, I would not qualify this as ‘forever’, but I understand what you
>>>> mean.
>>>>
>>>>>>
>>>>>>> I still believe that egress filtering would be more appropriate here
>>>>>>> (than ingress) because the domain that is using S-BFD knows about it
>>>>>>> and therefore can set up the respective filters and should not spam
>>>>>>> others while hoping that filters are in place.
>>>>>>>
>>>>>>
>>>>>> To me, there is a not insignificant operational complexity in requiring
>>>>>> the addition of filters throughout, for one particular application not
>>>>>> leaking (where the leak does not cause anything special), and when the
>>>>>> leak might happen because of a misconfiguration (or bug) but will be
>>>>>> detected by the operational support systems. The ROI does not seem to
>>>>>> add up.
>>>>>
>>>>> Okay, the document should probably not require egress filtering in any
>>>>> case, but what about saying something like:
>>>>>
>>>>> “If S-BFD is used it SHOULD be ensured that S-BFD control packets do not
>>>>> propagate outside of the administrative domain that uses it.”
>>>>>
>>>>
>>>> We can add an additional explanation of the problem before a statement,
>>>> but I do not think that particular SHOULD is actionable. How’s something
>>>> like:
>>>>
>>>> Explain that without handshake, a persistent initiator can send blindly,
>>>> to then add “In such case, operational measures SHOULD be taken to
>>>> identify if S-BFD packets are not responded to for an extended period of
>>>> time, and remediate the situation”
>>>>
>>>>> This is not an uncommon thing to specify also for other protocols.
>>>>>
>>>>
>>>> I agree. Frankly, I am happy with either statement, but I think the latter
>>>> might be more operationally actionable.
>>>>
>>>> Preference?
>>>
>>> I still would prefer something along the lines of what I proposed. I think
>>> there could effectively be different actions to be taken here, e.g. egress
>>> filtering or measurement to detect failure, as well as no action if for
>>> some other reason it can be ensured that such a misconfiguration cannot
>>> happen (or is at least very unlikely to happen), e.g. because things are
>>> automated and there are additional checks before applying a config.
>>>
>>
>> Perhaps I can add “for an extended period of time” to the first statement
>> (or similar wording of your liking)?
>>
>> Your main concern is the “forever”. Let’s ensure it is not “forever”.
>> However, I’m concerned that a single packet out (like a ping to the wrong
>> address) will be violating “it SHOULD be ensured that S-BFD control packets
>> do not propagate outside”
>
> The concern is not “forever” but putting (unnecessary) load on other networks
> (by accident). So I agree, one or a few packets is not a problem. So yes,
> adding “for an extended period of time” is fine. We could also/instead
> replace the word “ensure” with something else (maybe “control”…?).
>
These two changes would certainly work. Thank you. We will post a new rev
today.

[I still think that a few packets are not “(unnecessary) load” for an IP
device; it’s really not different than doing a traceroute and getting an
icmp.unreach port unreachable (or if it is critical and unwelcome load for a
device, those devices are protected at ingress at their border). But in any
case, I do think that explaining the problem you highlight helps and improves
the doc, and the new text on what to do does not hurt.]

Thanks,

-- Carlos.

> Mirja
>
>
>
>>
>> Would that work?
>>
>> Thanks,
>>
>> -- Carlos.
>>
>>> The second SHOULD that you proposed is from my point of view actually an
>>> additional point that I would also be happy to see in the doc.
>>>
>>> Mirja
>>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>> -- Carlos.
>>>>
>>>>> Mirja
>>>>>
>>>>>
>>>>>>
>>>>>> Does the explanation of the termination criteria help?
>>>>>>
>>>>>>>>
>>>>>>>> Seems to me the logging will alert someone/something to take action,
>>>>>>>> and should be enough.
>>>>>>>
>>>>>>> Logging plus alerts is definitely a good thing.
>>>>>>>
>>>>>>
>>>>>> I agree.
>>>>>>
>>>>>> Will add “, and MAY include logging an error.” as per above.
>>>>>>
>>>>>> Do you think we should expand on this somewhere else in the document?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> -- Carlos.
>>>>>>
>>>>>>> Mirja
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> -- Carlos.
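
For illustration only, here is a minimal sketch of the behaviour the thread converges on: a persistent SBFDInitiator is never capped or stopped, but an error is logged once no reply has been seen for an extended period of time. The class name, the 30-second threshold, and the logger name are assumptions made for this example; none of them come from draft-ietf-bfd-seamless-base, which leaves these choices to implementations.

```python
import logging
from dataclasses import dataclass
from typing import Optional

log = logging.getLogger("sbfd.initiator")  # hypothetical logger name


@dataclass
class UnansweredMonitor:
    """Tracks how long a persistent initiator has gone without any reply.

    Hypothetical helper: names and thresholds are illustrative only.
    """
    alarm_after_seconds: float = 30.0      # assumed "extended period of time"
    last_reply_at: Optional[float] = None  # time of the most recent reply
    first_sent_at: Optional[float] = None  # time of the first transmission
    unanswered: int = 0                    # packets sent since the last reply
    alarmed: bool = False

    def on_send(self, now: float) -> bool:
        """Record a transmission; return True when the alarm is first raised.

        Sending is never blocked, so a reflector enabled later is still found.
        """
        if self.first_sent_at is None:
            self.first_sent_at = now
        self.unanswered += 1
        # Measure silence from the last reply, or from the first send if
        # no reply has ever been seen.
        silent_since = (self.last_reply_at
                        if self.last_reply_at is not None
                        else self.first_sent_at)
        if not self.alarmed and now - silent_since >= self.alarm_after_seconds:
            self.alarmed = True
            log.error("no S-BFD reply for %.0f s (%d packets unanswered)",
                      now - silent_since, self.unanswered)
            return True
        return False

    def on_reply(self, now: float) -> None:
        """Any reply clears the alarm and restarts the silence window."""
        self.last_reply_at = now
        self.unanswered = 0
        self.alarmed = False
```

The alarm fires once per silent period rather than on every packet, which leaves the remediation (config change, filter, or nothing) to the operator or a support system, as discussed above.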
