Hi Carlos,

> On 04.05.2016, at 17:44, Carlos Pignataro (cpignata)
> <[email protected]> wrote:
>
> Hi, Mirja,
>
>> On May 4, 2016, at 11:29 AM, Mirja Kuehlewind (IETF) <[email protected]>
>> wrote:
>>
>> Hi Carlos,
>>
>>> On 04.05.2016, at 17:13, Carlos Pignataro (cpignata)
>>> <[email protected]> wrote:
>>>
>>> Hi, Mirja,
>>>
>>>> On May 4, 2016, at 10:41 AM, Mirja Kuehlewind (IETF) <[email protected]>
>>>> wrote:
>>>>
>>>> Hi Carlos,
>>>>
>>>> below.
>>>>
>>>>> On 04.05.2016, at 16:33, Carlos Pignataro (cpignata)
>>>>> <[email protected]> wrote:
>>>>>
>>>>> Thanks much for the response, Mirja!
>>>>>
>>>>> I think we are converging, please see inline.
>>>>>
>>>>>> On May 4, 2016, at 10:13 AM, Mirja Kuehlewind (IETF)
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> Hi Carlos,
>>>>>>
>>>>>> see below.
>>>>>>
>>>>>>> On 03.05.2016, at 19:24, Carlos Pignataro (cpignata)
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi, Mirja,
>>>>>>>
>>>>>>>> On May 3, 2016, at 12:31 PM, Mirja Kuehlewind (IETF)
>>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi Carlos,
>>>>>>>>
>>>>>>>>
>>>>>>>>> On 03.05.2016, at 15:40, Carlos Pignataro (cpignata)
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Hi, Mirja,
>>>>>>>>>
>>>>>>>>> What is an uncontrolled packet in an IP network, and what entity
>>>>>>>>> controls controlled ones? :-)
>>>>>>>>
>>>>>>>> Questions over questions… :-)
>>>>>>>>
>>>>>>>> See below...
>>>>>>>>
>>>>>>>>>
>>>>>>>>> More seriously, please see inline.
>>>>>>>>>
>>>>>>>>>> On May 3, 2016, at 5:35 AM, Mirja Kuehlewind <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Mirja Kühlewind has entered the following ballot position for
>>>>>>>>>> draft-ietf-bfd-seamless-base-09: Discuss
>>>>>>>>>>
>>>>>>>>>> When responding, please keep the subject line intact and reply to all
>>>>>>>>>> email addresses included in the To and CC lines. (Feel free to cut
>>>>>>>>>> this introductory paragraph, however.)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Please refer to
>>>>>>>>>> https://www.ietf.org/iesg/statement/discuss-criteria.html
>>>>>>>>>> for more information about IESG DISCUSS and COMMENT positions.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The document, along with other ballot positions, can be found here:
>>>>>>>>>> https://datatracker.ietf.org/doc/draft-ietf-bfd-seamless-base/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ----------------------------------------------------------------------
>>>>>>>>>> DISCUSS:
>>>>>>>>>> ----------------------------------------------------------------------
>>>>>>>>>>
>>>>>>>>>> As S-BFD has no initiation process anymore, it is not guaranteed
>>>>>>>>>> that the receiver/responder actually exists. That means that
>>>>>>>>>> packets could float (uncontrolled) in the network or even outside
>>>>>>>>>> of the administrative domain (e.g. due to configuration mistakes).
>>>>>>>>>> From my point of view this document should recommend/require two
>>>>>>>>>> things:
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We have called out the misconfiguration; however:
>>>>>>>>>
>>>>>>>>>> 1) A maximum number of S-BFD packets that are allowed to be sent
>>>>>>>>>> without getting a response (maybe leading to a local error report).
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This can result in a deadlock situation if an S-BFD Reflector is
>>>>>>>>> enabled much later. I’m very hesitant to cap the packets sent. We
>>>>>>>>> can say, and I think it is useful, that an implementation MAY log
>>>>>>>>> an error for multiple timeouts.
>>>>>>>>
>>>>>>>> Okay, I understand that a hard limit probably does not make sense.
>>>>>>>> An error log seems definitely useful.
>>>>>>>
>>>>>>> OK, that sounds good. See below.
>>>>>>>
>>>>>>>> Another proposal for consideration: Currently the draft says an
>>>>>>>> initiator should only send one packet per second if the target is in
>>>>>>>> ADMINDOWN state. In this case this state is explicitly announced.
>>>>>>>> However, if the other end just disappears or was never/not yet
>>>>>>>> there, one could use an exponential back-off instead, starting with
>>>>>>>> a smaller interval than one second but then increasing it
>>>>>>>> exponentially. Just an idea...
>>>>>>>
>>>>>>> Thanks for the proposal. Please keep in mind, however, that this is a
>>>>>>> protocol for detecting liveness (and lack of it), so increasing
>>>>>>> exponentially defeats the purpose.
>>>>>>>
>>>>>>> Further, exponential back-off may not be the best choice when
>>>>>>> interacting with routing protocols.
>>>>>>>
>>>>>>> What we currently say is:
>>>>>>>    The criteria for declaring loss of
>>>>>>>    reachability and the action that would be triggered as a result
>>>>>>>    are outside the scope of this document.
>>>>>>>
>>>>>>> As many of these are implementation choices.
>>>>>>>
>>>>>>> But we can add at the end “, and MAY include logging an error.”
>>>>>>
>>>>>> Please do so.
>>>>>
>>>>> Done.
>>>>>
>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> 2) Egress filtering at the administrative border of the domain that
>>>>>>>>>> uses S-BFD to make sure that no S-BFD packets leave the domain.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This is no different than any other application that uses UDP; a
>>>>>>>>> misconfigured DNS server will result in the same, and a traceroute is
>>>>>>>>> also not too different. This seems too onerous a requirement. An
>>>>>>>>> administrative domain filters at ingress.
>>>>>>>>
>>>>>>>> First of all, just because other protocols might have such a problem,
>>>>>>>> that does not mean it’s okay.
>>>>>>>
>>>>>>> I agree with this. I had a different point in mind though: trying to
>>>>>>> specify this on every UDP application might not be the most effective
>>>>>>> way. Perhaps there’s a UDP guideline you are uncovering.
>>>>>>>
>>>>>>>> However, correct me if I’m wrong, but here the situation seems
>>>>>>>> slightly different because there is no termination criterion at all,
>>>>>>>> which means an S-BFD node would send useless data forever (… until
>>>>>>>> manual change of the config).
>>>>>>>>
>>>>>>>
>>>>>>> But as far as this doc is concerned, let me try to make some
>>>>>>> clarifications (and a correction).
>>>>>>>
>>>>>>> There are termination criteria; the document says:
>>>>>>>
>>>>>>>    An SBFDInitiator may be a persistent session on the initiator with a
>>>>>>>    timer for S-BFD control packet transmissions (stateful
>>>>>>>    SBFDInitiator). An SBFDInitiator may also be a module, a script or a
>>>>>>>    tool on the initiator that transmits one or more S-BFD control
>>>>>>>    packets "when needed" (stateless SBFDInitiator).
>>>>>>>
>>>>>>> For the case in which you have a “when needed” SBFDInitiator, the
>>>>>>> workflow is like a “ping”.
>>>>>>>
>>>>>>> For the case in which you have a “persistent” SBFDInitiator, in theory
>>>>>>> this can run forever (for some value of ever). However, please don’t
>>>>>>> lose track of why this protocol exists. Having OAM failures and doing
>>>>>>> nothing about them defeats the purpose of having OAM. Meaning, a red
>>>>>>> alarm will blink, a horn will honk, and the config state will be
>>>>>>> changed (manually or by some support system).
>>>>>>>
>>>>>>> In other words, I do not think this is such a unique case (insofar as
>>>>>>> running ad infinitum).
>>>>>>
>>>>>> I still believe that the case where you have a misconfiguration and the
>>>>>> initiator sends packets (forever) but never ever gets a reply (and never
>>>>>> has seen a reply in the past) is a different case and can be detected
>>>>>> and handled separately.
>>>>>>
>>>>>
>>>>> Again, I would not qualify this as ‘forever’, but I understand what you
>>>>> mean.
>>>>>
>>>>>>>
>>>>>>>> I still believe that egress filtering would be more appropriate here
>>>>>>>> (than ingress) because the domain that is using S-BFD knows about it
>>>>>>>> and therefore can set up the respective filters and should not spam
>>>>>>>> others while hoping that filters are in place.
>>>>>>>>
>>>>>>>
>>>>>>> To me, there is a not insignificant operational complexity in requiring
>>>>>>> the addition of filters throughout, for one particular application not
>>>>>>> leaking (where the leak does not cause anything special), and when the
>>>>>>> leak might happen because of a misconfiguration (or bug) but will be
>>>>>>> detected by the operational support systems. The ROI does not seem to
>>>>>>> add up.
>>>>>>
>>>>>> Okay, the document should probably not require egress filtering in any
>>>>>> case, but what about saying something like:
>>>>>>
>>>>>> “If S-BFD is used it SHOULD be ensured that S-BFD control packets do not
>>>>>> propagate outside of the administrative domain that uses it.”
>>>>>>
>>>>>
>>>>> We can add an additional explanation of the problem before a statement,
>>>>> but I do not think that particular SHOULD is actionable. How’s something
>>>>> like:
>>>>>
>>>>> Explain that without a handshake a persistent initiator can send
>>>>> blindly, and then add “In such case, operational measures SHOULD be
>>>>> taken to identify if S-BFD packets are not responded to for an extended
>>>>> period of time, and remediate the situation”
>>>>>
>>>>>> This is not an uncommon thing to specify also for other protocols.
>>>>>>
>>>>>
>>>>> I agree. Frankly, I am happy with either statement, but I think the
>>>>> latter might be more operationally actionable.
>>>>>
>>>>> Preference?
>>>>
>>>> I still would prefer something along the lines of what I proposed. I
>>>> think there could effectively be different actions to be taken here, e.g.
>>>> egress filtering or measurement to detect failure, as well as no action
>>>> if for some other reason it can be ensured that such a misconfiguration
>>>> cannot happen (or is at least very unlikely to happen), e.g. because
>>>> things are automated and there are additional checks before applying a
>>>> config.
>>>>
>>>
>>> Perhaps I can add “for an extended period of time” to the first statement
>>> (or similar wording to your liking)?
>>>
>>> Your main concern is the “forever”. Let’s ensure it is not “forever”.
>>> However, I’m concerned that a single packet out (like a ping to the wrong
>>> address) will be violating “it SHOULD be ensured that S-BFD control
>>> packets do not propagate outside”
>>
>> The concern is not “forever” but putting (unnecessary) load on other
>> networks (by accident). So I agree, one or a few packets are not a problem.
>> So yes, adding “for an extended period of time” is fine. We could
>> also/instead replace the word “ensure” with something else (maybe
>> “control”…?).
>>
>
> These two changes would certainly work.
>
> Thank you. We will post a new rev today.
>
> [I still think that a few packets are not “(unnecessary) load” for an IP
> device, it’s really not different than doing a traceroute and getting an
> icmp.unreach port unreachable (or if it is critical and unwelcome load for a
> device, those devices are protected at ingress at their border).
I do agree but you never know how people might (mis)use things in the
future...

>
> But in any case, I do think that explaining the problem you highlight helps
> and improves the doc, and the new text on what to do does not hurt.]

Thanks. I’ll clear my discuss now and will have a look at the new version
next week!

Mirja

>
> Thanks,
>
> -- Carlos.
>
>> Mirja
>>
>>
>>
>>>
>>> Would that work?
>>>
>>> Thanks,
>>>
>>> -- Carlos.
>>>
>>>> The second SHOULD that you proposed is from my point of view actually an
>>>> additional point that I would also be happy to see in the doc.
>>>>
>>>> Mirja
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -- Carlos.
>>>>>
>>>>>> Mirja
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Does the explanation of the termination criteria help?
>>>>>>>
>>>>>>>>>
>>>>>>>>> Seems to me the logging will alert someone/something to take action,
>>>>>>>>> and should be enough.
>>>>>>>>
>>>>>>>> Logging plus alerts is definitely a good thing.
>>>>>>>>
>>>>>>>
>>>>>>> I agree.
>>>>>>>
>>>>>>> Will add “, and MAY include logging an error.” as per above.
>>>>>>>
>>>>>>> Do you think we should expand on this somewhere else in the document?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> -- Carlos.
>>>>>>>
>>>>>>>> Mirja
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thoughts?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> -- Carlos.
>
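
For illustration only, the operational measure agreed on in this thread (identify when S-BFD packets are not responded to for an extended period of time, log an error, and do not cap transmissions) could be sketched roughly as below. This is a sketch under stated assumptions, not anything taken from the draft: the class name UnansweredMonitor, the logger name, and the 300-second threshold are all hypothetical.

```python
import logging
from dataclasses import dataclass
from typing import Optional

log = logging.getLogger("sbfd.monitor")  # hypothetical logger name


@dataclass
class UnansweredMonitor:
    """Tracks how long an initiator has been sending without any reply.

    It never stops transmissions (no hard cap); it only raises one error
    per silent period so that an operator or support system can react.
    """

    alert_after_s: float = 300.0          # assumed value for "extended period"
    silent_since: Optional[float] = None  # time of first unanswered packet
    alerted: bool = False

    def on_sent(self, now: float) -> bool:
        """Record a transmission; return True if an error was logged now."""
        if self.silent_since is None:
            self.silent_since = now
        silent_for = now - self.silent_since
        if not self.alerted and silent_for >= self.alert_after_s:
            self.alerted = True
            log.error("no S-BFD reply for %.0f s; check configuration", silent_for)
            return True
        return False

    def on_reply(self) -> None:
        """Any reply ends the silent period and re-arms the alert."""
        self.silent_since = None
        self.alerted = False
```

An initiator would call on_sent() with a monotonic timestamp for each packet it transmits and on_reply() for each response. Because the monitor only logs, a Reflector that is enabled much later is still discovered, which avoids the deadlock concern raised against a hard packet limit.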
