Eduard,

The idea of the draft to come is to explain what to do, when, and how. The goal is not to regulate (we really don’t) but to provide, similarly to RFC 7938, a set of guidelines that the community can use to build better and more resilient networks.

Cheers,
Jeff

> On Aug 2, 2021, at 04:01, Alexander Azimov <[email protected]> wrote:
>
> Eduard,
>
> Mon, 2 Aug 2021 at 13:45, Vasilenko Eduard <[email protected]>:
>> It is the key in this presentation “This behavior MUST be switched off by
>> default”
>>
>> It has been shown on slides 7-10 that flow label change on RTO is enabled by
>> default for BSD and LINUX.
>>
>> It needs regulation. It needs a standard RFC. Because it kills Anycast
>> otherwise.
>>
> As I'm partially responsible for the key points of the presentation, I can
> stress that it is a bit different.
> We have an opportunity for self-healing TCP on top of IPv6, it should be
> preserved;
> The Linux defaults should be changed to a safe mode to prevent session
> timeouts;
> The hash recalculation behavior should be documented;
> I'm not sure what you mean by the term 'regulation'.
>
>> The story of how to use RTO to work-around “silent drop” vendor’s bugs could
>> be a good informational RFC.
>>
>> Maybe people developing iOAM would pay more attention to this use case.
>>
>> IMHO: these are 2 separate drafts.
>>
> I'm not sure about it, we'll try to provide -00 before the next IETF meeting,
> let's see how it progresses.
>
>> Eduard
>>
>> From: Alexander Azimov [mailto:[email protected]]
>> Sent: Monday, August 2, 2021 1:20 PM
>> To: Vasilenko Eduard <[email protected]>; Jeff Tantsura
>> <[email protected]>
>> Cc: routing WG <[email protected]>
>> Subject: Re: Self-healing Networking with Flow Label
>>
>> Eduard,
>>
>> Please see the quote from the slide 28. My suggestion was:
>>
>> Client – sends SYN, Server – responds with SYN&ACK
>>
>> In case of SYN_RTO or RTO events Server SHOULD recalculate its TCP socket
>> hash, thus change Flow Label. This behavior MAY be switched on by default;
>> In case of SYN_RTO or RTO events Client MAY recalculate its TCP socket hash,
>> thus change Flow Label.
>> This behavior MUST be switched off by default;
>> This looks like a safe default behavior, that saves the part of the
>> improvements, but makes the work with stateful anycast services safe.
>>
>> And yes, IMO it's ok to have a knob to enable it in the controlled
>> environment. If you ask how to enable it in the presence of internal anycast
>> services - there was also a suggestion in the slides: eBPF. It gives a good
>> way to make this kind of separation.
>>
>> 02.08.2021, 11:48, "Vasilenko Eduard" <[email protected]>:
>>
>> Hi Jeff,
>> The situation when Control Plane does not understand what the Forwarding
>> plane is doing is a bug.
>> Yes, RTO in TCP helps to find a work-around for this bug. And yes, Anycast
>> is typically absent inside DC – it does not create the problem in the DC
>> environment.
>>
>> But the same LINUX is used outside DC. RTO Flow Label change here would
>> create even more problems if Anycast would happen on the traffic path (not
>> much predictable for client).
>> Do we need separate LINUX distribution for DC and separate distribution for
>> other environments?
>> Or should we rely on the proper non-default configuration for different
>> environments? (Admin should not forget to change)
>> What if Anycast would become needed in DC?
>>
>> RTO flow label recalculation is mutually exclusive with Anycast on the
>> traffic path.
>> What is more valuable for the public?
>>
>> IMHO: It is better to fight the problem of such type of a bug with iOAM than
>> to cancel Anycast.
>>
>> IMHO: It is better to have Flow Label recalculation on RTO as “off” by
>> default.
>> DC Admin has the higher qualification to activate it in a controlled
>> environment than every client worldwide that should not forget to disable it.
>>
>> Eduard
>> From: Jeff Tantsura [mailto:[email protected]]
>> Sent: Monday, August 2, 2021 6:56 AM
>> To: Vasilenko Eduard <[email protected]>
>> Cc: [email protected]; routing WG <[email protected]>
>> Subject: Re: Self-healing Networking with Flow Label
>>
>> Eduard,
>>
>> The issue is not present in the link/device failure case: if well
>> implemented, fast rehash takes care of updating forwarding within a number
>> of ms. The problem is with “gray” failures, when the link in question is UP
>> from a routing/forwarding perspective but drops traffic (mostly
>> occasionally and, when a HW bug occurs, with distinct flow attributes).
>>
>> In many large DC fabrics, the majority of the traffic is east-west, between
>> end-points that aren’t anycast. In such deployments, the solution solves
>> issues rather elegantly and without any interventions from the operator.
>> The issues/side effects are well understood and will be documented.
>>
>> The best way to receive RTGWG emails is well… to subscribe to RTGWG ;-)
>>
>> Cheers,
>> Jeff
>>
>> On Aug 1, 2021, at 09:47, Vasilenko Eduard <[email protected]>
>> wrote:
>>
>> Hi Alexander,
>>
>> Have I understood your presentation right?
>> The client SHOULD change IPv6 flow label after SYN RTO to have a chance to
>> be moved to the working path inside DC fabric (if DC fabric supports flow
>> label for hash calculation)
>> But at the same time
>> The client SHOULD NOT change the IPv6 flow label after SYN RTO to avoid
>> being switched to a different TCP proxy engine.
>>
>> Looks like a deadlock, especially if both things should happen for the same
>> traffic:
>> it should reach DC fabric
>> and
>> it should be hash load-balanced between different TCP proxy engines (or
>> applications) inside DC Fabric.
>>
>> I see one bad solution (“Disable Flow Label”):
>> Routers up to TCP proxy engine SHOULD be configured not to use flow label
>> (by the way these are all routers on the Internet),
>> TCP flow engines SHOULD be outside of the DC Fabric (CLOS) – probably in
>> front of it.
>> Routers/Switches inside DC Fabric SHOULD use flow labels.
>>
>> I see another bad solution (“Disable Anycast”):
>> Disable anycast on routers in principle, use only stateful LB.
>>
>> It has been commented in the chat that Anycast is not possible in principle
>> for stateful connection. It is too general a statement.
>> Anycast is just not compatible with Flow Label. It is not a problem for IPv4
>> anycast even if the connection is stateful (TCP) because 5-tuple for hash
>> would not change.
>> Hence, IPv6 anycast has become dead at the time when Flow Label change has
>> been added in LINUX for active TCP session.
>>
>> Among 3 things:
>> - Anycast
>> - Flow Label load balancing (basic Flow Label functionality)
>> - Flow Label change on the active session for application to be
>>   more active in new path search
>> You have to choose which one to kill – all 3 are not compatible with each
>> other at the same time.
>> I vote to disable Flow Label change in LINUX. Then wait till the network
>> would fix itself.
>> We have so many fancy TE tools these days. A broken link or a broken node
>> could be excluded from routing within 50ms.
>>
>> PS: I am not subscribed to the RTGWG alias, please keep me on a copy of this
>> thread.
>>
>> Best Regards
>> Eduard Vasilenko
>> Senior Architect
>> Europe Standardization & Industry Development Department
>> Tel: +7(985) 910-1105, +7(916) 800-5506
>>
>> _______________________________________________
>> rtgwg mailing list
>> [email protected]
>> https://www.ietf.org/mailman/listinfo/rtgwg
>>
>> --
>> Best regards,
>> Alexander Azimov
>
> --
> Best regards,
> Alexander Azimov
_______________________________________________
rtgwg mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/rtgwg
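
The hashing argument made in this thread (the 5-tuple of a TCP session never changes, but a Flow Label change on RTO can move the session to another ECMP path or another anycast backend) can be sketched roughly as below. This is an illustrative model only, not how any router or the Linux kernel computes its hash: the function names `ecmp_next_hop` and `paths_seen`, the use of SHA-256, the documentation-prefix addresses, the ports, and the path count are all assumptions made for the example.

```python
import hashlib
import struct


def ecmp_next_hop(src, dst, sport, dport, flow_label, n_paths,
                  use_flow_label=True):
    """Pick one of n_paths from a hash over the flow's header fields.

    Hypothetical model: real devices use vendor-specific hash functions.
    """
    # The 5-tuple part of the key is constant for the life of a TCP session.
    key = f"{src}|{dst}|{sport}|{dport}|tcp".encode()
    if use_flow_label:
        # The IPv6 Flow Label is a 20-bit field; mask to that width.
        key += struct.pack("!I", flow_label & 0xFFFFF)
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths


def paths_seen(flow_labels, n_paths, use_flow_label):
    """Return the set of paths one session lands on across several labels."""
    return {
        ecmp_next_hop("2001:db8::1", "2001:db8::100", 40000, 443,
                      label, n_paths, use_flow_label)
        for label in flow_labels
    }


if __name__ == "__main__":
    labels = range(64)  # stand-ins for labels chosen after successive RTOs
    print("5-tuple only:       ", sorted(paths_seen(labels, 8, False)))
    print("5-tuple + flow label:", sorted(paths_seen(labels, 8, True)))
```

With the Flow Label excluded from the hash, the session stays on a single path no matter how often the label changes, which is the IPv4-like behavior described above. With the Flow Label included, relabeling spreads the same session over several paths: that is what lets a sender escape a "gray" failure, and also what can deliver an established session to a different anycast or proxy instance.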
