Hi, Acee:

Actually, I think so. We would like to hear more comments on the two approaches.
It’s better to update the draft based on consensus, or through discussion.

Aijun Wang
China Telecom

> On Jan 17, 2022, at 21:15, Acee Lindem (acee)
> <[email protected]> wrote:
>
> Hi Aijun,
> So I guess you’re saying the current draft with a single aggregated cost is
> incorrect and will be updated?
> Thanks,
> Acee
>
> From: Aijun Wang <[email protected]>
> Date: Sunday, January 16, 2022 at 9:47 PM
> To: Acee Lindem <[email protected]>, 'Linda Dunbar'
> <[email protected]>, John E Drake <[email protected]>
> Cc: "Les Ginsberg (ginsberg)" <[email protected]>, 'Gyan Mishra'
> <[email protected]>, Robert Raszuk <[email protected]>, "[email protected]"
> <[email protected]>
> Subject: RE: [Lsr] Seeking feedback to the revised
> draft-dunbar-lsr-5g-edge-compute
>
> Hi, Acee:
>
> My thought is that the traditional IGP metric is the cost from the ingress
> routers (Ra/Rb) to the egress routers (R1/R2/R3), which is illustrated in
> https://datatracker.ietf.org/doc/html/draft-dunbar-lsr-5g-edge-compute-04#appendix-A.
> Such a metric will be the same (because the ANYCAST address is advertised
> simultaneously via R1/R2/R3 for one application server, for example,
> S1/aa08::4450).
>
> But in the above-mentioned scenario, there are other factors to be
> considered, or other factors that may be more significant for the selection
> of the ANYCAST server, for example, the Capacity Index, the Preference Index
> and other constraints.
> The egress router (A-ER) now calculates the aggregated metric based on these
> factors. The derived cost should be considered in, or added to, the IGP SPF
> calculation for the ANYCAST prefix.
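A minimal sketch of the selection described above, assuming the ingress router simply adds the advertised aggregated cost to its IGP path metric for each egress router and keeps the lowest total. The function name and all metric values are hypothetical and chosen only for illustration; the draft does not specify this computation.

```python
from typing import Dict, List


def best_egress(igp_metric: Dict[str, int], site_cost: Dict[str, int]) -> List[str]:
    """Return the egress routers with the lowest (IGP metric + aggregated cost).

    Assumption: a missing aggregated cost is treated as 0.
    """
    total = {r: igp_metric[r] + site_cost.get(r, 0) for r in igp_metric}
    lowest = min(total.values())
    # More than one router can tie; all of them are returned, sorted by name.
    return sorted(r for r, cost in total.items() if cost == lowest)


# Hypothetical values: the ingress sees R1/R2/R3 at the same IGP distance,
# so only the aggregated cost differentiates them.
igp_metric = {"R1": 20, "R2": 20, "R3": 20}
site_cost = {"R1": 30, "R2": 10, "R3": 10}
print(best_egress(igp_metric, site_cost))  # ['R2', 'R3']
```

With equal IGP metrics the aggregated cost alone decides the result, which is the case the anycast scenario in the thread is concerned with.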
>
> There are two ways to advertise such additional information:
> 1) Define another type of prefix cost, and also a new Flexalgo
> algorithm, to indicate that both the traditional prefix metric and this
> additional aggregated metric should be considered together to select the
> right egress router.
> 2) Put the additional cost information within the Stub-Link TLV that is
> defined in
> https://datatracker.ietf.org/doc/html/draft-wang-lsr-stub-link-attributes-03.
>
> Both can result in an EPE (Egress Peering Engineering)-like effect.
> But I prefer the second option, because in such a scenario the stub link
> bandwidth, stub link delay and similar factors should also be considered
> when selecting the best egress; they are not attributes of prefixes.
>
> We should also note that for these “no inter-as boundary link”, or ‘stub
> link’, cases the associated prefix will not be advertised into the IGP
> automatically; only the local interface address of the stub link on the
> egress router will be advertised.
>
> Best Regards
>
> Aijun Wang
> China Telecom
>
> From: [email protected] <[email protected]> On Behalf Of Acee Lindem
> (acee)
> Sent: Sunday, January 16, 2022 10:42 PM
> To: Linda Dunbar <[email protected]>; Aijun Wang
> <[email protected]>; John E Drake
> <[email protected]>
> Cc: Les Ginsberg (ginsberg) <[email protected]>; Gyan Mishra
> <[email protected]>; Robert Raszuk <[email protected]>; [email protected]
> Subject: Re: [Lsr] Seeking feedback to the revised
> draft-dunbar-lsr-5g-edge-compute
>
> Hi Linda,
>
> I guess you misunderstood me. Since the draft only advertises a single
> aggregated metric, you don’t need a new aggregated cost TLV or to use flex
> algorithm. Just set the base IGP metric for the prefix to the aggregated
> metric and IGPs will route based on that metric.
>
> Thanks,
> Acee
>
> From: Linda Dunbar <[email protected]>
> Date: Saturday, January 15, 2022 at 8:03 PM
> To: Acee Lindem <[email protected]>, Aijun Wang <[email protected]>,
> John E Drake <[email protected]>
> Cc: "Les Ginsberg (ginsberg)" <[email protected]>, Gyan Mishra
> <[email protected]>, Robert Raszuk <[email protected]>, "[email protected]"
> <[email protected]>
> Subject: RE: [Lsr] Seeking feedback to the revised
> draft-dunbar-lsr-5g-edge-compute
>
> Acee, John, Robert, and LSR experts:
>
> We have updated the draft to reflect the comments and suggestions on the LSR
> mailing list.
> https://datatracker.ietf.org/doc/draft-dunbar-lsr-5g-edge-compute/
>
> In particular:
>
> - This draft describes using additional site costs to influence the
> shortest path computation for a specific set of prefixes. As there are a
> small number of egress routers having those prefixes (or destinations) that
> need to incorporate site costs in SPF computation, Flexible Algorithms
> [LSR-FlexAlgo] is used to indicate the need for the site costs to be
> considered for the specific topologies.
>
> A Flag is needed in the Flexible Algorithm TLV to indicate that “site-cost”
> needs to be included for the constrained SPF to reach the Prefix.
> Therefore, it is not an informational draft.
>
> The “Site Cost” associated with a prefix (i.e., ANYCAST prefix) can be a
> value configured on the router to which the prefix is attached. The actual
> mechanism of assigning “Site Cost” or the detailed algorithm is out of the
> scope of the document.
> The “site cost” change rate is comparable with the rate at which the
> application controller adds or removes the application instances at
> locations to adjust the workload distribution. Typically, the rate of change
> could be in days. On rare occasions, changes might be needed within hours.
>
> We have added a section to emphasize that it is important that the “site
> cost” metric doesn’t change too frequently, to avoid route oscillation
> within the network.
>
> Thank you.
>
> Linda
>
> From: Acee Lindem (acee) <[email protected]>
> Sent: Saturday, January 15, 2022 5:30 AM
> To: Aijun Wang <[email protected]>; John E Drake
> <[email protected]>
> Cc: Les Ginsberg (ginsberg) <[email protected]>; Linda Dunbar
> <[email protected]>; Gyan Mishra <[email protected]>; Robert
> Raszuk <[email protected]>; [email protected]
> Subject: Re: [Lsr] Seeking feedback to the revised
> draft-dunbar-lsr-5g-edge-compute
>
> Hi Aijun, Linda,
>
> Independent of the ongoing debate on whether advertising the server metrics
> in the IGPs…
>
> Now that the draft is simplified to use a single aggregated metric, why not
> make the draft informational and use the base IGP metrics? This avoids the
> burden of adding a new flex algorithm.
>
> Thanks,
> Acee
>
> From: Lsr <[email protected]> on behalf of Aijun Wang
> <[email protected]>
> Date: Friday, January 14, 2022 at 10:38 PM
> To: John E Drake <[email protected]>
> Cc: "Les Ginsberg (ginsberg)" <[email protected]>, Linda Dunbar
> <[email protected]>, Gyan Mishra <[email protected]>, Robert
> Raszuk <[email protected]>, "[email protected]" <[email protected]>
> Subject: Re: [Lsr] Seeking feedback to the revised
> draft-dunbar-lsr-5g-edge-compute
>
> This draft is now proposing one aggregate cost for the application server.
>
> The detailed factors can also be included if necessary. But the principle
> for advertising them should be that they are controllable, as required of
> other dynamic metrics in the IGP.
>
> Aijun Wang
> China Telecom
>
> On Jan 15, 2022, at 08:37, John E Drake <[email protected]>
> wrote:
>
> This is similar to the issue with the Down/Up proposal. A single metric
> tells an ingress node nothing about the performance of or load on the
> individual applications at a given site.
>
> Yours Irrespectively,
>
> John
>
> Juniper Business Use Only
> From: Aijun Wang <[email protected]>
> Sent: Friday, January 14, 2022 6:58 PM
> To: John E Drake <[email protected]>
> Cc: Robert Raszuk <[email protected]>; Gyan Mishra <[email protected]>;
> Les Ginsberg (ginsberg) <[email protected]>; Linda Dunbar
> <[email protected]>; [email protected]
> Subject: Re: [Lsr] Seeking feedback to the revised
> draft-dunbar-lsr-5g-edge-compute
>
> [External Email. Be cautious of content]
>
> Hi, John:
> Here I would also like to hear your own opinions. If not, please see my
> responses for both you and Robert:
>
> https://datatracker.ietf.org/doc/html/draft-ietf-lsr-flex-algo-bw-con-01 has
> introduced the “delay metric” into the IGP. Such a metric may vary on
> every link within the IGP.
> The proposal in draft-dunbar-lsr-5g-edge-compute is only for the
> characteristics of the stub link/prefix; it is the aggregate cost to the
> server as measured from the router.
>
> All the factors mentioned by Robert may be parameters that influence
> the performance of the server, which will be reflected in the aggregate cost.
>
> Then, the conclusion is that the IGP now has the capability to deal with
> dynamic values within the network (the change frequency can certainly be
> controlled; think of how we control a flapping interface), so the aggregate
> cost or other quasi-static factors for the server at the edge of the network
> can also be considered together.
> Such approaches can certainly let the IGP behave more optimally when
> forwarding traffic to the appropriate destination, or follow an optimal path.
>
> Aijun Wang
> China Telecom
>
> On Jan 14, 2022, at 23:49, John E Drake <[email protected]>
> wrote:
>
> Robert is correct on all points.
>
> Yours Irrespectively,
>
> John
>
> From: Lsr <[email protected]> On Behalf Of Robert Raszuk
> Sent: Friday, January 14, 2022 4:20 AM
> To: Gyan Mishra <[email protected]>
> Cc: Les Ginsberg (ginsberg) <[email protected]>; Linda Dunbar
> <[email protected]>; [email protected]
> Subject: Re: [Lsr] Seeking feedback to the revised
> draft-dunbar-lsr-5g-edge-compute
>
> Gyan,
>
> This is not a network discussion. There are well proven techniques to direct
> user sessions or user requests to a pool of servers deployed and operational.
> All provide robust services. The network plays very little to no role in
> that.
>
> There are also lots of factors involved in making that decision (CPU load,
> RAM, Storage, IO, CPU Temp etc ...) and IMO it would be very bad to now make
> the IGP carry it and make routing decisions (even if in a separate topology)
> based on that information.
>
> I do not see this as a move in the right direction. That is my input.
>
> Kind regards,
> Robert.
>
> On Fri, Jan 14, 2022 at 4:53 AM Gyan Mishra <[email protected]> wrote:
> Robert
>
> Responses in-line
>
> On Thu, Jan 13, 2022 at 5:55 AM Robert Raszuk <[email protected]> wrote:
> Gyan,
>
> I see what the draft is trying to do now. /* I did not even consider this for
> the reason described below. */
>
> But what you wrote requires a little correction:
>
> "So now the server you are on gets overloaded and a site cost gets advertised
> in the IGP at which point the connection receives a TCP reset"
>
> if you s/connection/all connections/ then you quickly realize that what is
> proposed here is a disaster.
>
> Gyan> Remember this is Anycast proximity based routing along with ECMP /
> UCMP flow based load balancing, and most vendors' modern routers support
> some sort of x-tuple ECMP/UCMP hash so the flows would be evenly
> distributed; if you have 10s or 100s of paths, the flows would be evenly
> distributed across all the paths. Since it’s Anycast proximity, each path
> leads to a different Application LB VIP and backend server. So all the TCP
> connections would be uniformly distributed across all the paths.
>
> Only the connections hashed to the path with the overloaded server would get
> reset, and it would be no different than if the server went down, as the
> connections would get reset anyway in that case.
>
> In this case, instead of being pinned to a bad connection, you are now reset
> to a good connection, resulting in better QOE for the end user and a happy
> customer.
>
> To me that's a positive, not a negative.
>
> A good analogy: let’s say you are on WIFI and on the same channel that your
> neighbors are on, and have horrible bandwidth. Do you stay on that bad
> channel and limp along all day, or do you flip to an unused channel?
>
> Another example is if you have a server that has run out of resources. Do
> you fail production traffic off the server, taking it out of rotation, or
> let it stay as is and pray things get better? This draft is a good example
> of how the IGP can save the day with the site metric.
>
> Breaking all existing flows going to one LB so that they suddenly time out
> and all go to the other LB(s) is never a technique anyone would seriously
> deploy in a production network.
>
> Gyan> Application load balancing can be done with DNS based GEO load
> balancing based on a client and server IP database, where you have more
> discrete control; however, the failover is much slower.
>
> Let alone that doing that has the potential to immediately overload the
> other LB(s)/server(s) too.
>
> Gyan> The idea with Anycast load balancing is that you may have 10s or even
> 100s of paths, so if one server fails the load can be evenly distributed
> based on a statistical multiplexing algorithm calculated by the server teams
> engineering the sizing of the server clusters, to ensure what you described
> won’t happen.
>
> For me the conclusion is that the IGP transport level is not the proper
> layer to address the requirement.
>
> Cheers,
> Robert.
>
> On Thu, Jan 13, 2022 at 7:05 AM Gyan Mishra <[email protected]> wrote:
>
> Hi Les
>
> Agreed.
>
> My thoughts are that the context of the draft is based on an Anycast VIP
> address of a server, which is proximity based load balancing and not
> necessarily ECMP/UCMP; only if the proximity is the same for multiple
> paths to the Anycast VIP would there be an ECMP/UCMP possibility.
>
> Let’s say it’s proximity based and one path is preferred: the flow will
> take that path. So now the server you are on gets overloaded and a site cost
> gets advertised in the IGP, at which point the connection receives a TCP
> reset and the flow re-establishes on the alternate path based on the site
> cost and remains there until the server goes down or gets overloaded or a
> better path comes along.
>
> For the ECMP case, ECMP has flow affinity, so the flow will stay on the same
> path, long lived, until the connection terminates.
>
> So now in the ECMP case the flow is hashed to a path and maintains its
> affinity to that path. Now all of a sudden the server gets overloaded and we
> get a better site cost advertised. So now the session terminates on the
> current path and establishes again on the new path to the Anycast VIP based
> on the site cost advertised.
>
> The failover, I believe, results in the user refreshing their browser, which
> is better than hanging.
>
> As the VIP prefix is the only one that experiences reconvergence onto a new
> path based on site cost, any instability with the servers will be reflected
> in the IGP Anycast prefix as well.
>
> Is that a good or bad thing? You have to pick your poison here: the issue
> Linda is trying to solve is a server issue, but it leverages the IGP to force
> re-convergence when the server is in a half-baked state, meaning it’s busy
> and connections are being dropped or QOE is very slow for the end user. If
> you did nothing and let it ride, the user would be stuck on a bad connection.
>
> So this solution dynamically fixes the issue.
>
> As far as oscillation goes, that is not a big deal, as you are much worse
> off connected to a dying server on its last legs as far as memory and CPU.
>
> This solution, as I see it, can apply to any client / server connection and
> not just 5G, and can be used by enterprises as well as SPs for their
> customers to have a drastically improved QOE.
>
> I saw some feedback on the TLV and I think once that is all worked out I am
> in favor of advancing this draft.
>
> Kind Regards
>
> Gyan
>
> On Wed, Jan 12, 2022 at 10:16 PM Les Ginsberg (ginsberg) <[email protected]>
> wrote:
> Gyan –
>
> The difference between ECMP and UCMP is not significant in this discussion.
> I don’t want to speak for Robert, but for me his point was that IGPs can do
> “multipath” well – but this does not translate into managing flows.
> Please see my other responses on this thread.
>
> Thanx.
>
> Les
>
> From: Gyan Mishra <[email protected]>
> Sent: Wednesday, January 12, 2022 5:26 PM
> To: Robert Raszuk <[email protected]>
> Cc: Les Ginsberg (ginsberg) <[email protected]>; Linda Dunbar
> <[email protected]>; [email protected]
> Subject: Re: [Lsr] Seeking feedback to the revised
> draft-dunbar-lsr-5g-edge-compute
>
> Robert
>
> Here are a few examples of UCMP drafts below, used in core and data center
> use cases.
>
> https://datatracker.ietf.org/doc/html/draft-ietf-bess-evpn-unequal-lb-15
>
> https://datatracker.ietf.org/doc/html/draft-mohanty-bess-weighted-hrw-02
>
> https://datatracker.ietf.org/doc/html/draft-ietf-idr-link-bandwidth
>
> https://datatracker.ietf.org/doc/html/draft-mohanty-bess-ebgp-dmz
>
> There are many use cases in routing for unequal cost load balancing
> capabilities.
>
> Kind Regards
>
> Gyan
>
> On Wed, Jan 12, 2022 at 2:23 PM Robert Raszuk <[email protected]> wrote:
> Linda,
>
> > IGP has been used for the Multi-path computation for a long time
>
> IGP can and does ECMP well. Moreover, injecting the metric of an anycast
> server destination plays no role in it, as all paths would inherit that
> external-to-the-IGP cost.
>
> Unequal cost load balancing or intelligent traffic spread has always been
> done at the application layer *for example MPLS*
>
> Thx a lot,
> R.
>
> On Wed, Jan 12, 2022 at 8:18 PM Linda Dunbar <[email protected]>
> wrote:
> Robert,
>
> Please see inline in green:
>
> From: Robert Raszuk <[email protected]>
> Sent: Wednesday, January 12, 2022 1:00 PM
> To: Linda Dunbar <[email protected]>
> Cc: Les Ginsberg (ginsberg) <[email protected]>; [email protected]
> Subject: Re: [Lsr] Seeking feedback to the revised
> draft-dunbar-lsr-5g-edge-compute
>
> Hi Linda,
>
> [LES:] It is my opinion that what you propose will not achieve your goals –
> in part because IGPs only influence forwarding on a per packet basis – not a
> per flow/connection basis.
>
> [Linda] Most vendors do support flow based ECMP, with Shortest Path computed
> by attributes advertised by IGP.
>
> I am with Les here. ECMP has nothing to do with his point.
>
> [Linda] Les said that “IGP only influence forwarding on a per packet basis”.
> I am saying that vendors supporting “forwarding per flow” with equal cost
> computed by IGP implies that forwarding on modern routers is no longer
> purely on a per packet basis.
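A minimal sketch of the flow-based ECMP behaviour being debated here, assuming a router hashes a 5-tuple and takes the result modulo the number of equal-cost next hops. The hash function, the field layout, and the addresses are illustrative assumptions; real forwarding planes use vendor-specific hashes and field selections.

```python
import hashlib
from typing import List, Tuple

# (source address, destination address, protocol, source port, destination port)
FiveTuple = Tuple[str, str, int, int, int]


def pick_next_hop(flow: FiveTuple, next_hops: List[str]) -> str:
    """Map a flow to one of the equal-cost next hops.

    Every packet of the same flow yields the same key, so the flow stays
    on one path for as long as the next-hop list is unchanged.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical flow towards the anycast address used as an example in the thread.
flow = ("2001:db8::1", "aa08::4450", 6, 40000, 443)
paths = ["R1", "R2", "R3"]
print(pick_next_hop(flow, paths))
# Withdrawing one path changes the modulus, so the flow may be remapped.
print(pick_next_hop(flow, ["R1", "R3"]))
```

The sketch shows flow affinity only for a stable set of paths. With a plain modulo, a change in the path set can also move flows that were on the surviving paths, which bears on the concern raised elsewhere in the thread about resetting existing connections.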
>
> Draft says:
>
> When those multiple server instances share one IP address (ANYCAST), the
> transient network and load conditions can be incorporated in selecting an
> optimal path among server instances for UEs.
>
> So if we apply any new metric to indicate the load of a single anycast
> address, how is this going to help anything?
>
> [Linda] The “Load” or “Aggregated Site Cost” is to differentiate multiple
> paths with the same routing distance.
>
> You would need a mechanism where the network is smart and, say, spreads the
> traffic per src-dst tuple or more. IGP does not play that game today, I am
> afraid.
> [Linda] There is one SRC and multiple paths to one DST. IGP has been used for
> the Multi-path computation for a long time.
>
> Thank you, Linda
>
> Thx a lot,
> R.
>
> _______________________________________________
> Lsr mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/lsr
> --
>
> Gyan Mishra
> Network Solutions Architect
> Email [email protected]
> M 301 502-1347
_______________________________________________
Lsr mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/lsr
