Hi Peng,

I have two questions; they may have been discussed before.

First, it seems that CAN puts computing information at the network layer and selects routes based on this information. Since the resources in a data center are time-varying, it is quite possible that the dynamic nature of this computing information will have an overwhelming effect on the routing system of the network.

Second, I am concerned about how “computing” is measured. Although the term is very popular, I have not found a unified measurement for it. For a given data center, is computing about the quality of its CPUs, VMs, or containers? Or is it a synthetic value derived from multiple metrics?

Regards,
Chongfeng

> On June 17, 2022, at 10:12 PM, [email protected] wrote:
>
> Hi Joel,
>
> Thanks. We have discussed and replied to this question: CAN does not want to
> influence underlay routing. It just selects the service instance and uses the
> existing routing methods for transport.
>
> I think there might be some misunderstanding of the description in the
> discussion, rather than a difference in thinking.
>
> BTW, only the very first emails posting the issues are expected to cc rtgwg,
> to involve the people who are interested. We will go back to the dyncast
> mailing list itself for further discussion. Thanks for mentioning that.
>
> Regards,
> Peng
>
> [email protected]
>
> From: Joel Halpern <[email protected]>
> Date: 2022-06-17 21:35
> To: [email protected]; dyncast <[email protected]>
> CC: rtgwg <[email protected]>
> Subject: Re: [Dyncast] CAN BoF issues #7 #17 #32
>
> If CAN does instance selection and DSCP marking, then it can influence
> routing and select appropriate instances. It is an understandable,
> deployable, and probably scalable solution with the selection and marking
> deployed at an appropriate place. If that place is the PE, and we want to
> use a different IP address, then it probably uses a tunnel to deliver the
> packets. If that place is the end host, then it can do what it wants.
>
> However, if you expect the routing system to be including and respecting
> information about compute end point capabilities, and want the routing system
> to manage suitable server stickiness and all the other needed properties,
> then I think the version of CAN you are describing is a bad idea that will
> harm the infrastructure by mixing functionality in inappropriate places.
>
> Yours,
> Joel
>
> PS: I retain rtgwg on the copy for now, but as far as I can tell this
> discussion belongs exclusively on the dyncast list.
>
> On 6/17/2022 4:13 AM, [email protected] wrote:
>> Hi Dirk,
>>
>> For mode 1, CAN is only aware of computing information, because the basic
>> routing can select the 'best' path naturally.
>>
>> For mode 2, CAN could also learn more about the network path when the
>> computing node selection is done (for instance, SR policy, network slicing,
>> DetNet) and then make use of it. I don't think this will influence the
>> underlay routing; some apps may require a specific routing policy/strategy
>> even when there is no CAN service.
>>
>> CAN aims to provide a joint optimization service to specific applications.
>> The difference is whether to select the 'best' resource all the time,
>> or just select an 'appropriate' one based on more awareness and decision
>> making.
>>
>> Regards,
>> Peng
>> [email protected]
>>
>> From: Dirk Trossen <[email protected]>
>> Date: 2022-06-17 15:01
>> To: Linda Dunbar <[email protected]>; [email protected]; dyncast <[email protected]>
>> CC: rtgwg <[email protected]>; David R. Oran <[email protected]>; jefftant.ietf <[email protected]>
>> Subject: RE: [Dyncast] CAN BoF issues #7 #17 #32
>>
>> Hi Linda, Peng, all,
>>
>> Let us tease apart what “include the path selection” may mean, since the
>> nature of this inclusion may differ significantly.
>>
>> For this, let us assume a service instance S_1 as one of possibly several
>> for service S. S_1 may be reachable over a number of network paths, the
>> selection of some of which would significantly impact any compute-aware
>> selection of S_1 over the other available service instances for S. I can see
>> two modes of ‘including path selection’:
>>
>> 1. S_1 exposes two (or more) IP addresses, where each IP address
>>    reflects a path from the client to the exposed address. IP addresses may
>>    be exposed across more than one network operator, multi-homing the
>>    service instance. Here, ‘path selection’ is done indirectly by picking
>>    one IP address over all others, including the IP addresses of other
>>    service instances, and indeed, such indirect path selection may well be
>>    done through a metric that measures against (at least one) crucial
>>    path-related metric. But ultimately, the CAN provider still selects one
>>    of possibly many IP addresses, right? More importantly, it remains the
>>    task of the underlay routing infrastructure (again, which could include
>>    more than one network operator) to determine what it deems the ‘best’
>>    path to each of the IP addresses (including the multi-homed S_1
>>    addresses).
>> 2. Let’s stick with one IP address for S_1 now, but with still at least
>>    two possible paths to it, where the selection of one over any of the
>>    others could well impact the compute-aware suitability of S_1 over any
>>    of the other service instances. The problem here is that ‘including the
>>    path selection’ would mean impacting the routing to the single S_1 IP
>>    address such that the routing decision takes compute-awareness into
>>    account. The path selection here is not indirect but direct, together
>>    with the IP address (i.e., service instance endpoint) selection. What is
>>    required here is that the CAN provider and the underlay somehow work
>>    together in selecting one path over another (to the same IP address).
>>    That in turn would impact the overall routing decision for S_1’s IP
>>    address, and with it the underlay routing infrastructure, since the
>>    resulting (compute-aware) path configuration, in the form of suitable
>>    forwarding entries, needs distribution in the underlay infrastructure.
>>
>> I think we have to be clear which of the two options we see in the CAN scope,
>> and also whether I have missed any options here. As we can already see from
>> these two options, they have a significant impact on the architecture we may
>> envision for CAN and also on its solution adoption. From my side, I have
>> seen CAN mainly as an endpoint selection problem, so I understood ‘path
>> selection’ as an indirect one in the manner described in item 1. I just want
>> to throw the options out here to solicit feedback from the community, so
>> that we get a good understanding moving forward.
>>
>> Best,
>>
>> Dirk
>>
>> From: Dyncast [mailto:[email protected]] On Behalf Of Linda Dunbar
>> Sent: 15 June 2022 23:07
>> To: [email protected]; dyncast <[email protected]>
>> Cc: rtgwg <[email protected]>; David R. Oran <[email protected]>; jefftant.ietf <[email protected]>
>> Subject: Re: [Dyncast] CAN BoF issues #7 #17 #32
>>
>> Peng,
>>
>> For Issue #32, you said: “CAN does not compute path, it selects endpoints.”
>>
>> If CAN means Computing Aware Networking, it should include the path
>> selection. Maybe CAN is about selecting (or computing) the optimal paths
>> based on the combination of network conditions and the computing resources
>> available at the end points?
>>
>> My two cents,
>>
>> Linda
>>
>> From: Dyncast <[email protected]> On Behalf Of [email protected]
>> Sent: Monday, June 13, 2022 10:00 PM
>> To: dyncast <[email protected]>
>> Cc: rtgwg <[email protected]>; David R. Oran <[email protected]>; jefftant.ietf <[email protected]>
>> Subject: [Dyncast] CAN BoF issues #7 #17 #32
>>
>> Dear All,
>>
>> Here are the responses to issues #7 #17 #32; any comments are welcome!
>> The issues and responses are also copied to the questioner
>> (https://datatracker.ietf.org/doc/minutes-113-can/). We hope for further
>> suggestions and confirmation. Thanks!
>>
>> #7 This seems to assume conventional non-distributed applications just
>> running at the edge. What about modern frameworks like Sapphire? and Ray?
>> <https://github.com/CAN-IETF/CAN-BoF-ietf113/issues/7>
>> It would be good to understand the multi-site requirements of such
>> frameworks, which seem to run mainly in single DCs.
>>
>> #17 Whether the interests of the organization deploying the application and
>> the organization providing the network connectivity are aligned. Google
>> doesn't worry about this because they are both.
>> <https://github.com/CAN-IETF/CAN-BoF-ietf113/issues/17>
>> The question is more what the scope and semantics of the information are
>> that will need to cross organizational boundaries. This needs further study,
>> in particular when assuming a stakeholder division between service and
>> network provider.
>>
>> #32 How to effectively compute paths? Shall we put CPUs into account?
>> <https://github.com/CAN-IETF/CAN-BoF-ietf113/issues/32>
>> CAN does not compute path, it selects endpoints. Path selection (to a given
>> endpoint) is subject to the routing at the IP underlay. For selecting
>> endpoints, CPU information may be taken into account to achieve the
>> 'compute-awareness' that CAN strives for.
>>
>> You can also add your comments to any of them
>> (https://github.com/CAN-IETF/CAN-BoF-ietf113/issues).
>>
>>
>> Regards,
>> Peng
>>
>> [email protected]
>>
>> From: Linda Dunbar <[email protected]>
>> Date: 2022-05-11 06:11
>> To: [email protected]
>> Subject: [Dyncast] Categories of the CAN BoF issues
>>
>> CAN BoF proponents:
>>
>> Many thanks for creating the CAN BoF issue tracker on GitHub:
>> https://github.com/CAN-IETF/CAN-BoF-ietf113/issues/created_by/CAN-IETF?page=1&q=is%3Aopen+is%3Aissue+author%3ACAN-IETF
>>
>> I went through the issues captured on GitHub and sorted them into
>> groups. Some issues can be lumped together for discussion. There are
>> quite a few issues related to the requirements, which need to be clarified.
>>
>> Best Regards, Linda
>>
>>
>> Issues associated with Applications vs. Underlay networks:
>> · Consider not to load underlay network with application details. #35
>> · We have multiple upper layer applications. Do we have additional
>>   needs for routing (e.g. WG?) or we are using those applications and won't
>>   need such new WG? #30
>> · It needs application information too, so it can't just make a
>>   decision at the network layer. #23
>> · This is not striked as a routing problem; it's all service
>>   discovery that can be done in higher layers. #21
>> · 3GPP and URSP solve this based on UPF selection. It uses both
>>   endpoint + application. #20
>> · One overlay plane per application. Resources/metric specific to the
>>   plane. #19
>> · How does the application layer or the transport layer learn the
>>   network status to steer traffic? #16
>>
>> Need more clear requirements for CAN (to be addressed by
>> draft-liu-dyncast-ps-usecases):
>> · Need to understand if there are requirements to avoid extra messages
>>   or 1ms of latency #36
>> · Regarding the flow affinity, is it from network perspective or from
>>   application/computation perspective? #33
>> · How to effectively compute paths? Shall we put CPUs into account?
>>   #32
>> · What happens when the user moves? If so we also need to move
>>   application context. #25
>> · It can only move the services around as fast as it can update the
>>   routing plane, which comes back to the point about service discovery
>>   (waiting for convergence/distribution as opposed to just updating the SD
>>   server) #24
>> · Whether the interests of the organization deploying the application
>>   and the organization providing the network connectivity are aligned.
>>   Google doesn't worry about this because they are both. #17
>>   o The question is more what the scope and semantics of the information
>>     are that will need to cross organizational boundaries. This needs
>>     further study, in particular when assuming a stakeholder division
>>     between service and network provider.
>> · It seems impossible to satisfy that requirement simultaneously with
>>   the latency requirement. #15
>> · It wasn't clear how hard of a requirement session persistence
>>   is. #13
>>   o A session usually creates ephemeral state. If execution changes from
>>     one (e.g., virtualized) service instance to another, state/context
>>     needs transfer to another. Such required transfer of state/context
>>     makes it desirable to have session persistence (or instance affinity)
>>     as the default, removing the need for explicit context transfer, while
>>     also supporting an explicit state/context transfer (e.g., when metrics
>>     change significantly).
>> · Should it select UPF based on the application? Steering is done per
>>   user? or per application? #9
>> · This seems to assume conventional non-distributed applications just
>>   running at the edge. What about modern frameworks like Sapphire? and
>>   Ray? #7
>>   o It would be good to understand the multi-site requirements of such
>>     frameworks, which I have understood to mainly run in single DCs.
>> · Relation to 3GPP UPF #6
>> · Relation to ALTO #5
>> · Are the mobility issues and associated protocols also in scope?
>>   There are scenarios where routing alone would not be sufficient. #4
>> · What is the position in the edge location regarding to UPF? #3
>> · Is there some sort of authorization model so that an edge can
>>   indicate whether or not it will provide compute services? #2
>> · What is CNC and the relationship with CAN #1
>>
>> Measurement of the Computing Resources (to be addressed by
>> draft-du-computing-resource-representation):
>> · It is hard to use existing work to measure the computation, but we
>>   can optimize the latency through the performance monitoring. We have
>>   performance/measurement matrix over there. #34
>> · Clarifications on the computing resource, its requirements and
>>   characteristics would be helpful. #27
>> · Each application may have a different definition of "resources"
>>   these then have to be boiled down into a single topology Network Aware
>>   Computing (NAC! :) does scale #14
>> · Is computing resource measurable? #10
>>   o It is, and how to use the measurement would be solution related. See
>>     IFIP Networking 2022 paper on how to simply expose “computing
>>     capability” and achieve better steering with such simple measure.
>> · Why compute resource is different with other resources? #8
>>
>> Load Balance based solutions:
>> · The point is that we need a standardized LB protocol #18
>> · The LB as part of the application itself is superior (part of the
>>   distributed application itself is to obtain and keep updating the "best"
>>   unicast location to use). #22
>> · If there is anything missing from current lbs that would prevent
>>   their use as-is? other than there is for market reasons no interop
>>   standard between different lbs? #12
>> · For the load balance, should it learn the network’s status? #11
>>
>> Dyncast based Solution issues:
>> · For Dyncast, when the time is short, is it possible for the router
>>   to decide the routing? It is too fast. #31
>> · Is dyncast proposed to encapsulate? #29
>> · Will CAN dyncast impact each and every router? How to avoid loops?
>>   #28
>> · What's the assumed scale of a D-router? 10 ^ 6 sessions? 100^ 8?
>>   What's the assumed update rate? !Gb? 1Tb? #26
>>
>> _______________________________________________
>> rtgwg mailing list
>> [email protected]
>> https://www.ietf.org/mailman/listinfo/rtgwg
