Re: [alto] ALTO service query spanning multiple domains (ECS)
Hi, Richard:

From: alto [mailto:alto-boun...@ietf.org] On Behalf Of Y. Richard Yang
Sent: September 14, 2022 1:47
To: IETF ALTO
Subject: Re: [alto] ALTO service query spanning multiple domains (ECS)

Hi all,

There were quite extensive discussions on the reactive/on-demand, multi-domain design below during the weekly meeting early this morning, and here is a brief summary of the key points.

- The design can be decomposed into two components, routing composition and cost metric composition, from single domains to multiple domains:

    (metricVal, egressIP) = alto-server.query(ingressIP, srcIP, dstIP, metric)

[Qin Wu] I feel this formula doesn't explicitly indicate that this is a multi-domain setting. When srcIP and dstIP belong to different domains, we need to calculate ingressIP and egressIP in each domain one by one. Also, metricVal is not an end-to-end metric; it is an edge-to-edge metric.

    =>
    egressIP = query(ingressIP, pkt-attributes)            // Q1
    metricVal = query(ingressIP, pkt-attributes, metric)   // Q2

- It turns out that there can be multiple potential design points:

For Q1, there can be two designs:
- D1.1 Q1 can be out of scope for ALTO, for example, by using a routing system to conduct the query.
[Qin Wu] If each ALTO server can calculate the edge-to-edge metric from the ingress node to the egress node in its domain, why bother the routing system to conduct the query? The routing system can provide measurements and inject the measurement results into the ALTO server. No?
- D1.2 It can be a new service/metric type in ALTO.
[Qin Wu] Yes, you can introduce a new service/metric type in ALTO; we have already defined a similar service/metric type in PCE, see RFC8233. In addition, as I clarified in the previous email, we need to distinguish server-to-server communication within a domain from communication across domains. This requires an ALTO protocol extension, e.g., an inter-domain flag indication. I am not sure a stitching label is required; in the PCE use case, a stitching label is required to set up an end-to-end path across domains.
For Q2, there can also be two designs:
- D2.1 Use the current RFC7285 for Q2, specifying ingressIP as src in the ECS query. This design is good if a network uses destination-based routing.
- D2.2 Add an extension to ALTO ECS to include general packet (and pseudo packet) attributes.

For the extension design path, it can help to (1) convey the matching requirements in the IRD (for example, what packet attributes are to be included) and (2) have the response indicate the equivalence classes (matching masks).
[Qin Wu] I am not sure caching is required in this case, and I am worried that matching masks might be error prone. I suggest investigating how the Backward-Recursive PCE-Based Computation (BRPC) procedure defined in RFC5441 can be used to solve your problem instead.

The design team will proceed with the simple designs first to push forward the deployment, but will document the proceeding.

Cheers,
Richard

On Mon, Sep 12, 2022 at 10:50 PM Y. Richard Yang <y...@cs.yale.edu> wrote:

Hi all,

During the weekly meeting last week, we went over the details of deploying ALTO in a multi-domain setting, say the FTS/Rucio setting supporting the TCN deployment [1]. Below is the endpoint cost service (ECS) case, and it was suggested that we post it to the WG mailing list to update the WG and get potential feedback.

Problem: An ALTO client queries the endpoint cost from srcIP to dstIP for a given performance metric (e.g., latency). Consider the case that the srcIP and dstIP belong to different networks, with the whole layer-3 path as the list [ip[0], ip[1], ..., ip[N]], where ip[0] = srcIP and ip[N] = dstIP. Define Net(ip) as the function that maps an IP address to the network that owns the IP---ignore complexities such as anycast, since the deployment does not have this case. Then Net(srcIP) != Net(dstIP) if it is multi-domain. Consider the initial deployment in which we have only one ALTO server for each network; that is, it provides ALTO service only for Net(srcIP) == Net(dstIP).
Then, there is not a single ALTO server that can provide the answer.

Basic solution (one src-dst flow): Map the list [ip[0], ..., ip[N]] to a list of segments, where each segment starts with an IP address and ends with the first IP address in the sequence that leaves the network of the start IP address. Hence, the basic query framework at an aggregation ALTO client:

- alto-ecs(srcIP, dstIP, metric)
    metrics = EMPTY
    ingressIP = srcIP
    do {
      alto-server = server-discovery(ingressIP)
      (metricVal, egressIP) = alto-server.query(ingressIP, srcIP, dstIP, metric)
      metrics.add(metricVal)
      ingressIP = egressIP
    } while (egressIP != dstIP)

The preceding assumes a procedure that collects segment attributes; the composition can be done in a single pass using a metric-dependent function (e.g., latency is addition, and bw is min).

Multi-flow queries: ALTO ECS supports the qu
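
The basic single-flow loop quoted in this message can be sketched in Python as follows. This is only an illustration under stated assumptions: `StubAltoServer`, the `discover` callable, the sample addresses, and the `max_segments` guard are hypothetical and not part of the thread or of any ALTO specification; only the loop structure and the composition rule (latency adds, bw takes the minimum) come from the message.

```python
from typing import Callable, Dict, List, Tuple

# Metric-dependent composition from the message: latency adds, bandwidth takes the minimum.
COMPOSE: Dict[str, Callable[[List[float]], float]] = {"latency": sum, "bw": min}


class StubAltoServer:
    """Hypothetical per-domain server; a real one would answer over the ALTO protocol."""

    def __init__(self, table: Dict[Tuple[str, str], Tuple[float, str]]) -> None:
        # (ingressIP, metric) -> (metricVal, egressIP)
        self.table = table

    def query(self, ingress_ip: str, src_ip: str, dst_ip: str, metric: str) -> Tuple[float, str]:
        # The stub keys on the ingress only; src_ip and dst_ip are accepted but ignored.
        return self.table[(ingress_ip, metric)]


def alto_ecs(src_ip: str, dst_ip: str, metric: str,
             discover: Callable[[str], StubAltoServer],
             max_segments: int = 16) -> float:
    """Walk the path segment by segment and compose the per-segment metric values."""
    metrics: List[float] = []
    ingress_ip = src_ip
    for _ in range(max_segments):  # loop guard; an addition, not in the original pseudocode
        server = discover(ingress_ip)
        metric_val, egress_ip = server.query(ingress_ip, src_ip, dst_ip, metric)
        metrics.append(metric_val)
        if egress_ip == dst_ip:
            return COMPOSE[metric](metrics)
        ingress_ip = egress_ip
    raise RuntimeError("segment limit reached before arriving at dst_ip")


# Two hypothetical domains: the first hands off at 10.1.0.1, the second delivers to 10.1.0.9.
server_a = StubAltoServer({("10.0.0.1", "latency"): (5.0, "10.1.0.1"),
                           ("10.0.0.1", "bw"): (100.0, "10.1.0.1")})
server_b = StubAltoServer({("10.1.0.1", "latency"): (7.0, "10.1.0.9"),
                           ("10.1.0.1", "bw"): (40.0, "10.1.0.9")})
SERVERS = {"10.0.0.1": server_a, "10.1.0.1": server_b}

latency = alto_ecs("10.0.0.1", "10.1.0.9", "latency", SERVERS.__getitem__)
bandwidth = alto_ecs("10.0.0.1", "10.1.0.9", "bw", SERVERS.__getitem__)
print(latency, bandwidth)
```

In this sketch the per-segment values are edge-to-edge, as Qin notes, and only the composed result is end-to-end.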
Re: [alto] ALTO service query spanning multiple domains (ECS)
Richard and All:

Coming back to this multi-domain setting issue, I feel the first problem is about the inter-domain ALTO query. When we assume each domain has one ALTO server, is the sequence of domains traversed administratively predetermined or discovered? From the source domain to the destination domain, there may be multiple intermediate domains in parallel, e.g., , , so that you have two domain sequence sets. How do we tackle this case?

How is an inter-domain query exchanged across domains? I think server-to-server communication is required. In this case, one server can be seen as the client, and the other server plays the role of the ALTO server. So the question is how an ALTO server knows whether server-to-server communication happens within a single domain or across domains. I believe an ALTO protocol extension is required, say inter-domain flags to distinguish single-domain communication from server-to-server communication across domains. A similar idea can be seen in section 2.3 of draft-ietf-pce-stateful-interdomain.

Secondly, if we really want to support the multi-domain setting, cross-domain topology confidentiality needs to be considered. We can explore how JOSE can be used to carry ALTO payloads across a collection of intermediate servers in different domains. RFC5520 (https://datatracker.ietf.org/doc/rfc5520/) defines a path key to hide the contents of a segment of a path. Maybe JOSE can be used to provide similar functionality.

Regarding multi-flow queries, using caches seems to provide an optimization for the first problem but also adds a lot of complexity. Without an end-to-end stitching label, it seems hard to get it implemented. If we want to tackle this hard problem, I suggest taking a look at the Backward-Recursive PCE-Based Computation (BRPC) procedure defined in RFC5441 and the inter-domain path management defined in section 5 of draft-ietf-pce-stateful-interdomain.
Regarding the metric-dependent function, you can refer to section 3.3 of RFC8233 for more details. Do we want to define more new objective functions?

One more question about the pseudo code below:

    alto-server = server-discovery(ingressIP)

Is this server discovery the cross-domain server discovery defined in RFC8686?

-Qin

From: alto [mailto:alto-boun...@ietf.org] On Behalf Of Y. Richard Yang
Sent: September 13, 2022 10:51
To: IETF ALTO
Subject: [alto] ALTO service query spanning multiple domains (ECS)

Hi all,

During the weekly meeting last week, we went over the details of deploying ALTO in a multi-domain setting, say the FTS/Rucio setting supporting the TCN deployment [1]. Below is the endpoint cost service (ECS) case, and it was suggested that we post it to the WG mailing list to update the WG and get potential feedback.

Problem: An ALTO client queries the endpoint cost from srcIP to dstIP for a given performance metric (e.g., latency). Consider the case that the srcIP and dstIP belong to different networks, with the whole layer-3 path as the list [ip[0], ip[1], ..., ip[N]], where ip[0] = srcIP and ip[N] = dstIP. Define Net(ip) as the function that maps an IP address to the network that owns the IP---ignore complexities such as anycast, since the deployment does not have this case. Then Net(srcIP) != Net(dstIP) if it is multi-domain. Consider the initial deployment in which we have only one ALTO server for each network; that is, it provides ALTO service only for Net(srcIP) == Net(dstIP). Then, there is not a single ALTO server that can provide the answer.

Basic solution (one src-dst flow): Map the list [ip[0], ..., ip[N]] to a list of segments, where each segment starts with an IP address and ends with the first IP address in the sequence that leaves the network of the start IP address.
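
To make the segment mapping just described concrete, here is a minimal Python sketch. The prefix table, the network names, and the `net` helper are hypothetical stand-ins for Net(ip); only the splitting rule itself is taken from the message.

```python
import ipaddress
from typing import Dict, List, Tuple

# Hypothetical prefix-to-network table standing in for Net(ip).
PREFIX_OWNER: Dict[str, str] = {
    "10.0.0.0/16": "NetA",
    "10.1.0.0/16": "NetB",
    "10.2.0.0/16": "NetC",
}


def net(ip: str) -> str:
    """Return the network that owns ip (no anycast, as the message assumes)."""
    addr = ipaddress.ip_address(ip)
    for prefix, owner in PREFIX_OWNER.items():
        if addr in ipaddress.ip_network(prefix):
            return owner
    raise KeyError(f"no owner known for {ip}")


def to_segments(path: List[str]) -> List[Tuple[str, str]]:
    """Split a layer-3 path into (start, end) segments.

    Each segment ends at the first IP that leaves the start IP's network;
    the last segment ends at the destination.
    """
    segments: List[Tuple[str, str]] = []
    start = 0
    for i in range(1, len(path)):
        if net(path[i]) != net(path[start]):
            segments.append((path[start], path[i]))
            start = i
    if start < len(path) - 1:
        # Remaining hops stay inside the destination network.
        segments.append((path[start], path[-1]))
    return segments


path = ["10.0.0.1", "10.0.0.254", "10.1.0.1", "10.1.0.254", "10.2.0.1", "10.2.0.9"]
print(to_segments(path))
```

Note that the end of one segment is the start of the next, which is why the query loop can feed each egressIP back in as the next ingressIP.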
Hence, the basic query framework at an aggregation ALTO client:

- alto-ecs(srcIP, dstIP, metric)
    metrics = EMPTY
    ingressIP = srcIP
    do {
      alto-server = server-discovery(ingressIP)
      (metricVal, egressIP) = alto-server.query(ingressIP, srcIP, dstIP, metric)
      metrics.add(metricVal)
      ingressIP = egressIP
    } while (egressIP != dstIP)

The preceding assumes a procedure that collects segment attributes; the composition can be done in a single pass using a metric-dependent function (e.g., latency is addition, and bw is min).

Multi-flow queries: ALTO ECS supports the querying of multiple src-dst pairs. A simple solution is to query each src-dst pair one by one. Such per-pair querying is necessary because the routing can depend on packet attributes (srcIP, dstIP) and a pseudo packet attribute (ingressIP), so the ALTO client cannot reuse the results. To allow reuse (both in multi-flow queries and in caching of past queries), it helps if the ALTO server indicates equivalence classes, which Kai and Jensen investigated.

A revision of the protocol using caching and equivalence classes is:

alto-server-cache: indexed by ALTO server, pairs
- alto-ecs(srcIP, dstIP, metric)
    metrics = EMPTY
    ingressIP = srcIP
    do {
      alto-server = server-discovery(in
Re: [alto] ALTO service query spanning multiple domains (ECS)
Hi all,

There were quite extensive discussions on the reactive/on-demand, multi-domain design below during the weekly meeting early this morning, and here is a brief summary of the key points.

- The design can be decomposed into two components, routing composition and cost metric composition, from single domains to multiple domains:

    (metricVal, egressIP) = alto-server.query(ingressIP, srcIP, dstIP, metric)
    =>
    egressIP = query(ingressIP, pkt-attributes)            // Q1
    metricVal = query(ingressIP, pkt-attributes, metric)   // Q2

- It turns out that there can be multiple potential design points:

For Q1, there can be two designs:
- D1.1 Q1 can be out of scope for ALTO, for example, by using a routing system to conduct the query.
- D1.2 It can be a new service/metric type in ALTO.

For Q2, there can also be two designs:
- D2.1 Use the current RFC7285 for Q2, specifying ingressIP as src in the ECS query. This design is good if a network uses destination-based routing.
- D2.2 Add an extension to ALTO ECS to include general packet (and pseudo packet) attributes.

For the extension design path, it can help to (1) convey the matching requirements in the IRD (for example, what packet attributes are to be included) and (2) have the response indicate the equivalence classes (matching masks).

The design team will proceed with the simple designs first to push forward the deployment, but will document the proceeding.

Cheers,
Richard

On Mon, Sep 12, 2022 at 10:50 PM Y. Richard Yang wrote:

> Hi all,
>
> During the weekly meeting last week, we went over the details of deploying ALTO in a multi-domain setting, say the FTS/Rucio setting supporting the TCN deployment [1]. Below is the endpoint cost service (ECS) case, and it was suggested that we post it to the WG mailing list to update the WG and get potential feedback.
>
> Problem: An ALTO client queries the endpoint cost from srcIP to dstIP for a given performance metric (e.g., latency).
> Consider the case that the srcIP and dstIP belong to different networks, with the whole layer-3 path as the list [ip[0], ip[1], ..., ip[N]], where ip[0] = srcIP and ip[N] = dstIP. Define Net(ip) as the function that maps an IP address to the network that owns the IP---ignore complexities such as anycast, since the deployment does not have this case. Then Net(srcIP) != Net(dstIP) if it is multi-domain. Consider the initial deployment in which we have only one ALTO server for each network; that is, it provides ALTO service only for Net(srcIP) == Net(dstIP). Then, there is not a single ALTO server that can provide the answer.
>
> Basic solution (one src-dst flow): Map the list [ip[0], ..., ip[N]] to a list of segments, where each segment starts with an IP address and ends with the first IP address in the sequence that leaves the network of the start IP address. Hence, the basic query framework at an aggregation ALTO client:
>
> - alto-ecs(srcIP, dstIP, metric)
>     metrics = EMPTY
>     ingressIP = srcIP
>     do {
>       alto-server = server-discovery(ingressIP)
>       (metricVal, egressIP) = alto-server.query(ingressIP, srcIP, dstIP, metric)
>       metrics.add(metricVal)
>       ingressIP = egressIP
>     } while (egressIP != dstIP)
>
> The preceding assumes a procedure that collects segment attributes; the composition can be done in a single pass using a metric-dependent function (e.g., latency is addition, and bw is min).
>
> Multi-flow queries: ALTO ECS supports the querying of multiple src-dst pairs. A simple solution is to query each src-dst pair one by one. Such per-pair querying is necessary because the routing can depend on packet attributes (srcIP, dstIP) and a pseudo packet attribute (ingressIP), so the ALTO client cannot reuse the results. To allow reuse (both in multi-flow queries and in caching of past queries), it helps if the ALTO server indicates equivalence classes, which Kai and Jensen investigated.
>
> A revision of the protocol using caching and equivalence classes is:
>
> alto-server-cache: indexed by ALTO server, pairs
> - alto-ecs(srcIP, dstIP, metric)
>     metrics = EMPTY
>     ingressIP = srcIP
>     do {
>       alto-server = server-discovery(ingressIP)
>       if (alto-server-cache.match(alto-server, ingressIP, srcIP, dstIP))
>         use cache results
>       else
>         (metricVal, egressIP; ingressIPMask, srcIPMask, dstIPMask)
>           = alto-server.query(ingressIP, srcIP, dstIP, metric)
>         alto-server-cache.add(alto-server, , ,
>       metrics.add(metricVal)
>       ingressIP = egressIP
>     } while (egressIP != dstIP)
>
> The mask design is a special case. For the general case, the most flexible equivalence class may be using predicates (e.g., supporting identifying the lower entries of longest prefix matching). It is an issue that can benefit from more benchmarking, or if there are any related pointers, the team will appr
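
The mask-based cache in the quoted revision can be sketched in Python as below. This is a hedged illustration, not the design team's implementation: the class name `AltoServerCache`, the server name, and the sample prefixes are hypothetical, masks are modeled as CIDR prefixes via the standard `ipaddress` module, and keying entries on the metric as well is an added assumption (the quoted `match` call does not take the metric).

```python
import ipaddress
from typing import Dict, List, Optional, Tuple

Result = Tuple[float, str]  # (metricVal, egressIP)


class AltoServerCache:
    """Per-server cache whose entries match on prefixes (the 'mask' special case)."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[tuple]] = {}

    def add(self, server: str, ingress_mask: str, src_mask: str, dst_mask: str,
            metric: str, result: Result) -> None:
        # Store the masks returned by the server as parsed networks.
        entry = (
            ipaddress.ip_network(ingress_mask),
            ipaddress.ip_network(src_mask),
            ipaddress.ip_network(dst_mask),
            metric,
            result,
        )
        self._entries.setdefault(server, []).append(entry)

    def match(self, server: str, ingress_ip: str, src_ip: str, dst_ip: str,
              metric: str) -> Optional[Result]:
        # A hit requires all three addresses to fall inside the cached masks.
        ingress, src, dst = (ipaddress.ip_address(ip) for ip in (ingress_ip, src_ip, dst_ip))
        for ingress_net, src_net, dst_net, cached_metric, result in self._entries.get(server, []):
            if (cached_metric == metric and ingress in ingress_net
                    and src in src_net and dst in dst_net):
                return result
        return None


cache = AltoServerCache()
# Assumption: this is a transit segment, so the cached egress IP holds for every
# destination inside the dst mask.
cache.add("alto.net-a.example", "10.0.0.1/32", "10.0.0.0/24", "10.2.0.0/16",
          "latency", (5.0, "10.1.0.1"))

hit = cache.match("alto.net-a.example", "10.0.0.1", "10.0.0.77", "10.2.3.4", "latency")
miss = cache.match("alto.net-a.example", "10.0.0.1", "10.9.0.1", "10.2.3.4", "latency")
print(hit, miss)
```

A predicate-based equivalence class, as the message suggests for the general case, would replace the three prefix checks with an arbitrary match function per entry.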
Re: [alto] ALTO service query spanning multiple domains (ECS)
Hi Richard and all,

Thanks for the heads up. Our approach can be considered a proactive mode of the multi-domain query described in the last email. Instead of searching for each (srcIP, dstIP) pair, the algorithm works at the granularity of IP prefixes.

We consider the case that each domain operates an ALTO server, and that the ALTO server can be discovered with the IP address of the border router. It is also presumed that each domain D knows the IP prefixes (P_(D,1), ..., P_(D,K)) owned by itself (i.e., Net(P_(D,x)) = D) and the ingress port (i.e., access point) of the prefixes. Currently the algorithm only works for prefix-based routing but can be extended to handle tunnels, as reported in a related study [KATRA].

There are two independent processes: 1. prefix path discovery, and 2. ALTO queries.

The prefix path discovery works as the reverse process of BGP. In the beginning, each domain D_i announces to its neighbors D_j 1) the cross product of Ps (prefixes owned by D_i) and Pd (prefixes announced to D_i from D_j), and 2) the ingress port of each valid (Ps, Pd) pair to D_j, which is the peer of the egress port for each (Ps, Pd) pair in D_i. Upon the message that some (Ps, Pd) arrives at D_j at ingress, D_j updates its local cross products and finds the ingress of (Ps, Pd) in the next domain. This process is then repeated in each domain until convergence (guaranteed if the routing is correct, i.e., no loops or blackholes).

The pseudo code below shows the high-level process but simplifies some details (like handling withdrawals triggered by local network updates).
Initialize (D_i):
  // Construct the initial (srcPrefix, ingress, dstPrefix) cross products
  P_in = {prefixes owned by D_i}
  Ingress_in = {ingress for each prefix in D_i}
  P_out = {prefixes announced from all neighbors}
  Local_in = cross-product(zip(P_in, Ingress_in), P_out)
  Local_out = {} // empty set
  // Announce to neighbors
  Synchronize (D_i, Local_in)

Synchronize (D_i, CrossProduct):
  // Get the egress port for each (srcPrefix, ingress, dstPrefix) pair
  // Note that the prefixes may be split or merged based on the local FIB
  M = lookup-local-fib-and-group-by-egress-port(CrossProduct)
  // Announce to neighbors
  foreach (egress, {(Ps, Pd)}) in M:
    next_ingress = get-peer-address(egress)
    D_j = server-discovery(next_ingress)
    Local_out = Local_out U {(Ps, egress, Pd)}
    emit-message(D_j, {(Ps, Pd)}, next_ingress)

Update (D_i, {(Ps, Pd)}, ingress):
  // When there are updates from neighbors that (Ps, Pd) arrives in D_i from ingress,
  // construct a new tuple (Ps, ingress, Pd)
  Delta = {(Ps, ingress, Pd)}
  Local_in = Local_in U Delta
  // Find the local paths for {(Ps, ingress, Pd)} and (iteratively) announce the pairs to neighbors
  Synchronize (D_i, Delta)

After the prefix path discovery phase, each domain can determine whether a given (srcIP, dstIP) pair traverses its network and, if so, what the path is. Thus, the query part becomes relatively simple:

Global-Query({(srcIP, dstIP)}):
  foreach D_i:
    metrics_i = Local-Query(D_i, {(srcIP, dstIP)})
  metrics = merge({metrics_i})
  return metrics

Local-Query(D_i, {(srcIP, dstIP)}):
  foreach (srcIP_k, dstIP_k):
    ingress_k = lookup-ingress(Local_in, srcIP_k, dstIP_k)
    path_k = look-up-fib-local-path(srcIP_k, ingress_k, dstIP_k)
  metrics_i = extract_metrics({path_k})
  return metrics_i

An improvement is to not broadcast the information of the complete prefix set but only those that can be queried.
This can be done by replacing

  P_in = {prefixes owned by D_i}
  Ingress_in = {ingress for each prefix in D_i}
  P_out = {prefixes announced from all neighbors}

with

  P_in = {prefixes owned by D_i and in Interested_in}
  Ingress_in = {ingress for each prefix in D_i}
  P_out = {prefixes announced from all neighbors and in Interested_out}

where Interested_in and Interested_out are the prefixes declared by the clients, which contain all IP addresses that will appear in an ALTO query.

[KATRA]: https://www.usenix.org/conference/nsdi22/presentation/beckett

Best regards,
Kai and Jensen

On Tue, Sep 13, 2022 at 10:51 AM Y. Richard Yang wrote:

> Hi all,
>
> During the weekly meeting last week, we went over the details of deploying ALTO in a multi-domain setting, say the FTS/Rucio setting supporting the TCN deployment [1]. Below is the endpoint cost service (ECS) case, and it was suggested that we post it to the WG mailing list to update the WG and get potential feedback.
>
> Problem: An ALTO client queries the endpoint cost from srcIP to dstIP for a given performance metric (e.g., latency). Consider the case that the srcIP and dstIP belong to different networks, with the whole layer-3 path as the list [ip[0], ip[1], ..., ip[N]], where ip[0] = srcIP and ip[N] = dstIP. Define Net(ip) as the function that maps an IP address to the network that owns the IP---ignore complexities such as anycast, since the deployment does not have this case. Then Net(srcIP)