Maybe I missed something, but when I tried pinning logical routers to specific chassis, E/W traffic was still handled in a distributed way (from the originating chassis straight to the destination chassis, without going through the router's chassis). That chassis was only used for N/S traffic. But maybe I got something wrong.
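To check my understanding of the dst-IP dispatch described in the quoted explanation below, here is a minimal C sketch. It assumes that modulo_n selection is simply a hash of the destination address taken modulo the number of links. The names hash_ipv4_dst and pick_link are hypothetical, and the FNV-1a hash is only a stand-in: it is not the hash OVS computes, so the link chosen for a given address will not match the real datapath.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in hash: FNV-1a over the four bytes of an IPv4 address.
 * Assumption for illustration only; this is NOT the hash OVS uses. */
uint32_t
hash_ipv4_dst(uint32_t dst_ip)
{
    uint32_t h = 2166136261u;           /* FNV-1a 32-bit offset basis. */

    for (int i = 0; i < 4; i++) {
        h ^= (dst_ip >> (8 * i)) & 0xff;
        h *= 16777619u;                 /* FNV-1a 32-bit prime. */
    }
    return h;
}

/* modulo_n style selection (hypothetical helper): the link index is the
 * hash of the destination address modulo the number of candidate links. */
uint32_t
pick_link(uint32_t dst_ip, uint32_t n_links)
{
    assert(n_links > 0);
    return hash_ipv4_dst(dst_ip) % n_links;
}
```

With two links, every packet to the same destination address gets the same link index, which is consistent with the point that foo1 talking only to alice1 always ends up on a single path.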
On Wed, Sep 20, 2017 at 4:48 PM, Gao Zhenyu <[email protected]> wrote:

> "But, if an ovn port in foo (chassis A) wants to talk to alice1 (chassis B),
> wouldn't all that E/W routing happen virtually, with the end result being
> just a tunneled packet between chassis A and chassis B?"
> [Right now the hash function is based on the dst IP, so if foo1 only talks
> to alice1, it is just the tunneled packet between chassis A and chassis B.]
>
> The benefit is that if you have two ovn-routers and those routers are ONLY
> deployed on chassis C and chassis D, the traffic can be separated onto two
> paths automatically. Otherwise you need to configure static rules one by
> one to separate the traffic.
> To make a long story short, you can also do the same thing by configuring
> numerous static rules to separate traffic, but multipath can do it
> automatically.
>
> 2017-09-20 22:08 GMT+08:00 Miguel Angel Ajo Pelayo <[email protected]>:
>
>> I forgot to say thank you very much for the explanation and diagrams.
>>
>> On Wed, Sep 20, 2017 at 4:07 PM, Miguel Angel Ajo Pelayo
>> <[email protected]> wrote:
>>
>>> But, if an ovn port in foo (chassis A) wants to talk to alice1 (chassis B),
>>> wouldn't all that E/W routing happen virtually, with the end result being
>>> just a tunneled packet between chassis A and chassis B?
>>>
>>> What's the benefit of multipath there if the possible failing link is
>>> always the connection between chassis A and chassis B?
>>>
>>> I suspect there's something I'm missing in the picture.
>>>
>>> On Wed, Sep 20, 2017 at 3:49 PM, Gao Zhenyu <[email protected]> wrote:
>>>
>>>> You can take a look at this patch, which implements a testcase:
>>>> https://patchwork.ozlabs.org/patch/815475/
>>>>
>>>> In the testcase, we have R1, R2, R3.
>>>>
>>>> R1 and R2 are connected to each other via LS "join" in the
>>>> 20.0.0.0/24 network.
>>>> R1 and R3 are connected to each other via LS "join2" in the
>>>> 20.0.0.0/24 network.
>>>> R1 has switch foo (192.168.1.0/24) connected to it.
>>>> R2 and R3 have alice (172.16.1.0/24) connected to them.
>>>> R2 and R3 are gateway routers.
>>>>
>>>> A packet sent to alice1/alice2 from foo has multiple paths to its
>>>> destination:
>>>> 1. foo-->R1-->join-->R2-->alice.
>>>> 2. foo-->R1-->join2-->R3-->alice.
>>>>
>>>> The testcase simulates two packets, one with destination 172.16.1.2
>>>> and the other with destination 172.16.1.4. The multipath configured
>>>> in R1 can separate that traffic between R2 and R3. In the end, the
>>>> 172.16.1.2 packet travels path 2 and the 172.16.1.4 packet travels
>>>> path 1.
>>>>
>>>> +------+
>>>> | foo  |
>>>> +------+
>>>>    |
>>>>    |
>>>> +------+
>>>> |  R1  |---------+
>>>> +------+         |
>>>>    |             |
>>>>    |             |
>>>> +------+  +-------+
>>>> | join |  | join2 |
>>>> +------+  +-------+
>>>>    |             |
>>>>    |             |
>>>> +------+  +-------+
>>>> |  R2  |  |  R3   |
>>>> +------+  +-------+
>>>>    |             |
>>>>    |             |
>>>> +-----------------+
>>>> |      alice      |
>>>> +-----------------+
>>>>    |             |
>>>>  alice1       alice2
>>>>
>>>> Please let me know if you have any questions about it. :)
>>>>
>>>> Thanks
>>>> Zhenyu Gao
>>>>
>>>> 2017-09-20 20:58 GMT+08:00 Miguel Angel Ajo Pelayo <[email protected]>:
>>>>
>>>>> Can you share an example of how this would benefit E/W routing? I'm
>>>>> just not seeing the specific use case myself, out of ignorance.
>>>>>
>>>>> It'd be great if you could explain how it would work between several
>>>>> ports in the networks and routers (maybe a diagram?), otherwise I
>>>>> can't be really helpful reviewing :)
>>>>>
>>>>> Cheers, and thanks for the patience.
>>>>>
>>>>> On Wed, Sep 20, 2017 at 12:25 PM, Gao Zhenyu <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Thanks for the suggestions!
>>>>>>
>>>>>> Not every logical port has a real ofp_port connected to it, and the
>>>>>> bundle_load/bundle actions need a real ovs port.
>>>>>> This is especially true for ovn router ports: all router ports are
>>>>>> virtual ports, which are just a number/reg in our ovs-flows.
>>>>>>
>>>>>> This implementation of multipath can separate ovn east-west traffic;
>>>>>> it helps dispatch traffic to gateways and routers easily.
>>>>>>
>>>>>> For south-north traffic, we can have the bundle/bundle_load action
>>>>>> consider the remote tunnel up/down status. I would like to do it
>>>>>> step by step and implement that in my next patch series.
>>>>>>
>>>>>> Thanks
>>>>>> Zhenyu Gao
>>>>>>
>>>>>> 2017-09-20 17:53 GMT+08:00 Miguel Angel Ajo Pelayo <[email protected]>:
>>>>>>
>>>>>>> I'm not very familiar with multipath implementations,
>>>>>>>
>>>>>>> but would it be possible to use the bundle() output action with the
>>>>>>> hrw algorithm instead of a multipath calculation into a register?
>>>>>>>
>>>>>>> I say this because if you look at lib/multipath.c and lib/bundle.c
>>>>>>> you will find that bundle.c is going to consider the up/down status
>>>>>>> (slave_enabled check) of the links.
>>>>>>>
>>>>>>> That way the controller doesn't need to modify any flow based on
>>>>>>> link status.
>>>>>>>
>>>>>>> On Wed, Sep 20, 2017 at 5:45 AM, Gao Zhenyu <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks for the questions.
>>>>>>>>
>>>>>>>> The multipath_port can be set via ovn-nbctl.
>>>>>>>> Like: ovn-nbctl -- --id=@lrt create Logical_Router_Static_Route
>>>>>>>> ip_prefix=0.0.0.0/0 nexthop=10.88.77.1 multipath_port=[mp1,mp2] --
>>>>>>>> add Logical_Router edge1 static_routes @lrt
>>>>>>>> This patch doesn't implement an ovn-nbctl command to configure
>>>>>>>> multipath routing, because I am still considering reusing nexthop
>>>>>>>> or output_port (turning them into array entries) and want to
>>>>>>>> collect suggestions on it.
>>>>>>>>
>>>>>>>> About the status of the next hop, I would like to introduce
>>>>>>>> bundle_load and bfd to handle it later.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Zhenyu Gao
>>>>>>>>
>>>>>>>> 2017-09-20 11:13 GMT+08:00 <[email protected]>:
>>>>>>>>
>>>>>>>>> How to configure multipath_port in static_route?
>>>>>>>>> I think the multipath can be figured out from the existing data of
>>>>>>>>> static_route, so we may not need to add this multipath_port column.
>>>>>>>>>
>>>>>>>>> And I think we should add a status column to indicate the nexthop
>>>>>>>>> state. When some of the nexthops in a multipath are down, ovn
>>>>>>>>> should change the corresponding flows.
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> Zhenyu Gao <[email protected]>
>>>>>>>>> Sender: [email protected]
>>>>>>>>> 2017/09/19 19:37
>>>>>>>>>
>>>>>>>>> To: [email protected], [email protected],
>>>>>>>>> [email protected], [email protected], [email protected],
>>>>>>>>> Cc:
>>>>>>>>> Subject: [ovs-dev] [PATCH v1 1/3] Add multipath static router in
>>>>>>>>> OVN northd and north-db
>>>>>>>>>
>>>>>>>>> 1. ovn-nb.ovsschema was updated to add new field multipath_port.
>>>>>>>>> 2. Add multipath feature in ovn-northd part. northd generates multipath
>>>>>>>>> flows to dispatch traffic by using packet's IP dst address if user set
>>>>>>>>> Logical_Router_Static_Route's multipath_port with ports.
>>>>>>>>> 3. Add new table(lr_in_multipath) in ovn-northd's router ingress stages
>>>>>>>>> to dispatch traffic to ports.
>>>>>>>>> 4. Add multipath flow in Table 5(lr_in_ip_routing) and store hash result
>>>>>>>>> into reg0. reg9[2] was used to indicate packet which need dispatching.
>>>>>>>>> 5. Add multipath feature description in ovn/northd/ovn-northd.8.xml
>>>>>>>>> and ovn/ovn-nb.xml
>>>>>>>>>
>>>>>>>>> Signed-off-by: Zhenyu Gao <[email protected]>
>>>>>>>>> ---
>>>>>>>>>  ovn/northd/ovn-northd.8.xml |  67 +++++++++++-
>>>>>>>>>  ovn/northd/ovn-northd.c     | 245 ++++++++++++++++++++++++++++++++++++++------
>>>>>>>>>  ovn/ovn-nb.ovsschema        |   6 +-
>>>>>>>>>  ovn/ovn-nb.xml              |   9 ++
>>>>>>>>>  4 files changed, 289 insertions(+), 38 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
>>>>>>>>> index 0d85ec0..b1ce9a9 100644
>>>>>>>>> --- a/ovn/northd/ovn-northd.8.xml
>>>>>>>>> +++ b/ovn/northd/ovn-northd.8.xml
>>>>>>>>> @@ -1598,6 +1598,9 @@ icmp4 {
>>>>>>>>>          port (ingress table <code>ARP Request</code> will generate an ARP
>>>>>>>>>          request, if needed, with <code>reg0</code> as the target protocol
>>>>>>>>>          address and <code>reg1</code> as the source protocol address).
>>>>>>>>> +        A IP route can be configured that it has multipath to next-hop.
>>>>>>>>> +        If a packet has multipath to destination, OVN assign the port
>>>>>>>>> +        index into reg[0] to indicate the packet's output port in table 6.
>>>>>>>>>        </p>
>>>>>>>>>
>>>>>>>>>        <p>
>>>>>>>>> @@ -1617,6 +1620,28 @@ icmp4 {
>>>>>>>>>
>>>>>>>>>        <li>
>>>>>>>>>          <p>
>>>>>>>>> +          IPv4/IPV6 multipath routing table. For each route to IPv4/IPv6
>>>>>>>>> +          network <var>N</var> with netmask <var>M</var>, on multipath port
>>>>>>>>> +          <var>P</var> with IP address <var>A</var> and Ethernet
>>>>>>>>> +          address <var>E</var>, a logical flow with match
>>>>>>>>> +          <code>ip4.dst ==<var>N</var>/<var>M</var></code>,whose priority
>>>>>>>>> +          is the number of 1-bits plus 10 in <var>M</var>,
>>>>>>>>> +          has the following actions:
>>>>>>>>> +        </p>
>>>>>>>>> +
>>>>>>>>> +        <pre>
>>>>>>>>> +ip.ttl--;
>>>>>>>>> +multipath (nw_dst, 0, modulo_n, <var>n_links</var>, 0, reg0);
>>>>>>>>> +reg9[2] = 1
>>>>>>>>> +next;
>>>>>>>>> +        </pre>
>>>>>>>>> +        <p>
>>>>>>>>> +          <var>n_links</var> is the number of multipath port.
>>>>>>>>> +        </p>
>>>>>>>>> +      </li>
>>>>>>>>> +
>>>>>>>>> +      <li>
>>>>>>>>> +        <p>
>>>>>>>>>            IPv4 routing table. For each route to IPv4 network <var>N</var> with
>>>>>>>>>            netmask <var>M</var>, on router port <var>P</var> with IP address
>>>>>>>>>            <var>A</var> and Ethernet
>>>>>>>>> @@ -1686,7 +1711,43 @@ next;
>>>>>>>>>        </li>
>>>>>>>>>      </ul>
>>>>>>>>>
>>>>>>>>> -    <h3>Ingress Table 6: ARP/ND Resolution</h3>
>>>>>>>>> +    <h3>Ingress Table 6: Multipath</h3>
>>>>>>>>> +    <p>
>>>>>>>>> +      Any packet that reaches this table is an IP packet and reg9[2]=1
>>>>>>>>> +      using the following flows to route to corresponding port. This table
>>>>>>>>> +      implement dispatching by consuming reg0.
>>>>>>>>> +    </p>
>>>>>>>>> +
>>>>>>>>> +    <ul>
>>>>>>>>> +      <li>
>>>>>>>>> +        <p>
>>>>>>>>> +          A packet with netmask <var>M</var>, IP address <var>A</var> and
>>>>>>>>> +          <code>reg9[2] = 1</code>, whose priority above 1 has following
>>>>>>>>> +          actions:
>>>>>>>>> +        </p>
>>>>>>>>> +
>>>>>>>>> +        <pre>
>>>>>>>>> +reg0 = <var>G</var>;
>>>>>>>>> +reg1 = <var>A</var>;
>>>>>>>>> +eth.src = <var>E</var>;
>>>>>>>>> +outport = <var>P</var>;
>>>>>>>>> +flags.loopback = 1;
>>>>>>>>> +next;
>>>>>>>>> +        </pre>
>>>>>>>>> +
>>>>>>>>> +        <p>
>>>>>>>>> +          <var>G</var> is the gateway IP address. <var>A</var>, <var>E</var>
>>>>>>>>> +          and <var>P</var> are the values that were described in multipath
>>>>>>>>> +          routing in table 5
>>>>>>>>> +        </p>
>>>>>>>>> +
>>>>>>>>> +        <p>
>>>>>>>>> +          A priority-0 logical flow with match has actions <code>next;</code>.
>>>>>>>>> +        </p>
>>>>>>>>> +      </li>
>>>>>>>>> +    </ul>
>>>>>>>>> +
>>>>>>>>> +    <h3>Ingress Table 7: ARP/ND Resolution</h3>
>>>>>>>>>
>>>>>>>>>      <p>
>>>>>>>>>        Any packet that reaches this table is an IP packet whose next-hop
>>>>>>>>> @@ -1779,7 +1840,7 @@ next;
>>>>>>>>>        </li>
>>>>>>>>>      </ul>
>>>>>>>>>
>>>>>>>>> -    <h3>Ingress Table 7: Gateway Redirect</h3>
>>>>>>>>> +    <h3>Ingress Table 8: Gateway Redirect</h3>
>>>>>>>>>
>>>>>>>>>      <p>
>>>>>>>>>        For distributed logical routers where one of the logical router
>>>>>>>>> @@ -1836,7 +1897,7 @@ next;
>>>>>>>>>        </li>
>>>>>>>>>      </ul>
>>>>>>>>>
>>>>>>>>> -    <h3>Ingress Table 8: ARP Request</h3>
>>>>>>>>> +    <h3>Ingress Table 9: ARP Request</h3>
>>>>>>>>>
>>>>>>>>>      <p>
>>>>>>>>>        In the common case where the Ethernet destination has been resolved, this
>>>>>>>>> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>>>>>>>>> index 49e4ac3..44d1fd4 100644
>>>>>>>>> --- a/ovn/northd/ovn-northd.c
>>>>>>>>> +++ b/ovn/northd/ovn-northd.c
>>>>>>>>> @@ -135,9 +135,10 @@ enum ovn_stage {
>>>>>>>>>      PIPELINE_STAGE(ROUTER, IN,  UNSNAT,      3, "lr_in_unsnat")       \
>>>>>>>>>      PIPELINE_STAGE(ROUTER, IN,  DNAT,        4, "lr_in_dnat")         \
>>>>>>>>>      PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  5, "lr_in_ip_routing")   \
>>>>>>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 6, "lr_in_arp_resolve")  \
>>>>>>>>> -    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 7, "lr_in_gw_redirect")  \
>>>>>>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 8, "lr_in_arp_request")  \
>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  MULTIPATH,   6, "lr_in_multipath")    \
>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 7, "lr_in_arp_resolve")  \
>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 8, "lr_in_gw_redirect")  \
>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 9, "lr_in_arp_request")  \
>>>>>>>>>                                                                        \
>>>>>>>>>      /* Logical router egress stages. */                               \
>>>>>>>>>      PIPELINE_STAGE(ROUTER, OUT, UNDNAT,    0, "lr_out_undnat")        \
>>>>>>>>> @@ -173,6 +174,11 @@ enum ovn_stage {
>>>>>>>>>   * one of the logical router's own IP addresses. */
>>>>>>>>>  #define REGBIT_EGRESS_LOOPBACK  "reg9[1]"
>>>>>>>>>
>>>>>>>>> +/* Indicate multipath action has process this packet and store hash result
>>>>>>>>> + * into other regX. Should consume the hash result to determine the right
>>>>>>>>> + * output port. */
>>>>>>>>> +#define REGBIT_MULTIPATH "reg9[2]"
>>>>>>>>> +
>>>>>>>>>  /* Returns an "enum ovn_stage" built from the arguments. */
>>>>>>>>>  static enum ovn_stage
>>>>>>>>>  ovn_stage_build(enum ovn_datapath_type dp_type, enum ovn_pipeline pipeline,
>>>>>>>>> @@ -4142,72 +4148,165 @@ add_route(struct hmap *lflows, const struct ovn_port *op,
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>>  static void
>>>>>>>>> -build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
>>>>>>>>> -                        struct hmap *ports,
>>>>>>>>> -                        const struct nbrec_logical_router_static_route *route)
>>>>>>>>> +add_multipath_route(struct hmap *lflows, uint32_t port_num,
>>>>>>>>> +                    struct ovn_port **out_ports,
>>>>>>>>> +                    const char **lrp_addr_s,
>>>>>>>>> +                    struct ovn_datapath *od,
>>>>>>>>> +                    const char *network_s, int plen,
>>>>>>>>> +                    const char *gateway, const char *policy)
>>>>>>>>> +{
>>>>>>>>> +    bool is_ipv4 = strchr(network_s, '.') ? true : false;
>>>>>>>>> +    struct ds match = DS_EMPTY_INITIALIZER;
>>>>>>>>> +    const char *dir;
>>>>>>>>> +    uint16_t priority;
>>>>>>>>> +
>>>>>>>>> +    if (policy && !strcmp(policy, "src-ip")) {
>>>>>>>>> +        dir = "src";
>>>>>>>>> +        priority = plen * 2;
>>>>>>>>> +    } else {
>>>>>>>>> +        dir = "dst";
>>>>>>>>> +        priority = (plen * 2) + 1;
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    /* Set higher priority than regular route. */
>>>>>>>>> +    priority += 10;
>>>>>>>>> +
>>>>>>>>> +    ds_put_format(&match, "ip%s.%s == %s/%d", is_ipv4 ? "4" : "6", dir,
>>>>>>>>> +                  network_s, plen);
>>>>>>>>> +
>>>>>>>>> +    struct ds actions = DS_EMPTY_INITIALIZER;
>>>>>>>>> +
>>>>>>>>> +    ds_put_format(&actions, "ip.ttl--; ");
>>>>>>>>> +    ds_put_format(&actions,
>>>>>>>>> +                  "multipath (nw_dst, 0, modulo_n, %u, 0, reg0); "
>>>>>>>>> +                  "%s = 1; "
>>>>>>>>> +                  "next;",
>>>>>>>>> +                  port_num, REGBIT_MULTIPATH);
>>>>>>>>> +
>>>>>>>>> +    /* The priority here is calculated to implement longest-prefix-match
>>>>>>>>> +     * routing. */
>>>>>>>>> +    ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_ROUTING, priority,
>>>>>>>>> +                  ds_cstr(&match), ds_cstr(&actions));
>>>>>>>>> +
>>>>>>>>> +    for (int i = 0; i < port_num; i++) {
>>>>>>>>> +        struct ds mp_match = DS_EMPTY_INITIALIZER;
>>>>>>>>> +        struct ds mp_actions = DS_EMPTY_INITIALIZER;
>>>>>>>>> +
>>>>>>>>> +        ds_put_format(&mp_match, "%s == 1 && reg0 == %d && ",
>>>>>>>>> +                      REGBIT_MULTIPATH, i);
>>>>>>>>> +        ds_put_format(&mp_match, "ip%s.%s == %s/%d",
>>>>>>>>> +                      is_ipv4 ? "4" : "6", dir,
>>>>>>>>> +                      network_s, plen);
>>>>>>>>> +
>>>>>>>>> +        ds_put_format(&mp_actions, "%sreg0 = ", is_ipv4 ? "" : "xx");
>>>>>>>>> +        if (gateway) {
>>>>>>>>> +            ds_put_cstr(&mp_actions, gateway);
>>>>>>>>> +        } else {
>>>>>>>>> +            ds_put_format(&mp_actions, "ip%s.dst", is_ipv4 ? "4" : "6");
>>>>>>>>> +        }
>>>>>>>>> +
>>>>>>>>> +        ds_put_format(&mp_actions, "; "
>>>>>>>>> +                      "%sreg1 = %s; "
>>>>>>>>> +                      "eth.src = %s; "
>>>>>>>>> +                      "outport = %s; "
>>>>>>>>> +                      "flags.loopback = 1; "
>>>>>>>>> +                      "next;",
>>>>>>>>> +                      is_ipv4 ? "" : "xx",
>>>>>>>>> +                      lrp_addr_s[i],
>>>>>>>>> +                      out_ports[i]->lrp_networks.ea_s,
>>>>>>>>> +                      out_ports[i]->json_key);
>>>>>>>>> +
>>>>>>>>> +        /* Add flow in table 6 to determine the right output port
>>>>>>>>> +         * for this traffic. */
>>>>>>>>> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_MULTIPATH, priority,
>>>>>>>>> +                      ds_cstr(&mp_match), ds_cstr(&mp_actions));
>>>>>>>>> +        ds_destroy(&mp_match);
>>>>>>>>> +        ds_destroy(&mp_actions);
>>>>>>>>> +    }
>>>>>>>>> +    ds_destroy(&match);
>>>>>>>>> +    ds_destroy(&actions);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +static bool
>>>>>>>>> +verify_nexthop_prefix(const struct nbrec_logical_router_static_route *route,
>>>>>>>>> +                      bool *is_ipv4, char **prefix_s, unsigned int *plen)
>>>>>>>>>  {
>>>>>>>>>      ovs_be32 nexthop;
>>>>>>>>> -    const char *lrp_addr_s = NULL;
>>>>>>>>> -    unsigned int plen;
>>>>>>>>> -    bool is_ipv4;
>>>>>>>>>
>>>>>>>>>      /* Verify that the next hop is an IP address with an all-ones mask. */
>>>>>>>>> -    char *error = ip_parse_cidr(route->nexthop, &nexthop, &plen);
>>>>>>>>> +    char *error = ip_parse_cidr(route->nexthop, &nexthop, plen);
>>>>>>>>>      if (!error) {
>>>>>>>>> -        if (plen != 32) {
>>>>>>>>> +        if (*plen != 32) {
>>>>>>>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>>              VLOG_WARN_RL(&rl, "bad next hop mask %s", route->nexthop);
>>>>>>>>> -            return;
>>>>>>>>> +            return false;
>>>>>>>>>          }
>>>>>>>>> -        is_ipv4 = true;
>>>>>>>>> +        *is_ipv4 = true;
>>>>>>>>>      } else {
>>>>>>>>>          free(error);
>>>>>>>>>
>>>>>>>>>          struct in6_addr ip6;
>>>>>>>>> -        error = ipv6_parse_cidr(route->nexthop, &ip6, &plen);
>>>>>>>>> +        error = ipv6_parse_cidr(route->nexthop, &ip6, plen);
>>>>>>>>>          if (!error) {
>>>>>>>>> -            if (plen != 128) {
>>>>>>>>> +            if (*plen != 128) {
>>>>>>>>>                  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>>                  VLOG_WARN_RL(&rl, "bad next hop mask %s", route->nexthop);
>>>>>>>>> -                return;
>>>>>>>>> +                return false;
>>>>>>>>>              }
>>>>>>>>> -            is_ipv4 = false;
>>>>>>>>> +            *is_ipv4 = false;
>>>>>>>>>          } else {
>>>>>>>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>>              VLOG_WARN_RL(&rl, "bad next hop ip address %s", route->nexthop);
>>>>>>>>>              free(error);
>>>>>>>>> -            return;
>>>>>>>>> +            return false;
>>>>>>>>>          }
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>> -    char *prefix_s;
>>>>>>>>> -    if (is_ipv4) {
>>>>>>>>> +    if (*is_ipv4) {
>>>>>>>>>          ovs_be32 prefix;
>>>>>>>>>          /* Verify that ip prefix is a valid IPv4 address. */
>>>>>>>>> -        error = ip_parse_cidr(route->ip_prefix, &prefix, &plen);
>>>>>>>>> +        error = ip_parse_cidr(route->ip_prefix, &prefix, plen);
>>>>>>>>>          if (error) {
>>>>>>>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>>              VLOG_WARN_RL(&rl, "bad 'ip_prefix' in static routes %s",
>>>>>>>>>                           route->ip_prefix);
>>>>>>>>>              free(error);
>>>>>>>>> -            return;
>>>>>>>>> +            return false;
>>>>>>>>>          }
>>>>>>>>> -        prefix_s = xasprintf(IP_FMT, IP_ARGS(prefix & be32_prefix_mask(plen)));
>>>>>>>>> +        *prefix_s = xasprintf(IP_FMT, IP_ARGS(prefix
>>>>>>>>> +                                              & be32_prefix_mask(*plen)));
>>>>>>>>>      } else {
>>>>>>>>>          /* Verify that ip prefix is a valid IPv6 address. */
>>>>>>>>>          struct in6_addr prefix;
>>>>>>>>> -        error = ipv6_parse_cidr(route->ip_prefix, &prefix, &plen);
>>>>>>>>> +        error = ipv6_parse_cidr(route->ip_prefix, &prefix, plen);
>>>>>>>>>          if (error) {
>>>>>>>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>>              VLOG_WARN_RL(&rl, "bad 'ip_prefix' in static routes %s",
>>>>>>>>>                           route->ip_prefix);
>>>>>>>>>              free(error);
>>>>>>>>> -            return;
>>>>>>>>> +            return false;
>>>>>>>>>          }
>>>>>>>>> -        struct in6_addr mask = ipv6_create_mask(plen);
>>>>>>>>> +        struct in6_addr mask = ipv6_create_mask(*plen);
>>>>>>>>>          struct in6_addr network = ipv6_addr_bitand(&prefix, &mask);
>>>>>>>>> -        prefix_s = xmalloc(INET6_ADDRSTRLEN);
>>>>>>>>> -        inet_ntop(AF_INET6, &network, prefix_s, INET6_ADDRSTRLEN);
>>>>>>>>> +        *prefix_s = xmalloc(INET6_ADDRSTRLEN);
>>>>>>>>> +        inet_ntop(AF_INET6, &network, *prefix_s, INET6_ADDRSTRLEN);
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    return true;
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +static void
>>>>>>>>> +build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
>>>>>>>>> +                        struct hmap *ports,
>>>>>>>>> +                        const struct nbrec_logical_router_static_route *route)
>>>>>>>>> +{
>>>>>>>>> +    const char *lrp_addr_s = NULL;
>>>>>>>>> +    unsigned int plen;
>>>>>>>>> +    bool is_ipv4;
>>>>>>>>> +    char *prefix_s = NULL;
>>>>>>>>> +
>>>>>>>>> +    if (!verify_nexthop_prefix(route, &is_ipv4, &prefix_s, &plen)) {
>>>>>>>>> +        return;
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>>      /* Find the outgoing port. */
>>>>>>>>> @@ -4270,7 +4369,75 @@ build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
>>>>>>>>>                policy);
>>>>>>>>>
>>>>>>>>>  free_prefix_s:
>>>>>>>>> -    free(prefix_s);
>>>>>>>>> +    if (prefix_s) {
>>>>>>>>> +        free(prefix_s);
>>>>>>>>> +    }
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +static void
>>>>>>>>> +build_multipath_flow(struct hmap *lflows, struct ovn_datapath *od,
>>>>>>>>> +                     struct hmap *ports,
>>>>>>>>> +                     const struct nbrec_logical_router_static_route *route)
>>>>>>>>> +{
>>>>>>>>> +    unsigned int plen;
>>>>>>>>> +    bool is_ipv4;
>>>>>>>>> +    char *prefix_s = NULL;
>>>>>>>>> +
>>>>>>>>> +    if (!verify_nexthop_prefix(route, &is_ipv4, &prefix_s, &plen)) {
>>>>>>>>> +        return;
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    /* Find the outgoing port. */
>>>>>>>>> +    struct ovn_port **out_ports = xmalloc(route->n_multipath_port *
>>>>>>>>> +                                          sizeof(struct ovn_port *));
>>>>>>>>> +    const char **lrp_addr_s = xmalloc(route->n_multipath_port *
>>>>>>>>> +                                      sizeof(const char *));
>>>>>>>>> +    for (int i = 0; i < route->n_multipath_port; i++) {
>>>>>>>>> +        // TODO May need to consider some ports are not found?
>>>>>>>>> +        out_ports[i] = ovn_port_find(ports, route->multipath_port[i]);
>>>>>>>>> +        if (!out_ports[i]) {
>>>>>>>>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>> +            VLOG_WARN_RL(&rl, "Bad out port %s for static route %s",
>>>>>>>>> +                         route->multipath_port[i], route->ip_prefix);
>>>>>>>>> +            goto free_ports_lrp_addr;
>>>>>>>>> +        }
>>>>>>>>> +
>>>>>>>>> +        lrp_addr_s[i] = find_lrp_member_ip(out_ports[i], route->nexthop);
>>>>>>>>> +        if (!lrp_addr_s[i]) {
>>>>>>>>> +            if (is_ipv4) {
>>>>>>>>> +                if (out_ports[i]->lrp_networks.n_ipv4_addrs) {
>>>>>>>>> +                    lrp_addr_s[i] = out_ports[i]->
>>>>>>>>> +                                    lrp_networks.ipv4_addrs[0].addr_s;
>>>>>>>>> +                }
>>>>>>>>> +            } else {
>>>>>>>>> +                if (out_ports[i]->lrp_networks.n_ipv6_addrs) {
>>>>>>>>> +                    lrp_addr_s[i] = out_ports[i]->
>>>>>>>>> +                                    lrp_networks.ipv6_addrs[0].addr_s;
>>>>>>>>> +                }
>>>>>>>>> +            }
>>>>>>>>> +        }
>>>>>>>>> +        if (!lrp_addr_s[i]) {
>>>>>>>>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>> +            VLOG_WARN_RL(&rl,
>>>>>>>>> +                         "%s has no path for static route %s; next hop %s",
>>>>>>>>> +                         route->multipath_port[i], route->ip_prefix,
>>>>>>>>> +                         route->nexthop);
>>>>>>>>> +            goto free_ports_lrp_addr;
>>>>>>>>> +        }
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +
>>>>>>>>> +    char *policy = route->policy ? route->policy : "dst-ip";
>>>>>>>>> +    add_multipath_route(lflows, route->n_multipath_port,
>>>>>>>>> +                        out_ports, lrp_addr_s, od,
>>>>>>>>> +                        prefix_s, plen, route->nexthop, policy);
>>>>>>>>> +
>>>>>>>>> +free_ports_lrp_addr:
>>>>>>>>> +    free(out_ports);
>>>>>>>>> +    free(lrp_addr_s);
>>>>>>>>> +    if (prefix_s) {
>>>>>>>>> +        free(prefix_s);
>>>>>>>>> +    }
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>>  static void
>>>>>>>>> @@ -5344,7 +5511,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>>>>>>>>>          }
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>> -    /* Convert the static routes to flows. */
>>>>>>>>> +    /* Convert the static routes and multipath route to flows. */
>>>>>>>>>      HMAP_FOR_EACH (od, key_node, datapaths) {
>>>>>>>>>          if (!od->nbr) {
>>>>>>>>>              continue;
>>>>>>>>> @@ -5355,12 +5522,24 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>>>>>>>>>
>>>>>>>>>              route = od->nbr->static_routes[i];
>>>>>>>>>              build_static_route_flow(lflows, od, ports, route);
>>>>>>>>> +            /* Logical router ingress table 5-6: Multipath Routing.
>>>>>>>>> +             *
>>>>>>>>> +             * If router has configured a traffic has multiple paths
>>>>>>>>> +             * to destination. The right output port should be figured
>>>>>>>>> +             * out by computing IP packet's header */
>>>>>>>>> +            if (route->n_multipath_port > 1) {
>>>>>>>>> +                /* Generate multipath routes in table 5,6 for
>>>>>>>>> +                 * dedicated traffic */
>>>>>>>>> +                build_multipath_flow(lflows, od, ports, route);
>>>>>>>>> +            }
>>>>>>>>>          }
>>>>>>>>> +        /* Packets are allowed by default in table 6. */
>>>>>>>>> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_MULTIPATH, 0, "1", "next;");
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>>      /* XXX destination unreachable */
>>>>>>>>>
>>>>>>>>> -    /* Local router ingress table 6: ARP Resolution.
>>>>>>>>> +    /* Local router ingress table 7: ARP Resolution.
>>>>>>>>>       *
>>>>>>>>>       * Any packet that reaches this table is an IP packet whose next-hop IP
>>>>>>>>>       * address is in reg0. (ip4.dst is the final destination.) This table
>>>>>>>>> @@ -5555,7 +5734,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>>>>>>>>>                        "get_nd(outport, xxreg0); next;");
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>> -    /* Logical router ingress table 7: Gateway redirect.
>>>>>>>>> +    /* Logical router ingress table 8: Gateway redirect.
>>>>>>>>>       *
>>>>>>>>>       * For traffic with outport equal to the l3dgw_port
>>>>>>>>>       * on a distributed router, this table redirects a subset
>>>>>>>>> @@ -5595,7 +5774,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>>>>>>>>>          ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 0, "1", "next;");
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>> -    /* Local router ingress table 8: ARP request.
>>>>>>>>> +    /* Local router ingress table 9: ARP request.
>>>>>>>>>       *
>>>>>>>>>       * In the common case where the Ethernet destination has been resolved,
>>>>>>>>>       * this table outputs the packet (priority 0). Otherwise, it composes
>>>>>>>>> diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
>>>>>>>>> index a077bfb..b8bdd42 100644
>>>>>>>>> --- a/ovn/ovn-nb.ovsschema
>>>>>>>>> +++ b/ovn/ovn-nb.ovsschema
>>>>>>>>> @@ -1,7 +1,7 @@
>>>>>>>>>  {
>>>>>>>>>      "name": "OVN_Northbound",
>>>>>>>>>      "version": "5.8.0",
>>>>>>>>> -    "cksum": "2812300190 16766",
>>>>>>>>> +    "cksum": "1967092589 16903",
>>>>>>>>>      "tables": {
>>>>>>>>>          "NB_Global": {
>>>>>>>>>              "columns": {
>>>>>>>>> @@ -235,7 +235,9 @@
>>>>>>>>>                                                                     "dst-ip"]]},
>>>>>>>>>                                      "min": 0, "max": 1}},
>>>>>>>>>                  "nexthop": {"type": "string"},
>>>>>>>>> -                "output_port": {"type": {"key": "string", "min": 0, "max": 1}}},
>>>>>>>>> +                "output_port": {"type": {"key": "string", "min": 0, "max": 1}},
>>>>>>>>> +                "multipath_port": {"type": {"key": "string", "min": 0,
>>>>>>>>> +                                            "max": "unlimited"}}},
>>>>>>>>>              "isRoot": false},
>>>>>>>>>          "NAT": {
>>>>>>>>>              "columns": {
>>>>>>>>> diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
>>>>>>>>> index 9869d7e..15feb97 100644
>>>>>>>>> --- a/ovn/ovn-nb.xml
>>>>>>>>> +++ b/ovn/ovn-nb.xml
>>>>>>>>> @@ -1487,6 +1487,15 @@
>>>>>>>>>          address as the one via which the <ref column="nexthop"/> is reachable.
>>>>>>>>>        </p>
>>>>>>>>>      </column>
>>>>>>>>> +    <column name="multipath_port">
>>>>>>>>> +      <p>
>>>>>>>>> +        The name of the <ref table="Logical_Router_Port"/> via which the packet
>>>>>>>>> +        needs to be sent out. When it contains more than two ports, it means
>>>>>>>>> +        packet has multiple candidate output ports. OVN uses the packet header
>>>>>>>>> +        to determine which port the packet would be delivered to.
>>>>>>>>> +        Currently, OVN consumes destination IP address to figure out port.
>>>>>>>>> +      </p>
>>>>>>>>> +    </column>
>>>>>>>>>    </table>
>>>>>>>>>
>>>>>>>>>    <table name="NAT" title="NAT rules">
>>>>>>>>> --
>>>>>>>>> 1.8.3.1
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> dev mailing list
>>>>>>>>> [email protected]
>>>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
