" But, if an ovn port in foo (chassis A) wants to talk to alice1 (chassis B), wouldn't all that E/W routing will happen virtually and the end result is just a tunneled packet between chassis A and chassis B ? " [ Now the hash function base on dst IP, if foo1 only talks to alice1, and it is the tunnel packet between chassisA and chassis B ]
The benefit is that if you have two OVN routers, and those routers are deployed ONLY on chassis C and chassis D, the traffic is split across the two paths automatically. Otherwise you need to configure static routes one by one to separate the traffic.
To make a long story short, you can do the same thing by configuring numerous static routes, but multipath does it automatically.

2017-09-20 22:08 GMT+08:00 Miguel Angel Ajo Pelayo <[email protected]>:

> I forgot to say thank you very much for the explanation and diagrams.
>
> On Wed, Sep 20, 2017 at 4:07 PM, Miguel Angel Ajo Pelayo <[email protected]> wrote:
>
>> But, if an ovn port in foo (chassis A) wants to talk to alice1 (chassis B),
>> wouldn't all that E/W routing happen virtually, with the end result being
>> just a tunneled packet between chassis A and chassis B?
>>
>> What's the benefit of multipath there if the possible failing link is
>> always the connection between chassis A and chassis B?
>>
>> I suspect there's something I'm missing in the picture.
>>
>> On Wed, Sep 20, 2017 at 3:49 PM, Gao Zhenyu <[email protected]> wrote:
>>
>>> You can take a look at this patch, which implements a testcase:
>>> https://patchwork.ozlabs.org/patch/815475/
>>>
>>> In the testcase, we have R1, R2, R3.
>>>
>>> R1 and R2 are connected to each other via LS "join" in the 20.0.0.0/24 network.
>>> R1 and R3 are connected to each other via LS "join2" in the 20.0.0.0/24 network.
>>> R1 has switch foo (192.168.1.0/24) connected to it. R2 and R3 have
>>> alice (172.16.1.0/24) connected to them.
>>> R2 and R3 are gateway routers.
>>>
>>> A packet sent from foo to alice1/alice2 has multiple paths to its destination:
>>> 1. foo-->R1-->join-->R2-->alice.
>>> 2. foo-->R1-->join2-->R3-->alice.
>>>
>>> This testcase simulates two packets: one's destination is 172.16.1.2,
>>> the other's is 172.16.1.4. The multipath configured in R1 can split
>>> that traffic between R2 and R3.
>>> Finally, the 172.16.1.2 packet travels path 2 and the 172.16.1.4 packet travels path 1.
>>>
>>>         +------+
>>>         | foo  |
>>>         +------+
>>>            |
>>>            |
>>>         +------+
>>>         |  R1  |---------+
>>>         +------+         |
>>>            |             |
>>>            |             |
>>>         +------+     +-------+
>>>         | join |     | join2 |
>>>         +------+     +-------+
>>>            |             |
>>>            |             |
>>>         +------+     +-------+
>>>         |  R2  |     |  R3   |
>>>         +------+     +-------+
>>>            |             |
>>>            |             |
>>>         +-----------------+
>>>         |      alice      |
>>>         +-----------------+
>>>            |             |
>>>          alice1        alice2
>>>
>>> Please let me know if you have any questions about it. :)
>>>
>>> Thanks
>>> Zhenyu Gao
>>>
>>> 2017-09-20 20:58 GMT+08:00 Miguel Angel Ajo Pelayo <[email protected]>:
>>>
>>>> Can you share an example of how this would benefit E/W routing? I'm
>>>> just not seeing the specific use case myself, out of ignorance.
>>>>
>>>> It'd be great if you could explain how it would work between several
>>>> ports in the networks and routers (maybe a diagram?); otherwise I can't be
>>>> really helpful reviewing :)
>>>>
>>>> Cheers, and thanks for the patience.
>>>>
>>>> On Wed, Sep 20, 2017 at 12:25 PM, Gao Zhenyu <[email protected]> wrote:
>>>>
>>>>> Thanks for the suggestions!
>>>>>
>>>>> Not every logical port has a real ofp_port connected to it, and the
>>>>> bundle_load/bundle actions need a real OVS port.
>>>>> In particular, all OVN router ports are virtual ports, which are
>>>>> just a number/reg in our OVS flows.
>>>>>
>>>>> This implementation of multipath can split OVN east-west traffic; it
>>>>> helps dispatch traffic to gateways and routers easily.
>>>>>
>>>>> For north-south traffic, we can use the bundle/bundle_load actions to
>>>>> take the remote tunnel up/down status into account. I would like to do
>>>>> this step by step and implement it in my next patch series.
>>>>>
>>>>> Thanks
>>>>> Zhenyu Gao
>>>>>
>>>>> 2017-09-20 17:53 GMT+08:00 Miguel Angel Ajo Pelayo <[email protected]>:
>>>>>
>>>>>> I'm not very familiar with multipath implementations,
>>>>>>
>>>>>> but would it be possible to use the bundle() output action with the hrw
>>>>>> algorithm instead of a multipath calculation into a register?
>>>>>>
>>>>>> I say this because if you look at lib/multipath.c and lib/bundle.c you
>>>>>> will find that bundle.c considers the up/down status
>>>>>> (slave_enabled check) of the links.
>>>>>>
>>>>>> That way the controller doesn't need to modify any flow based on link
>>>>>> status.
>>>>>>
>>>>>> On Wed, Sep 20, 2017 at 5:45 AM, Gao Zhenyu <[email protected]> wrote:
>>>>>>
>>>>>>> Thanks for the questions.
>>>>>>>
>>>>>>> The multipath_port can be set via ovn-nbctl, like:
>>>>>>> ovn-nbctl -- --id=@lrt create Logical_Router_Static_Route
>>>>>>> ip_prefix=0.0.0.0/0 nexthop=10.88.77.1 multipath_port=[mp1,mp2] --
>>>>>>> add Logical_Router edge1 static_routes @lrt
>>>>>>> This patch doesn't implement an ovn-nbctl command to configure
>>>>>>> multipath routing yet, because I am still considering reusing nexthop or
>>>>>>> output_port (making them array entries), and want to collect
>>>>>>> suggestions on it.
>>>>>>>
>>>>>>> As for the status of the next hop, I would like to introduce bundle_load
>>>>>>> and BFD for that later.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Zhenyu Gao
>>>>>>>
>>>>>>> 2017-09-20 11:13 GMT+08:00 <[email protected]>:
>>>>>>>
>>>>>>>> How is multipath_port configured in a static route? I think the
>>>>>>>> multipath can be figured out from the existing static_route data, so
>>>>>>>> it may not be necessary to add this multipath_port column.
>>>>>>>>
>>>>>>>> I also think we should add a status column to indicate the nexthop
>>>>>>>> state. When one of the next hops in a multipath is down, OVN should
>>>>>>>> change the corresponding flows.
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> Zhenyu Gao <[email protected]>
>>>>>>>> From: [email protected]
>>>>>>>> 2017/09/19 19:37
>>>>>>>>
>>>>>>>> To: [email protected], [email protected],
>>>>>>>> [email protected], [email protected], [email protected],
>>>>>>>> Cc:
>>>>>>>> Subject: [ovs-dev] [PATCH v1 1/3] Add multipath static router in
>>>>>>>> OVN northd and north-db
>>>>>>>>
>>>>>>>>
>>>>>>>> 1. ovn-nb.ovsschema was updated to add new field multipath_port.
>>>>>>>> 2. Add multipath feature in ovn-northd part. northd generates multipath
>>>>>>>>    flows to dispatch traffic by using packet's IP dst address if user set
>>>>>>>>    Logical_Router_Static_Route's multipath_port with ports.
>>>>>>>> 3. Add new table(lr_in_multipath) in ovn-northd's router ingress stages
>>>>>>>>    to dispatch traffic to ports.
>>>>>>>> 4. Add multipath flow in Table 5(lr_in_ip_routing) and store hash result
>>>>>>>>    into reg0. reg9[2] was used to indicate packet which need dispatching.
>>>>>>>> 5.
Add multipath feature description in ovn/northd/ovn-northd.8.xml >>>>>>>> and ovn/ovn-nb.xml >>>>>>>> >>>>>>>> Signed-off-by: Zhenyu Gao <[email protected]> >>>>>>>> --- >>>>>>>> ovn/northd/ovn-northd.8.xml | 67 +++++++++++- >>>>>>>> ovn/northd/ovn-northd.c | 245 >>>>>>>> ++++++++++++++++++++++++++++++++++++++------ >>>>>>>> ovn/ovn-nb.ovsschema | 6 +- >>>>>>>> ovn/ovn-nb.xml | 9 ++ >>>>>>>> 4 files changed, 289 insertions(+), 38 deletions(-) >>>>>>>> >>>>>>>> diff --git a/ovn/northd/ovn-northd.8.xml >>>>>>>> b/ovn/northd/ovn-northd.8.xml >>>>>>>> index 0d85ec0..b1ce9a9 100644 >>>>>>>> --- a/ovn/northd/ovn-northd.8.xml >>>>>>>> +++ b/ovn/northd/ovn-northd.8.xml >>>>>>>> @@ -1598,6 +1598,9 @@ icmp4 { >>>>>>>> port (ingress table <code>ARP Request</code> will generate >>>>>>>> an ARP >>>>>>>> request, if needed, with <code>reg0</code> as the target >>>>>>>> protocol >>>>>>>> address and <code>reg1</code> as the source protocol >>>>>>>> address). >>>>>>>> + A IP route can be configured that it has multipath to >>>>>>>> next-hop. >>>>>>>> + If a packet has multipath to destination, OVN assign the port >>>>>>>> + index into reg[0] to indicate the packet's output port in >>>>>>>> table 6. >>>>>>>> </p> >>>>>>>> >>>>>>>> <p> >>>>>>>> @@ -1617,6 +1620,28 @@ icmp4 { >>>>>>>> >>>>>>>> <li> >>>>>>>> <p> >>>>>>>> + IPv4/IPV6 multipath routing table. 
For each route to >>>>>>>> IPv4/IPv6 >>>>>>>> + network <var>N</var> with netmask <var>M</var>, on >>>>>>>> multipath >>>>>>>> port >>>>>>>> + <var>P</var> with IP address <var>A</var> and Ethernet >>>>>>>> + address <var>E</var>, a logical flow with match >>>>>>>> + <code>ip4.dst ==<var>N</var>/<var>M</var></code>,whose >>>>>>>> priority >>>>>>>> + is the number of 1-bits plus 10 in <var>M</var>, >>>>>>>> + has the following actions: >>>>>>>> + </p> >>>>>>>> + >>>>>>>> + <pre> >>>>>>>> +ip.ttl--; >>>>>>>> +multipath (nw_dst, 0, modulo_n, <var>n_links</var>, 0, reg0); >>>>>>>> +reg9[2] = 1 >>>>>>>> +next; >>>>>>>> + </pre> >>>>>>>> + <p> >>>>>>>> + <var>n_links</var> is the number of multipath port. >>>>>>>> + </p> >>>>>>>> + </li> >>>>>>>> + >>>>>>>> + <li> >>>>>>>> + <p> >>>>>>>> IPv4 routing table. For each route to IPv4 network >>>>>>>> <var>N</var> with >>>>>>>> netmask <var>M</var>, on router port <var>P</var> with IP >>>>>>>> address >>>>>>>> <var>A</var> and Ethernet >>>>>>>> @@ -1686,7 +1711,43 @@ next; >>>>>>>> </li> >>>>>>>> </ul> >>>>>>>> >>>>>>>> - <h3>Ingress Table 6: ARP/ND Resolution</h3> >>>>>>>> + <h3>Ingress Table 6: Multipath</h3> >>>>>>>> + <p> >>>>>>>> + Any packet taht reaches this table is an IP packet and >>>>>>>> reg9[2]=1 >>>>>>>> + using the following flows to route to corresponding port. >>>>>>>> This >>>>>>>> table >>>>>>>> + implement dispatching by consuming reg0. >>>>>>>> + </p> >>>>>>>> + >>>>>>>> + <ul> >>>>>>>> + <li> >>>>>>>> + <p> >>>>>>>> + A packet with netmask <var>M</var>, IP address >>>>>>>> <var>A</var> and >>>>>>>> + <code>reg9[2] = 1</code>, whose priority above 1 has >>>>>>>> following >>>>>>>> + actions: >>>>>>>> + </p> >>>>>>>> + >>>>>>>> + <pre> >>>>>>>> +reg0 = <var>G</var>; >>>>>>>> +reg1 = <var>A</var>; >>>>>>>> +eth.src = <var>E</var>; >>>>>>>> +outport = <var>P</var>; >>>>>>>> +flags.loopback = 1; >>>>>>>> +next; >>>>>>>> + </pre> >>>>>>>> + >>>>>>>> + <p> >>>>>>>> + <var>G</var> is the gateway IP address. 
<var>A</var>, >>>>>>>> <var>E</var> >>>>>>>> + and <var>P</var> are the values that were described in >>>>>>>> multipath >>>>>>>> + routeing in table 5 >>>>>>>> + </p> >>>>>>>> + >>>>>>>> + <p> >>>>>>>> + A priority-0 logical flow with match has actions >>>>>>>> <code>next;</code>. >>>>>>>> + </p> >>>>>>>> + </li> >>>>>>>> + </ul> >>>>>>>> + >>>>>>>> + <h3>Ingress Table 7: ARP/ND Resolution</h3> >>>>>>>> >>>>>>>> <p> >>>>>>>> Any packet that reaches this table is an IP packet whose >>>>>>>> next-hop >>>>>>>> @@ -1779,7 +1840,7 @@ next; >>>>>>>> </li> >>>>>>>> </ul> >>>>>>>> >>>>>>>> - <h3>Ingress Table 7: Gateway Redirect</h3> >>>>>>>> + <h3>Ingress Table 8: Gateway Redirect</h3> >>>>>>>> >>>>>>>> <p> >>>>>>>> For distributed logical routers where one of the logical >>>>>>>> router >>>>>>>> @@ -1836,7 +1897,7 @@ next; >>>>>>>> </li> >>>>>>>> </ul> >>>>>>>> >>>>>>>> - <h3>Ingress Table 8: ARP Request</h3> >>>>>>>> + <h3>Ingress Table 9: ARP Request</h3> >>>>>>>> >>>>>>>> <p> >>>>>>>> In the common case where the Ethernet destination has been >>>>>>>> resolved, this >>>>>>>> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c >>>>>>>> index 49e4ac3..44d1fd4 100644 >>>>>>>> --- a/ovn/northd/ovn-northd.c >>>>>>>> +++ b/ovn/northd/ovn-northd.c >>>>>>>> @@ -135,9 +135,10 @@ enum ovn_stage { >>>>>>>> PIPELINE_STAGE(ROUTER, IN, UNSNAT, 3, "lr_in_unsnat") >>>>>>>> \ >>>>>>>> PIPELINE_STAGE(ROUTER, IN, DNAT, 4, "lr_in_dnat") >>>>>>>> \ >>>>>>>> PIPELINE_STAGE(ROUTER, IN, IP_ROUTING, 5, >>>>>>>> "lr_in_ip_routing") \ >>>>>>>> - PIPELINE_STAGE(ROUTER, IN, ARP_RESOLVE, 6, >>>>>>>> "lr_in_arp_resolve") \ >>>>>>>> - PIPELINE_STAGE(ROUTER, IN, GW_REDIRECT, 7, >>>>>>>> "lr_in_gw_redirect") \ >>>>>>>> - PIPELINE_STAGE(ROUTER, IN, ARP_REQUEST, 8, >>>>>>>> "lr_in_arp_request") \ >>>>>>>> + PIPELINE_STAGE(ROUTER, IN, MULTIPATH, 6, >>>>>>>> "lr_in_multipath") \ >>>>>>>> + PIPELINE_STAGE(ROUTER, IN, ARP_RESOLVE, 7, >>>>>>>> "lr_in_arp_resolve") \ >>>>>>>> + 
PIPELINE_STAGE(ROUTER, IN, GW_REDIRECT, 8, >>>>>>>> "lr_in_gw_redirect") \ >>>>>>>> + PIPELINE_STAGE(ROUTER, IN, ARP_REQUEST, 9, >>>>>>>> "lr_in_arp_request") \ >>>>>>>> >>>>>>>> \ >>>>>>>> /* Logical router egress stages. */ >>>>>>>> \ >>>>>>>> PIPELINE_STAGE(ROUTER, OUT, UNDNAT, 0, "lr_out_undnat") >>>>>>>> \ >>>>>>>> @@ -173,6 +174,11 @@ enum ovn_stage { >>>>>>>> * one of the logical router's own IP addresses. */ >>>>>>>> #define REGBIT_EGRESS_LOOPBACK "reg9[1]" >>>>>>>> >>>>>>>> +/* Indicate multipath action has process this packet and store hash >>>>>>>> result >>>>>>>> + * into other regX. Should consume the hash result to determin the >>>>>>>> right >>>>>>>> + * output port. */ >>>>>>>> +#define REGBIT_MULTIPATH "reg9[2]" >>>>>>>> + >>>>>>>> /* Returns an "enum ovn_stage" built from the arguments. */ >>>>>>>> static enum ovn_stage >>>>>>>> ovn_stage_build(enum ovn_datapath_type dp_type, enum ovn_pipeline >>>>>>>> pipeline, >>>>>>>> @@ -4142,72 +4148,165 @@ add_route(struct hmap *lflows, const struct >>>>>>>> ovn_port *op, >>>>>>>> } >>>>>>>> >>>>>>>> static void >>>>>>>> -build_static_route_flow(struct hmap *lflows, struct ovn_datapath >>>>>>>> *od, >>>>>>>> - struct hmap *ports, >>>>>>>> - const struct nbrec_logical_router_static_ro >>>>>>>> ute >>>>>>>> *route) >>>>>>>> +add_multipath_route(struct hmap *lflows, uint32_t port_num, >>>>>>>> + struct ovn_port **out_ports, >>>>>>>> + const char **lrp_addr_s, >>>>>>>> + struct ovn_datapath *od, >>>>>>>> + const char *network_s, int plen, >>>>>>>> + const char *gateway, const char *policy) >>>>>>>> +{ >>>>>>>> + bool is_ipv4 = strchr(network_s, '.') ? 
true : false; >>>>>>>> + struct ds match = DS_EMPTY_INITIALIZER; >>>>>>>> + const char *dir; >>>>>>>> + uint16_t priority; >>>>>>>> + >>>>>>>> + if (policy && !strcmp(policy, "src-ip")) { >>>>>>>> + dir = "src"; >>>>>>>> + priority = plen * 2; >>>>>>>> + } else { >>>>>>>> + dir = "dst"; >>>>>>>> + priority = (plen * 2) + 1; >>>>>>>> + } >>>>>>>> + >>>>>>>> + /* Set higer priority than regular route. */ >>>>>>>> + priority += 10; >>>>>>>> + >>>>>>>> + ds_put_format(&match, "ip%s.%s == %s/%d", is_ipv4 ? "4" : "6", >>>>>>>> dir, >>>>>>>> + network_s, plen); >>>>>>>> + >>>>>>>> + struct ds actions = DS_EMPTY_INITIALIZER; >>>>>>>> + >>>>>>>> + ds_put_format(&actions, "ip.ttl--; "); >>>>>>>> + ds_put_format(&actions, >>>>>>>> + "multipath (nw_dst, 0, modulo_n, %u, 0, reg0); " >>>>>>>> + "%s = 1; " >>>>>>>> + "next;", >>>>>>>> + port_num, REGBIT_MULTIPATH); >>>>>>>> + >>>>>>>> + /* The priority here is calculated to implement >>>>>>>> longest-prefix-match >>>>>>>> + * routing. */ >>>>>>>> + ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_ROUTING, priority, >>>>>>>> + ds_cstr(&match), ds_cstr(&actions)); >>>>>>>> + >>>>>>>> + for (int i = 0; i < port_num; i++) { >>>>>>>> + struct ds mp_match = DS_EMPTY_INITIALIZER; >>>>>>>> + struct ds mp_actions = DS_EMPTY_INITIALIZER; >>>>>>>> + >>>>>>>> + ds_put_format(&mp_match, "%s == 1 && reg0 == %d && ", >>>>>>>> + REGBIT_MULTIPATH, i); >>>>>>>> + ds_put_format(&mp_match, "ip%s.%s == %s/%d", >>>>>>>> + is_ipv4 ? "4" : "6", dir, >>>>>>>> + network_s, plen); >>>>>>>> + >>>>>>>> + ds_put_format(&mp_actions, "%sreg0 = ", is_ipv4 ? "" : >>>>>>>> "xx"); >>>>>>>> + if (gateway) { >>>>>>>> + ds_put_cstr(&mp_actions, gateway); >>>>>>>> + } else { >>>>>>>> + ds_put_format(&mp_actions, "ip%s.dst", is_ipv4 ? "4" : >>>>>>>> "6"); >>>>>>>> + } >>>>>>>> + >>>>>>>> + ds_put_format(&mp_actions, "; " >>>>>>>> + "%sreg1 = %s; " >>>>>>>> + "eth.src = %s; " >>>>>>>> + "outport = %s; " >>>>>>>> + "flags.loopback = 1; " >>>>>>>> + "next;", >>>>>>>> + is_ipv4 ? 
"" : "xx", >>>>>>>> + lrp_addr_s[i], >>>>>>>> + out_ports[i]->lrp_networks.ea_s, >>>>>>>> + out_ports[i]->json_key); >>>>>>>> + >>>>>>>> + /* Add flow in table 6 to determin the right output port >>>>>>>> + * for this traffic. */ >>>>>>>> + ovn_lflow_add(lflows, od, S_ROUTER_IN_MULTIPATH, priority, >>>>>>>> + ds_cstr(&mp_match), ds_cstr(&mp_actions)); >>>>>>>> + ds_destroy(&mp_match); >>>>>>>> + ds_destroy(&mp_actions); >>>>>>>> + } >>>>>>>> + ds_destroy(&match); >>>>>>>> + ds_destroy(&actions); >>>>>>>> +} >>>>>>>> + >>>>>>>> +static bool >>>>>>>> +verify_nexthop_prefix(const struct nbrec_logical_router_static_ro >>>>>>>> ute >>>>>>>> *route, >>>>>>>> + bool *is_ipv4, char **prefix_s, unsigned int >>>>>>>> *plen) >>>>>>>> { >>>>>>>> ovs_be32 nexthop; >>>>>>>> - const char *lrp_addr_s = NULL; >>>>>>>> - unsigned int plen; >>>>>>>> - bool is_ipv4; >>>>>>>> >>>>>>>> /* Verify that the next hop is an IP address with an all-ones >>>>>>>> mask. >>>>>>>> */ >>>>>>>> - char *error = ip_parse_cidr(route->nexthop, &nexthop, &plen); >>>>>>>> + char *error = ip_parse_cidr(route->nexthop, &nexthop, plen); >>>>>>>> if (!error) { >>>>>>>> - if (plen != 32) { >>>>>>>> + if (*plen != 32) { >>>>>>>> static struct vlog_rate_limit rl = >>>>>>>> VLOG_RATE_LIMIT_INIT(5, >>>>>>>> 1); >>>>>>>> VLOG_WARN_RL(&rl, "bad next hop mask %s", >>>>>>>> route->nexthop); >>>>>>>> - return; >>>>>>>> + return false; >>>>>>>> } >>>>>>>> - is_ipv4 = true; >>>>>>>> + *is_ipv4 = true; >>>>>>>> } else { >>>>>>>> free(error); >>>>>>>> >>>>>>>> struct in6_addr ip6; >>>>>>>> - error = ipv6_parse_cidr(route->nexthop, &ip6, &plen); >>>>>>>> + error = ipv6_parse_cidr(route->nexthop, &ip6, plen); >>>>>>>> if (!error) { >>>>>>>> - if (plen != 128) { >>>>>>>> + if (*plen != 128) { >>>>>>>> static struct vlog_rate_limit rl = >>>>>>>> VLOG_RATE_LIMIT_INIT(5, 1); >>>>>>>> VLOG_WARN_RL(&rl, "bad next hop mask %s", >>>>>>>> route->nexthop); >>>>>>>> - return; >>>>>>>> + return false; >>>>>>>> } >>>>>>>> - is_ipv4 = 
false; >>>>>>>> + *is_ipv4 = false; >>>>>>>> } else { >>>>>>>> static struct vlog_rate_limit rl = >>>>>>>> VLOG_RATE_LIMIT_INIT(5, >>>>>>>> 1); >>>>>>>> VLOG_WARN_RL(&rl, "bad next hop ip address %s", >>>>>>>> route->nexthop); >>>>>>>> free(error); >>>>>>>> - return; >>>>>>>> + return false; >>>>>>>> } >>>>>>>> } >>>>>>>> >>>>>>>> - char *prefix_s; >>>>>>>> - if (is_ipv4) { >>>>>>>> + if (*is_ipv4) { >>>>>>>> ovs_be32 prefix; >>>>>>>> /* Verify that ip prefix is a valid IPv4 address. */ >>>>>>>> - error = ip_parse_cidr(route->ip_prefix, &prefix, &plen); >>>>>>>> + error = ip_parse_cidr(route->ip_prefix, &prefix, plen); >>>>>>>> if (error) { >>>>>>>> static struct vlog_rate_limit rl = >>>>>>>> VLOG_RATE_LIMIT_INIT(5, >>>>>>>> 1); >>>>>>>> VLOG_WARN_RL(&rl, "bad 'ip_prefix' in static routes >>>>>>>> %s", >>>>>>>> route->ip_prefix); >>>>>>>> free(error); >>>>>>>> - return; >>>>>>>> + return false; >>>>>>>> } >>>>>>>> - prefix_s = xasprintf(IP_FMT, IP_ARGS(prefix & >>>>>>>> be32_prefix_mask(plen))); >>>>>>>> + *prefix_s = xasprintf(IP_FMT, IP_ARGS(prefix >>>>>>>> + & >>>>>>>> be32_prefix_mask(*plen))); >>>>>>>> } else { >>>>>>>> /* Verify that ip prefix is a valid IPv6 address. 
*/ >>>>>>>> struct in6_addr prefix; >>>>>>>> - error = ipv6_parse_cidr(route->ip_prefix, &prefix, &plen); >>>>>>>> + error = ipv6_parse_cidr(route->ip_prefix, &prefix, plen); >>>>>>>> if (error) { >>>>>>>> static struct vlog_rate_limit rl = >>>>>>>> VLOG_RATE_LIMIT_INIT(5, >>>>>>>> 1); >>>>>>>> VLOG_WARN_RL(&rl, "bad 'ip_prefix' in static routes >>>>>>>> %s", >>>>>>>> route->ip_prefix); >>>>>>>> free(error); >>>>>>>> - return; >>>>>>>> + return false; >>>>>>>> } >>>>>>>> - struct in6_addr mask = ipv6_create_mask(plen); >>>>>>>> + struct in6_addr mask = ipv6_create_mask(*plen); >>>>>>>> struct in6_addr network = ipv6_addr_bitand(&prefix, &mask); >>>>>>>> - prefix_s = xmalloc(INET6_ADDRSTRLEN); >>>>>>>> - inet_ntop(AF_INET6, &network, prefix_s, INET6_ADDRSTRLEN); >>>>>>>> + *prefix_s = xmalloc(INET6_ADDRSTRLEN); >>>>>>>> + inet_ntop(AF_INET6, &network, *prefix_s, INET6_ADDRSTRLEN); >>>>>>>> + } >>>>>>>> + >>>>>>>> + return true; >>>>>>>> +} >>>>>>>> + >>>>>>>> +static void >>>>>>>> +build_static_route_flow(struct hmap *lflows, struct ovn_datapath >>>>>>>> *od, >>>>>>>> + struct hmap *ports, >>>>>>>> + const struct nbrec_logical_router_static_ro >>>>>>>> ute >>>>>>>> *route) >>>>>>>> +{ >>>>>>>> + const char *lrp_addr_s = NULL; >>>>>>>> + unsigned int plen; >>>>>>>> + bool is_ipv4; >>>>>>>> + char *prefix_s = NULL; >>>>>>>> + >>>>>>>> + if (!verify_nexthop_prefix(route, &is_ipv4, &prefix_s, &plen)) >>>>>>>> { >>>>>>>> + return; >>>>>>>> } >>>>>>>> >>>>>>>> /* Find the outgoing port. 
*/ >>>>>>>> @@ -4270,7 +4369,75 @@ build_static_route_flow(struct hmap *lflows, >>>>>>>> struct >>>>>>>> ovn_datapath *od, >>>>>>>> policy); >>>>>>>> >>>>>>>> free_prefix_s: >>>>>>>> - free(prefix_s); >>>>>>>> + if (prefix_s) { >>>>>>>> + free(prefix_s); >>>>>>>> + } >>>>>>>> +} >>>>>>>> + >>>>>>>> +static void >>>>>>>> +build_multipath_flow(struct hmap *lflows, struct ovn_datapath *od, >>>>>>>> + struct hmap *ports, >>>>>>>> + const struct nbrec_logical_router_static_ro >>>>>>>> ute >>>>>>>> *route) >>>>>>>> +{ >>>>>>>> + unsigned int plen; >>>>>>>> + bool is_ipv4; >>>>>>>> + char *prefix_s = NULL; >>>>>>>> + >>>>>>>> + if (!verify_nexthop_prefix(route, &is_ipv4, &prefix_s, &plen)) >>>>>>>> { >>>>>>>> + return; >>>>>>>> + } >>>>>>>> + >>>>>>>> + /* Find the outgoing port. */ >>>>>>>> + struct ovn_port **out_ports = xmalloc(route->n_multipath_port >>>>>>>> * >>>>>>>> + sizeof(struct >>>>>>>> ovn_port *)); >>>>>>>> + const char **lrp_addr_s = xmalloc(route->n_multipath_port * >>>>>>>> + sizeof(const char *)); >>>>>>>> + for (int i = 0; i < route->n_multipath_port; i++) { >>>>>>>> + // TODO May need to consider some ports are not found? 
>>>>>>>> + out_ports[i] = ovn_port_find(ports, >>>>>>>> route->multipath_port[i]); >>>>>>>> + if (!out_ports[i]) { >>>>>>>> + static struct vlog_rate_limit rl = >>>>>>>> VLOG_RATE_LIMIT_INIT(5, >>>>>>>> 1); >>>>>>>> + VLOG_WARN_RL(&rl, "Bad out port %s for static route >>>>>>>> %s", >>>>>>>> + route->multipath_port[i], >>>>>>>> route->ip_prefix); >>>>>>>> + goto free_ports_lrp_addr; >>>>>>>> + } >>>>>>>> + >>>>>>>> + lrp_addr_s[i] = find_lrp_member_ip(out_ports[i], >>>>>>>> route->nexthop); >>>>>>>> + if (!lrp_addr_s[i]) { >>>>>>>> + if (is_ipv4) { >>>>>>>> + if (out_ports[i]->lrp_networks.n_ipv4_addrs) { >>>>>>>> + lrp_addr_s[i] = out_ports[i]-> >>>>>>>> + lrp_networks.ipv4_addrs[0].addr_s; >>>>>>>> + } >>>>>>>> + } else { >>>>>>>> + if (out_ports[i]->lrp_networks.n_ipv6_addrs) { >>>>>>>> + lrp_addr_s[i] = out_ports[i]-> >>>>>>>> + lrp_networks.ipv6_addrs[0].addr_s; >>>>>>>> + } >>>>>>>> + } >>>>>>>> + } >>>>>>>> + if (!lrp_addr_s[i]) { >>>>>>>> + static struct vlog_rate_limit rl = >>>>>>>> VLOG_RATE_LIMIT_INIT(5, >>>>>>>> 1); >>>>>>>> + VLOG_WARN_RL(&rl, >>>>>>>> + "%s has no path for static route %s; next >>>>>>>> hop >>>>>>>> %s", >>>>>>>> + route->multipath_port[i], >>>>>>>> route->ip_prefix, >>>>>>>> + route->nexthop); >>>>>>>> + goto free_ports_lrp_addr; >>>>>>>> + } >>>>>>>> + } >>>>>>>> + >>>>>>>> + >>>>>>>> + char *policy = route->policy ? route->policy : "dst-ip"; >>>>>>>> + add_multipath_route(lflows, route->n_multipath_port, >>>>>>>> + out_ports, lrp_addr_s, od, >>>>>>>> + prefix_s, plen, route->nexthop, policy); >>>>>>>> + >>>>>>>> +free_ports_lrp_addr: >>>>>>>> + free(out_ports); >>>>>>>> + free(lrp_addr_s); >>>>>>>> + if (prefix_s) { >>>>>>>> + free(prefix_s); >>>>>>>> + } >>>>>>>> } >>>>>>>> >>>>>>>> static void >>>>>>>> @@ -5344,7 +5511,7 @@ build_lrouter_flows(struct hmap *datapaths, >>>>>>>> struct >>>>>>>> hmap *ports, >>>>>>>> } >>>>>>>> } >>>>>>>> >>>>>>>> - /* Convert the static routes to flows. 
*/ >>>>>>>> + /* Convert the static routes and multipath route to flows. */ >>>>>>>> HMAP_FOR_EACH (od, key_node, datapaths) { >>>>>>>> if (!od->nbr) { >>>>>>>> continue; >>>>>>>> @@ -5355,12 +5522,24 @@ build_lrouter_flows(struct hmap *datapaths, >>>>>>>> struct >>>>>>>> hmap *ports, >>>>>>>> >>>>>>>> route = od->nbr->static_routes[i]; >>>>>>>> build_static_route_flow(lflows, od, ports, route); >>>>>>>> + /* Logical router ingress table 5-6: Multipath Routing. >>>>>>>> + * >>>>>>>> + * If router has configured a traffic has multiple >>>>>>>> paths >>>>>>>> + * to destination. The right output port should be >>>>>>>> firgured >>>>>>>> + * out by computing IP packet's header */ >>>>>>>> + if (route->n_multipath_port > 1) { >>>>>>>> + /* Generate multipath routes in table 5,6 for >>>>>>>> + * dedicated traffic */ >>>>>>>> + build_multipath_flow(lflows, od, ports, route); >>>>>>>> + } >>>>>>>> } >>>>>>>> + /* Packets are allowed by default in table 6. */ >>>>>>>> + ovn_lflow_add(lflows, od, S_ROUTER_IN_MULTIPATH, 0, "1", >>>>>>>> "next;"); >>>>>>>> } >>>>>>>> >>>>>>>> /* XXX destination unreachable */ >>>>>>>> >>>>>>>> - /* Local router ingress table 6: ARP Resolution. >>>>>>>> + /* Local router ingress table 7: ARP Resolution. >>>>>>>> * >>>>>>>> * Any packet that reaches this table is an IP packet whose >>>>>>>> next-hop >>>>>>>> IP >>>>>>>> * address is in reg0. (ip4.dst is the final destination.) >>>>>>>> This table >>>>>>>> @@ -5555,7 +5734,7 @@ build_lrouter_flows(struct hmap *datapaths, >>>>>>>> struct >>>>>>>> hmap *ports, >>>>>>>> "get_nd(outport, xxreg0); next;"); >>>>>>>> } >>>>>>>> >>>>>>>> - /* Logical router ingress table 7: Gateway redirect. >>>>>>>> + /* Logical router ingress table 8: Gateway redirect. 
>>>>>>>> * >>>>>>>> * For traffic with outport equal to the l3dgw_port >>>>>>>> * on a distributed router, this table redirects a subset >>>>>>>> @@ -5595,7 +5774,7 @@ build_lrouter_flows(struct hmap *datapaths, >>>>>>>> struct >>>>>>>> hmap *ports, >>>>>>>> ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 0, "1", >>>>>>>> "next;"); >>>>>>>> } >>>>>>>> >>>>>>>> - /* Local router ingress table 8: ARP request. >>>>>>>> + /* Local router ingress table 9: ARP request. >>>>>>>> * >>>>>>>> * In the common case where the Ethernet destination has been >>>>>>>> resolved, >>>>>>>> * this table outputs the packet (priority 0). Otherwise, it >>>>>>>> composes >>>>>>>> diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema >>>>>>>> index a077bfb..b8bdd42 100644 >>>>>>>> --- a/ovn/ovn-nb.ovsschema >>>>>>>> +++ b/ovn/ovn-nb.ovsschema >>>>>>>> @@ -1,7 +1,7 @@ >>>>>>>> { >>>>>>>> "name": "OVN_Northbound", >>>>>>>> "version": "5.8.0", >>>>>>>> - "cksum": "2812300190 16766", >>>>>>>> + "cksum": "1967092589 16903", >>>>>>>> "tables": { >>>>>>>> "NB_Global": { >>>>>>>> "columns": { >>>>>>>> @@ -235,7 +235,9 @@ >>>>>>>> >>>>>>>> "dst-ip"]]}, >>>>>>>> "min": 0, "max": 1}}, >>>>>>>> "nexthop": {"type": "string"}, >>>>>>>> - "output_port": {"type": {"key": "string", "min": 0, >>>>>>>> "max": 1}}}, >>>>>>>> + "output_port": {"type": {"key": "string", "min": 0, >>>>>>>> "max": 1}}, >>>>>>>> + "multipath_port": {"type": {"key": "string", >>>>>>>> "min": 0, >>>>>>>> + "max": "unlimited"}}}, >>>>>>>> "isRoot": false}, >>>>>>>> "NAT": { >>>>>>>> "columns": { >>>>>>>> diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml >>>>>>>> index 9869d7e..15feb97 100644 >>>>>>>> --- a/ovn/ovn-nb.xml >>>>>>>> +++ b/ovn/ovn-nb.xml >>>>>>>> @@ -1487,6 +1487,15 @@ >>>>>>>> address as the one via which the <ref column="nexthop"/> is >>>>>>>> reachable. 
>>>>>>>> </p> >>>>>>>> </column> >>>>>>>> + <column name="multipath_port"> >>>>>>>> + <p> >>>>>>>> + The name of the <ref table="Logical_Router_Port"/> via >>>>>>>> which the >>>>>>>> packet >>>>>>>> + needs to be sent out. When it contains more than two >>>>>>>> ports, it >>>>>>>> means >>>>>>>> + packet has multiple candidate output ports. OVN uses the >>>>>>>> packet >>>>>>>> header >>>>>>>> + to determin which port the packet would be delivered to. >>>>>>>> + Currently, OVN consumes destination IP address to figure >>>>>>>> out >>>>>>>> port. >>>>>>>> + </p> >>>>>>>> + </column> >>>>>>>> </table> >>>>>>>> >>>>>>>> <table name="NAT" title="NAT rules"> >>>>>>>> -- >>>>>>>> 1.8.3.1 >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> dev mailing list >>>>>>>> [email protected] >>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>>>>>>> >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> dev mailing list >>>>>>>> [email protected] >>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
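
As a footnote to the quoted patch: the priority arithmetic in add_multipath_route() can be restated as a small standalone function. This is only a sketch of what the quoted code computes; multipath_route_priority() and MULTIPATH_PRIORITY_BOOST are names invented here and do not appear in ovn-northd.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name for the "+ 10" the patch adds so that multipath
 * flows outrank regular routes. */
#define MULTIPATH_PRIORITY_BOOST 10

/* Mirrors the arithmetic in the quoted add_multipath_route():
 * src-ip routes get plen * 2, everything else (dst-ip) gets plen * 2 + 1,
 * and the multipath boost is added on top. */
uint16_t
multipath_route_priority(unsigned int plen, const char *policy)
{
    uint16_t priority;

    assert(plen <= 128);    /* Longest possible (IPv6) prefix. */

    if (policy && !strcmp(policy, "src-ip")) {
        priority = plen * 2;
    } else {
        priority = plen * 2 + 1;
    }

    return priority + MULTIPATH_PRIORITY_BOOST;
}
```

For a /24 dst-ip route this gives 24 * 2 + 1 + 10 = 59, which is 10 above the same computation without the boost, so longer prefixes still win and the multipath flow takes precedence at equal prefix length.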
