On Thu, Sep 21, 2017 at 2:21 PM, Gao Zhenyu <[email protected]> wrote:
> I think S/N versus E/W is not the issue we should be considering now.
>
> The multipath implementation is based on the existing OVN workflows. If
> you can use a route to dispatch traffic to different nodes/logical ports,
> then multipath can do it too. Otherwise there must be a bug in multipath.
> If a static route cannot dispatch traffic to some nodes or logical ports,
> then multipath cannot do it either.
>
> I am not sure if my understanding is right: I think if you deploy a router
> only on a specific ovn-node, then traffic between A(src)---router----B(dst)
> should go through this router.
>

That is the point where I'm not sure either. In the tests I made, if you
configured a specific chassis for a router, only the N/S traffic went
through that router (so multipath would make sense there), but E/W traffic
crossing a router went directly from the origin port's chassis to the
destination port's chassis, without going through the router's chassis.

Maybe it is a bug, or maybe it was a misconfiguration on my side. I will
double-check it.

>
> Any suggestions and comments are welcome :)
>
>
> Thanks
> Zhenyu Gao
>
> 2017-09-21 19:07 GMT+08:00 Miguel Angel Ajo Pelayo <[email protected]>:
>
>> Maybe I missed something, but when I tried setting logical routers into
>> specific chassis, the E/W traffic was still handled in a distributed way
>> (from the original chassis to the destination chassis without going
>> through the router chassis); such a chassis was only used for N/S. But
>> maybe I got something wrong.
>>
>>
>> On Wed, Sep 20, 2017 at 4:48 PM, Gao Zhenyu <[email protected]>
>> wrote:
>>
>>> "
>>> But, if an ovn port in foo (chassis A) wants to talk to alice1 (chassis
>>> B),
>>> wouldn't all that E/W routing happen virtually and the end result
>>> is just a tunneled packet between chassis A and chassis B ?
>>> "
>>> [ Now the hash function is based on the dst IP; if foo1 only talks to
>>> alice1, it is a tunneled packet between chassis A and chassis B ]
>>>
>>> The benefit is that if you have two ovn-routers and those routers are ONLY
>>> deployed on chassis C and chassis D, the traffic can be separated into two
>>> paths automatically. Otherwise you need to configure static rules one by
>>> one to separate the traffic.
>>> To make a long story short, you can also do the same thing by configuring
>>> numerous static rules to separate traffic, but multipath can do it
>>> automatically.
>>>
>>> 2017-09-20 22:08 GMT+08:00 Miguel Angel Ajo Pelayo <[email protected]>
>>> :
>>>
>>>> I forgot to say thank you very much for the explanation and diagrams.
>>>>
>>>> On Wed, Sep 20, 2017 at 4:07 PM, Miguel Angel Ajo Pelayo <
>>>> [email protected]> wrote:
>>>>
>>>>> But, if an ovn port in foo (chassis A) wants to talk to alice1
>>>>> (chassis B),
>>>>> wouldn't all that E/W routing happen virtually and the end result
>>>>> is just a tunneled packet between chassis A and chassis B ?
>>>>>
>>>>> What's the benefit of multipath there if the possible failing link is
>>>>> always the connection between chassis A and chassis B ?
>>>>>
>>>>> I suspect there's something I'm missing in the picture.
>>>>>
>>>>> On Wed, Sep 20, 2017 at 3:49 PM, Gao Zhenyu <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> You can take a look at this patch that implements a testcase:
>>>>>> https://patchwork.ozlabs.org/patch/815475/
>>>>>>
>>>>>> In the testcase, we have R1, R2, R3.
>>>>>>
>>>>>> R1 and R2 are connected to each other via LS "join" in the
>>>>>> 20.0.0.0/24 network.
>>>>>> R1 and R3 are connected to each other via LS "join2" in the
>>>>>> 20.0.0.0/24 network.
>>>>>> R1 has switch foo (192.168.1.0/24) connected to it. R2 and R3
>>>>>> have alice (172.16.1.0/24) connected to them.
>>>>>> R2 and R3 are gateway routers.
>>>>>>
>>>>>> A packet sent to alice1/alice2 from foo has multiple paths to the
>>>>>> destination:
>>>>>> 1. foo-->R1-->join-->R2-->alice.
>>>>>> 2. foo-->R1-->join2-->R3-->alice.
>>>>>>
>>>>>> This testcase simulates two packets: one's destination is
>>>>>> 172.16.1.2, the other's is 172.16.1.4. The multipath that was
>>>>>> configured in R1 can separate that traffic between R2 and R3. In the
>>>>>> end, the 172.16.1.2 packet travels path 2 and the 172.16.1.4 packet
>>>>>> travels path 1.
>>>>>>
>>>>>>     +------+
>>>>>>     | foo  |
>>>>>>     +------+
>>>>>>        |
>>>>>>        |
>>>>>>     +------+
>>>>>>     |  R1  |---------+
>>>>>>     +------+         |
>>>>>>        |             |
>>>>>>        |             |
>>>>>>     +------+     +-------+
>>>>>>     | join |     | join2 |
>>>>>>     +------+     +-------+
>>>>>>        |             |
>>>>>>        |             |
>>>>>>     +------+     +-------+
>>>>>>     |  R2  |     |  R3   |
>>>>>>     +------+     +-------+
>>>>>>        |             |
>>>>>>        |             |
>>>>>>     +-----------------+
>>>>>>     |      alice      |
>>>>>>     +-----------------+
>>>>>>        |             |
>>>>>>      alice1        alice2
>>>>>>
>>>>>> Please let me know if you have any questions about it. :)
>>>>>>
>>>>>> Thanks
>>>>>> Zhenyu Gao
>>>>>>
>>>>>> 2017-09-20 20:58 GMT+08:00 Miguel Angel Ajo Pelayo <
>>>>>> [email protected]>:
>>>>>>
>>>>>>> Can you share an example of how this would benefit E/W routing? I'm
>>>>>>> just not seeing the specific use case myself out of ignorance.
>>>>>>>
>>>>>>> It'd be great if you could explain how it would work between several
>>>>>>> ports in the networks and routers (maybe a diagram?); otherwise I
>>>>>>> can't be really helpful reviewing :)
>>>>>>>
>>>>>>> Cheers, and thanks for the patience.
>>>>>>>
>>>>>>> On Wed, Sep 20, 2017 at 12:25 PM, Gao Zhenyu <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Thanks for the suggestions!
>>>>>>>>
>>>>>>>> Not every logical port has a real ofp_port connected to it, and the
>>>>>>>> bundle_load/bundle actions need real OVS ports.
>>>>>>>> This is especially true for OVN router ports: all router ports are
>>>>>>>> virtual ports, which are just a number/reg in our OVS flows.
>>>>>>>>
>>>>>>>> This implementation of multipath can separate OVN east-west traffic;
>>>>>>>> it helps dispatch traffic to gateways and routers easily.
>>>>>>>>
>>>>>>>> For north-south traffic, we can use the bundle/bundle_load actions to
>>>>>>>> take the remote tunnel up/down status into account. I would like to
>>>>>>>> do it step by step and implement that in my next patch series.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Zhenyu Gao
>>>>>>>>
>>>>>>>> 2017-09-20 17:53 GMT+08:00 Miguel Angel Ajo Pelayo <
>>>>>>>> [email protected]>:
>>>>>>>>
>>>>>>>>> I'm not very familiar with multipath implementations,
>>>>>>>>>
>>>>>>>>> but would it be possible to use the bundle() output action with the
>>>>>>>>> hrw algorithm instead of a multipath calculation into a register?
>>>>>>>>>
>>>>>>>>> I say this because if you look at lib/multipath.c and lib/bundle.c
>>>>>>>>> you will find that bundle.c considers the up/down status
>>>>>>>>> (slave_enabled check) of the links.
>>>>>>>>>
>>>>>>>>> That way the controller doesn't need to modify any flow based on
>>>>>>>>> link status.
>>>>>>>>>
>>>>>>>>> On Wed, Sep 20, 2017 at 5:45 AM, Gao Zhenyu <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks for the questions.
>>>>>>>>>>
>>>>>>>>>> The multipath_port can be set via ovn-nbctl, like:
>>>>>>>>>>     ovn-nbctl -- --id=@lrt create Logical_Router_Static_Route \
>>>>>>>>>>         ip_prefix=0.0.0.0/0 nexthop=10.88.77.1 \
>>>>>>>>>>         multipath_port=[mp1,mp2] \
>>>>>>>>>>         -- add Logical_Router edge1 static_routes @lrt
>>>>>>>>>> This patch doesn't implement an ovn-nbctl command to configure
>>>>>>>>>> multipath routing yet, because I am still considering reusing
>>>>>>>>>> nexthop or output_port (making them array entries), and want to
>>>>>>>>>> collect suggestions on it.
>>>>>>>>>>
>>>>>>>>>> About the status of the next hop, I would like to introduce
>>>>>>>>>> bundle_load and bfd to handle it later.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Zhenyu Gao
>>>>>>>>>>
>>>>>>>>>> 2017-09-20 11:13 GMT+08:00 <[email protected]>:
>>>>>>>>>>
>>>>>>>>>>> How to configure multipath_port in static_route?
>>>>>>>>>>> I think the multipath
>>>>>>>>>>> can be figured out from the existing data of static_route, so there
>>>>>>>>>>> may be no need to add this multipath_port column.
>>>>>>>>>>>
>>>>>>>>>>> And I think we should add a status column to indicate the
>>>>>>>>>>> nexthop state.
>>>>>>>>>>> When some of the nexthops in a multipath are down, ovn should
>>>>>>>>>>> change the corresponding flows.
>>>>>>>>>>>
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Zhenyu Gao <[email protected]>
>>>>>>>>>>> Sent by: [email protected]
>>>>>>>>>>> 2017/09/19 19:37
>>>>>>>>>>>
>>>>>>>>>>> To: [email protected], [email protected],
>>>>>>>>>>> [email protected], [email protected], [email protected],
>>>>>>>>>>> Cc:
>>>>>>>>>>> Subject: [ovs-dev] [PATCH v1 1/3] Add multipath static
>>>>>>>>>>> router in OVN northd and north-db
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 1. ovn-nb.ovsschema was updated to add new field multipath_port.
>>>>>>>>>>> 2. Add multipath feature in ovn-northd part. northd generates
>>>>>>>>>>>    multipath flows to dispatch traffic by using packet's IP dst
>>>>>>>>>>>    address if user set Logical_Router_Static_Route's multipath_port
>>>>>>>>>>>    with ports.
>>>>>>>>>>> 3. Add new table(lr_in_multipath) in ovn-northd's router ingress
>>>>>>>>>>>    stages to dispatch traffic to ports.
>>>>>>>>>>> 4. Add multipath flow in Table 5(lr_in_ip_routing) and store hash
>>>>>>>>>>>    result into reg0. reg9[2] was used to indicate packet which need
>>>>>>>>>>>    dispatching.
>>>>>>>>>>> 5. Add multipath feature description in ovn/northd/ovn-northd.8.xml
>>>>>>>>>>>    and ovn/ovn-nb.xml
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Zhenyu Gao <[email protected]>
>>>>>>>>>>> ---
>>>>>>>>>>>  ovn/northd/ovn-northd.8.xml |  67 +++++++++++-
>>>>>>>>>>>  ovn/northd/ovn-northd.c     | 245 ++++++++++++++++++++++++++++++++++++++------
>>>>>>>>>>>  ovn/ovn-nb.ovsschema        |   6 +-
>>>>>>>>>>>  ovn/ovn-nb.xml              |   9 ++
>>>>>>>>>>>  4 files changed, 289 insertions(+), 38 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
>>>>>>>>>>> index 0d85ec0..b1ce9a9 100644
>>>>>>>>>>> --- a/ovn/northd/ovn-northd.8.xml
>>>>>>>>>>> +++ b/ovn/northd/ovn-northd.8.xml
>>>>>>>>>>> @@ -1598,6 +1598,9 @@ icmp4 {
>>>>>>>>>>>            port (ingress table <code>ARP Request</code> will generate an ARP
>>>>>>>>>>>            request, if needed, with <code>reg0</code> as the target protocol
>>>>>>>>>>>            address and <code>reg1</code> as the source protocol address).
>>>>>>>>>>> +          An IP route can be configured with multiple paths to the next hop.
>>>>>>>>>>> +          If a packet has multiple paths to its destination, OVN assigns the
>>>>>>>>>>> +          port index to reg0 to indicate the packet's output port in table 6.
>>>>>>>>>>>          </p>
>>>>>>>>>>>
>>>>>>>>>>>          <p>
>>>>>>>>>>> @@ -1617,6 +1620,28 @@ icmp4 {
>>>>>>>>>>>
>>>>>>>>>>>        <li>
>>>>>>>>>>>          <p>
>>>>>>>>>>> +          IPv4/IPv6 multipath routing table.  For each route to IPv4/IPv6
>>>>>>>>>>> +          network <var>N</var> with netmask <var>M</var>, on multipath port
>>>>>>>>>>> +          <var>P</var> with IP address <var>A</var> and Ethernet
>>>>>>>>>>> +          address <var>E</var>, a logical flow with match
>>>>>>>>>>> +          <code>ip4.dst == <var>N</var>/<var>M</var></code>, whose priority
>>>>>>>>>>> +          is the number of 1-bits in <var>M</var> plus 10,
>>>>>>>>>>> +          has the following actions:
>>>>>>>>>>> +        </p>
>>>>>>>>>>> +
>>>>>>>>>>> +        <pre>
>>>>>>>>>>> +ip.ttl--;
>>>>>>>>>>> +multipath (nw_dst, 0, modulo_n, <var>n_links</var>, 0, reg0);
>>>>>>>>>>> +reg9[2] = 1;
>>>>>>>>>>> +next;
>>>>>>>>>>> +        </pre>
>>>>>>>>>>> +        <p>
>>>>>>>>>>> +          <var>n_links</var> is the number of multipath ports.
>>>>>>>>>>> +        </p>
>>>>>>>>>>> +      </li>
>>>>>>>>>>> +
>>>>>>>>>>> +      <li>
>>>>>>>>>>> +        <p>
>>>>>>>>>>>            IPv4 routing table.  For each route to IPv4 network <var>N</var> with
>>>>>>>>>>>            netmask <var>M</var>, on router port <var>P</var> with IP address
>>>>>>>>>>>            <var>A</var> and Ethernet
>>>>>>>>>>> @@ -1686,7 +1711,43 @@ next;
>>>>>>>>>>>        </li>
>>>>>>>>>>>      </ul>
>>>>>>>>>>>
>>>>>>>>>>> -    <h3>Ingress Table 6: ARP/ND Resolution</h3>
>>>>>>>>>>> +    <h3>Ingress Table 6: Multipath</h3>
>>>>>>>>>>> +    <p>
>>>>>>>>>>> +      Any packet that reaches this table with reg9[2]=1 is an IP packet
>>>>>>>>>>> +      that the following flows route to the corresponding port. This
>>>>>>>>>>> +      table implements dispatching by consuming reg0.
>>>>>>>>>>> +    </p>
>>>>>>>>>>> +
>>>>>>>>>>> +    <ul>
>>>>>>>>>>> +      <li>
>>>>>>>>>>> +        <p>
>>>>>>>>>>> +          A flow matching netmask <var>M</var>, IP address <var>A</var> and
>>>>>>>>>>> +          <code>reg9[2] = 1</code>, with priority above 1, has the following
>>>>>>>>>>> +          actions:
>>>>>>>>>>> +        </p>
>>>>>>>>>>> +
>>>>>>>>>>> +        <pre>
>>>>>>>>>>> +reg0 = <var>G</var>;
>>>>>>>>>>> +reg1 = <var>A</var>;
>>>>>>>>>>> +eth.src = <var>E</var>;
>>>>>>>>>>> +outport = <var>P</var>;
>>>>>>>>>>> +flags.loopback = 1;
>>>>>>>>>>> +next;
>>>>>>>>>>> +        </pre>
>>>>>>>>>>> +
>>>>>>>>>>> +        <p>
>>>>>>>>>>> +          <var>G</var> is the gateway IP address. <var>A</var>, <var>E</var>
>>>>>>>>>>> +          and <var>P</var> are the values that were described in multipath
>>>>>>>>>>> +          routing in table 5.
>>>>>>>>>>> +        </p>
>>>>>>>>>>> +
>>>>>>>>>>> +        <p>
>>>>>>>>>>> +          A priority-0 logical flow with match <code>1</code> has actions <code>next;</code>.
>>>>>>>>>>> +        </p>
>>>>>>>>>>> +      </li>
>>>>>>>>>>> +    </ul>
>>>>>>>>>>> +
>>>>>>>>>>> +    <h3>Ingress Table 7: ARP/ND Resolution</h3>
>>>>>>>>>>>
>>>>>>>>>>>      <p>
>>>>>>>>>>>        Any packet that reaches this table is an IP packet whose next-hop
>>>>>>>>>>> @@ -1779,7 +1840,7 @@ next;
>>>>>>>>>>>        </li>
>>>>>>>>>>>      </ul>
>>>>>>>>>>>
>>>>>>>>>>> -    <h3>Ingress Table 7: Gateway Redirect</h3>
>>>>>>>>>>> +    <h3>Ingress Table 8: Gateway Redirect</h3>
>>>>>>>>>>>
>>>>>>>>>>>      <p>
>>>>>>>>>>>        For distributed logical routers where one of the logical router
>>>>>>>>>>> @@ -1836,7 +1897,7 @@ next;
>>>>>>>>>>>        </li>
>>>>>>>>>>>      </ul>
>>>>>>>>>>>
>>>>>>>>>>> -    <h3>Ingress Table 8: ARP Request</h3>
>>>>>>>>>>> +    <h3>Ingress Table 9: ARP Request</h3>
>>>>>>>>>>>
>>>>>>>>>>>      <p>
>>>>>>>>>>>        In the common case where the Ethernet destination has been resolved, this
>>>>>>>>>>> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>>>>>>>>>>> index 49e4ac3..44d1fd4 100644
>>>>>>>>>>> --- a/ovn/northd/ovn-northd.c
>>>>>>>>>>> +++ b/ovn/northd/ovn-northd.c
>>>>>>>>>>> @@ -135,9 +135,10 @@ enum ovn_stage {
>>>>>>>>>>>      PIPELINE_STAGE(ROUTER, IN,  UNSNAT,      3, "lr_in_unsnat")      \
>>>>>>>>>>>      PIPELINE_STAGE(ROUTER, IN,  DNAT,        4, "lr_in_dnat")        \
>>>>>>>>>>>      PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  5, "lr_in_ip_routing")  \
>>>>>>>>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 6, "lr_in_arp_resolve") \
>>>>>>>>>>> -    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 7, "lr_in_gw_redirect") \
>>>>>>>>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 8, "lr_in_arp_request") \
>>>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  MULTIPATH,   6, "lr_in_multipath")   \
>>>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 7, "lr_in_arp_resolve") \
>>>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 8, "lr_in_gw_redirect") \
>>>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 9, "lr_in_arp_request") \
>>>>>>>>>>>                                                                        \
>>>>>>>>>>>      /* Logical router egress stages. */                               \
>>>>>>>>>>>      PIPELINE_STAGE(ROUTER, OUT, UNDNAT,      0, "lr_out_undnat")     \
>>>>>>>>>>> @@ -173,6 +174,11 @@ enum ovn_stage {
>>>>>>>>>>>   * one of the logical router's own IP addresses. */
>>>>>>>>>>>  #define REGBIT_EGRESS_LOOPBACK  "reg9[1]"
>>>>>>>>>>>
>>>>>>>>>>> +/* Indicates that the multipath action has processed this packet and stored
>>>>>>>>>>> + * the hash result in another regX.  That hash result should be consumed to
>>>>>>>>>>> + * determine the right output port. */
>>>>>>>>>>> +#define REGBIT_MULTIPATH        "reg9[2]"
>>>>>>>>>>> +
>>>>>>>>>>>  /* Returns an "enum ovn_stage" built from the arguments. */
>>>>>>>>>>>  static enum ovn_stage
>>>>>>>>>>>  ovn_stage_build(enum ovn_datapath_type dp_type, enum ovn_pipeline pipeline,
>>>>>>>>>>> @@ -4142,72 +4148,165 @@ add_route(struct hmap *lflows, const struct ovn_port *op,
>>>>>>>>>>>  }
>>>>>>>>>>>
>>>>>>>>>>>  static void
>>>>>>>>>>> -build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
>>>>>>>>>>> -                        struct hmap *ports,
>>>>>>>>>>> -                        const struct nbrec_logical_router_static_route *route)
>>>>>>>>>>> +add_multipath_route(struct hmap *lflows, uint32_t port_num,
>>>>>>>>>>> +                    struct ovn_port **out_ports,
>>>>>>>>>>> +                    const char **lrp_addr_s,
>>>>>>>>>>> +                    struct ovn_datapath *od,
>>>>>>>>>>> +                    const char *network_s, int plen,
>>>>>>>>>>> +                    const char *gateway, const char *policy)
>>>>>>>>>>> +{
>>>>>>>>>>> +    bool is_ipv4 = strchr(network_s, '.') ? true : false;
>>>>>>>>>>> +    struct ds match = DS_EMPTY_INITIALIZER;
>>>>>>>>>>> +    const char *dir;
>>>>>>>>>>> +    uint16_t priority;
>>>>>>>>>>> +
>>>>>>>>>>> +    if (policy && !strcmp(policy, "src-ip")) {
>>>>>>>>>>> +        dir = "src";
>>>>>>>>>>> +        priority = plen * 2;
>>>>>>>>>>> +    } else {
>>>>>>>>>>> +        dir = "dst";
>>>>>>>>>>> +        priority = (plen * 2) + 1;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    /* Set higher priority than regular route. */
>>>>>>>>>>> +    priority += 10;
>>>>>>>>>>> +
>>>>>>>>>>> +    ds_put_format(&match, "ip%s.%s == %s/%d", is_ipv4 ? "4" : "6", dir,
>>>>>>>>>>> +                  network_s, plen);
>>>>>>>>>>> +
>>>>>>>>>>> +    struct ds actions = DS_EMPTY_INITIALIZER;
>>>>>>>>>>> +
>>>>>>>>>>> +    ds_put_format(&actions, "ip.ttl--; ");
>>>>>>>>>>> +    ds_put_format(&actions,
>>>>>>>>>>> +                  "multipath (nw_dst, 0, modulo_n, %u, 0, reg0); "
>>>>>>>>>>> +                  "%s = 1; "
>>>>>>>>>>> +                  "next;",
>>>>>>>>>>> +                  port_num, REGBIT_MULTIPATH);
>>>>>>>>>>> +
>>>>>>>>>>> +    /* The priority here is calculated to implement longest-prefix-match
>>>>>>>>>>> +     * routing. */
>>>>>>>>>>> +    ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_ROUTING, priority,
>>>>>>>>>>> +                  ds_cstr(&match), ds_cstr(&actions));
>>>>>>>>>>> +
>>>>>>>>>>> +    for (int i = 0; i < port_num; i++) {
>>>>>>>>>>> +        struct ds mp_match = DS_EMPTY_INITIALIZER;
>>>>>>>>>>> +        struct ds mp_actions = DS_EMPTY_INITIALIZER;
>>>>>>>>>>> +
>>>>>>>>>>> +        ds_put_format(&mp_match, "%s == 1 && reg0 == %d && ",
>>>>>>>>>>> +                      REGBIT_MULTIPATH, i);
>>>>>>>>>>> +        ds_put_format(&mp_match, "ip%s.%s == %s/%d",
>>>>>>>>>>> +                      is_ipv4 ? "4" : "6", dir,
>>>>>>>>>>> +                      network_s, plen);
>>>>>>>>>>> +
>>>>>>>>>>> +        ds_put_format(&mp_actions, "%sreg0 = ", is_ipv4 ? "" : "xx");
>>>>>>>>>>> +        if (gateway) {
>>>>>>>>>>> +            ds_put_cstr(&mp_actions, gateway);
>>>>>>>>>>> +        } else {
>>>>>>>>>>> +            ds_put_format(&mp_actions, "ip%s.dst", is_ipv4 ? "4" : "6");
>>>>>>>>>>> +        }
>>>>>>>>>>> +
>>>>>>>>>>> +        ds_put_format(&mp_actions, "; "
>>>>>>>>>>> +                      "%sreg1 = %s; "
>>>>>>>>>>> +                      "eth.src = %s; "
>>>>>>>>>>> +                      "outport = %s; "
>>>>>>>>>>> +                      "flags.loopback = 1; "
>>>>>>>>>>> +                      "next;",
>>>>>>>>>>> +                      is_ipv4 ? "" : "xx",
>>>>>>>>>>> +                      lrp_addr_s[i],
>>>>>>>>>>> +                      out_ports[i]->lrp_networks.ea_s,
>>>>>>>>>>> +                      out_ports[i]->json_key);
>>>>>>>>>>> +
>>>>>>>>>>> +        /* Add flow in table 6 to determine the right output port
>>>>>>>>>>> +         * for this traffic. */
>>>>>>>>>>> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_MULTIPATH, priority,
>>>>>>>>>>> +                      ds_cstr(&mp_match), ds_cstr(&mp_actions));
>>>>>>>>>>> +        ds_destroy(&mp_match);
>>>>>>>>>>> +        ds_destroy(&mp_actions);
>>>>>>>>>>> +    }
>>>>>>>>>>> +    ds_destroy(&match);
>>>>>>>>>>> +    ds_destroy(&actions);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static bool
>>>>>>>>>>> +verify_nexthop_prefix(const struct nbrec_logical_router_static_route *route,
>>>>>>>>>>> +                      bool *is_ipv4, char **prefix_s, unsigned int *plen)
>>>>>>>>>>>  {
>>>>>>>>>>>      ovs_be32 nexthop;
>>>>>>>>>>> -    const char *lrp_addr_s = NULL;
>>>>>>>>>>> -    unsigned int plen;
>>>>>>>>>>> -    bool is_ipv4;
>>>>>>>>>>>
>>>>>>>>>>>      /* Verify that the next hop is an IP address with an all-ones mask. */
>>>>>>>>>>> -    char *error = ip_parse_cidr(route->nexthop, &nexthop, &plen);
>>>>>>>>>>> +    char *error = ip_parse_cidr(route->nexthop, &nexthop, plen);
>>>>>>>>>>>      if (!error) {
>>>>>>>>>>> -        if (plen != 32) {
>>>>>>>>>>> +        if (*plen != 32) {
>>>>>>>>>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>>>>              VLOG_WARN_RL(&rl, "bad next hop mask %s", route->nexthop);
>>>>>>>>>>> -            return;
>>>>>>>>>>> +            return false;
>>>>>>>>>>>          }
>>>>>>>>>>> -        is_ipv4 = true;
>>>>>>>>>>> +        *is_ipv4 = true;
>>>>>>>>>>>      } else {
>>>>>>>>>>>          free(error);
>>>>>>>>>>>
>>>>>>>>>>>          struct in6_addr ip6;
>>>>>>>>>>> -        error = ipv6_parse_cidr(route->nexthop, &ip6, &plen);
>>>>>>>>>>> +        error = ipv6_parse_cidr(route->nexthop, &ip6, plen);
>>>>>>>>>>>          if (!error) {
>>>>>>>>>>> -            if (plen != 128) {
>>>>>>>>>>> +            if (*plen != 128) {
>>>>>>>>>>>                  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>>>>                  VLOG_WARN_RL(&rl, "bad next hop mask %s", route->nexthop);
>>>>>>>>>>> -                return;
>>>>>>>>>>> +                return false;
>>>>>>>>>>>              }
>>>>>>>>>>> -            is_ipv4 = false;
>>>>>>>>>>> +            *is_ipv4 = false;
>>>>>>>>>>>          } else {
>>>>>>>>>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>>>>              VLOG_WARN_RL(&rl, "bad next hop ip address %s", route->nexthop);
>>>>>>>>>>>              free(error);
>>>>>>>>>>> -            return;
>>>>>>>>>>> +            return false;
>>>>>>>>>>>          }
>>>>>>>>>>>      }
>>>>>>>>>>>
>>>>>>>>>>> -    char *prefix_s;
>>>>>>>>>>> -    if (is_ipv4) {
>>>>>>>>>>> +    if (*is_ipv4) {
>>>>>>>>>>>          ovs_be32 prefix;
>>>>>>>>>>>          /* Verify that ip prefix is a valid IPv4 address. */
>>>>>>>>>>> -        error = ip_parse_cidr(route->ip_prefix, &prefix, &plen);
>>>>>>>>>>> +        error = ip_parse_cidr(route->ip_prefix, &prefix, plen);
>>>>>>>>>>>          if (error) {
>>>>>>>>>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>>>>              VLOG_WARN_RL(&rl, "bad 'ip_prefix' in static routes %s",
>>>>>>>>>>>                           route->ip_prefix);
>>>>>>>>>>>              free(error);
>>>>>>>>>>> -            return;
>>>>>>>>>>> +            return false;
>>>>>>>>>>>          }
>>>>>>>>>>> -        prefix_s = xasprintf(IP_FMT, IP_ARGS(prefix & be32_prefix_mask(plen)));
>>>>>>>>>>> +        *prefix_s = xasprintf(IP_FMT, IP_ARGS(prefix
>>>>>>>>>>> +                                              & be32_prefix_mask(*plen)));
>>>>>>>>>>>      } else {
>>>>>>>>>>>          /* Verify that ip prefix is a valid IPv6 address. */
>>>>>>>>>>>          struct in6_addr prefix;
>>>>>>>>>>> -        error = ipv6_parse_cidr(route->ip_prefix, &prefix, &plen);
>>>>>>>>>>> +        error = ipv6_parse_cidr(route->ip_prefix, &prefix, plen);
>>>>>>>>>>>          if (error) {
>>>>>>>>>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>>>>              VLOG_WARN_RL(&rl, "bad 'ip_prefix' in static routes %s",
>>>>>>>>>>>                           route->ip_prefix);
>>>>>>>>>>>              free(error);
>>>>>>>>>>> -            return;
>>>>>>>>>>> +            return false;
>>>>>>>>>>>          }
>>>>>>>>>>> -        struct in6_addr mask = ipv6_create_mask(plen);
>>>>>>>>>>> +        struct in6_addr mask = ipv6_create_mask(*plen);
>>>>>>>>>>>          struct in6_addr network = ipv6_addr_bitand(&prefix, &mask);
>>>>>>>>>>> -        prefix_s = xmalloc(INET6_ADDRSTRLEN);
>>>>>>>>>>> -        inet_ntop(AF_INET6, &network, prefix_s, INET6_ADDRSTRLEN);
>>>>>>>>>>> +        *prefix_s = xmalloc(INET6_ADDRSTRLEN);
>>>>>>>>>>> +        inet_ntop(AF_INET6, &network, *prefix_s, INET6_ADDRSTRLEN);
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    return true;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static void
>>>>>>>>>>> +build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
>>>>>>>>>>> +                        struct hmap *ports,
>>>>>>>>>>> +                        const struct nbrec_logical_router_static_route *route)
>>>>>>>>>>> +{
>>>>>>>>>>> +    const char *lrp_addr_s = NULL;
>>>>>>>>>>> +    unsigned int plen;
>>>>>>>>>>> +    bool is_ipv4;
>>>>>>>>>>> +    char *prefix_s = NULL;
>>>>>>>>>>> +
>>>>>>>>>>> +    if (!verify_nexthop_prefix(route, &is_ipv4, &prefix_s, &plen)) {
>>>>>>>>>>> +        return;
>>>>>>>>>>>      }
>>>>>>>>>>>
>>>>>>>>>>>      /* Find the outgoing port. */
>>>>>>>>>>> @@ -4270,7 +4369,75 @@ build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
>>>>>>>>>>>                    policy);
>>>>>>>>>>>
>>>>>>>>>>>  free_prefix_s:
>>>>>>>>>>> -    free(prefix_s);
>>>>>>>>>>> +    if (prefix_s) {
>>>>>>>>>>> +        free(prefix_s);
>>>>>>>>>>> +    }
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static void
>>>>>>>>>>> +build_multipath_flow(struct hmap *lflows, struct ovn_datapath *od,
>>>>>>>>>>> +                     struct hmap *ports,
>>>>>>>>>>> +                     const struct nbrec_logical_router_static_route *route)
>>>>>>>>>>> +{
>>>>>>>>>>> +    unsigned int plen;
>>>>>>>>>>> +    bool is_ipv4;
>>>>>>>>>>> +    char *prefix_s = NULL;
>>>>>>>>>>> +
>>>>>>>>>>> +    if (!verify_nexthop_prefix(route, &is_ipv4, &prefix_s, &plen)) {
>>>>>>>>>>> +        return;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    /* Find the outgoing port. */
>>>>>>>>>>> +    struct ovn_port **out_ports = xmalloc(route->n_multipath_port *
>>>>>>>>>>> +                                          sizeof(struct ovn_port *));
>>>>>>>>>>> +    const char **lrp_addr_s = xmalloc(route->n_multipath_port *
>>>>>>>>>>> +                                      sizeof(const char *));
>>>>>>>>>>> +    for (int i = 0; i < route->n_multipath_port; i++) {
>>>>>>>>>>> +        // TODO May need to consider some ports are not found?
>>>>>>>>>>> +        out_ports[i] = ovn_port_find(ports, route->multipath_port[i]);
>>>>>>>>>>> +        if (!out_ports[i]) {
>>>>>>>>>>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>>>> +            VLOG_WARN_RL(&rl, "Bad out port %s for static route %s",
>>>>>>>>>>> +                         route->multipath_port[i], route->ip_prefix);
>>>>>>>>>>> +            goto free_ports_lrp_addr;
>>>>>>>>>>> +        }
>>>>>>>>>>> +
>>>>>>>>>>> +        lrp_addr_s[i] = find_lrp_member_ip(out_ports[i], route->nexthop);
>>>>>>>>>>> +        if (!lrp_addr_s[i]) {
>>>>>>>>>>> +            if (is_ipv4) {
>>>>>>>>>>> +                if (out_ports[i]->lrp_networks.n_ipv4_addrs) {
>>>>>>>>>>> +                    lrp_addr_s[i] = out_ports[i]->
>>>>>>>>>>> +                        lrp_networks.ipv4_addrs[0].addr_s;
>>>>>>>>>>> +                }
>>>>>>>>>>> +            } else {
>>>>>>>>>>> +                if (out_ports[i]->lrp_networks.n_ipv6_addrs) {
>>>>>>>>>>> +                    lrp_addr_s[i] = out_ports[i]->
>>>>>>>>>>> +                        lrp_networks.ipv6_addrs[0].addr_s;
>>>>>>>>>>> +                }
>>>>>>>>>>> +            }
>>>>>>>>>>> +        }
>>>>>>>>>>> +        if (!lrp_addr_s[i]) {
>>>>>>>>>>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>>>> +            VLOG_WARN_RL(&rl,
>>>>>>>>>>> +                         "%s has no path for static route %s; next hop %s",
>>>>>>>>>>> +                         route->multipath_port[i], route->ip_prefix,
>>>>>>>>>>> +                         route->nexthop);
>>>>>>>>>>> +            goto free_ports_lrp_addr;
>>>>>>>>>>> +        }
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +
>>>>>>>>>>> +    char *policy = route->policy ? route->policy : "dst-ip";
>>>>>>>>>>> +    add_multipath_route(lflows, route->n_multipath_port,
>>>>>>>>>>> +                        out_ports, lrp_addr_s, od,
>>>>>>>>>>> +                        prefix_s, plen, route->nexthop, policy);
>>>>>>>>>>> +
>>>>>>>>>>> +free_ports_lrp_addr:
>>>>>>>>>>> +    free(out_ports);
>>>>>>>>>>> +    free(lrp_addr_s);
>>>>>>>>>>> +    if (prefix_s) {
>>>>>>>>>>> +        free(prefix_s);
>>>>>>>>>>> +    }
>>>>>>>>>>>  }
>>>>>>>>>>>
>>>>>>>>>>>  static void
>>>>>>>>>>> @@ -5344,7 +5511,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>>>>>>>>>>>          }
>>>>>>>>>>>      }
>>>>>>>>>>>
>>>>>>>>>>> -    /* Convert the static routes to flows. */
>>>>>>>>>>> +    /* Convert the static routes and multipath routes to flows. */
>>>>>>>>>>>      HMAP_FOR_EACH (od, key_node, datapaths) {
>>>>>>>>>>>          if (!od->nbr) {
>>>>>>>>>>>              continue;
>>>>>>>>>>> @@ -5355,12 +5522,24 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>>>>>>>>>>>
>>>>>>>>>>>              route = od->nbr->static_routes[i];
>>>>>>>>>>>              build_static_route_flow(lflows, od, ports, route);
>>>>>>>>>>> +            /* Logical router ingress table 5-6: Multipath Routing.
>>>>>>>>>>> +             *
>>>>>>>>>>> +             * If the router is configured with multiple paths to a
>>>>>>>>>>> +             * destination, the right output port is figured out from
>>>>>>>>>>> +             * the IP packet's header. */
>>>>>>>>>>> +            if (route->n_multipath_port > 1) {
>>>>>>>>>>> +                /* Generate multipath routes in table 5,6 for
>>>>>>>>>>> +                 * dedicated traffic */
>>>>>>>>>>> +                build_multipath_flow(lflows, od, ports, route);
>>>>>>>>>>> +            }
>>>>>>>>>>>          }
>>>>>>>>>>> +        /* Packets are allowed by default in table 6. */
>>>>>>>>>>> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_MULTIPATH, 0, "1", "next;");
>>>>>>>>>>>      }
>>>>>>>>>>>
>>>>>>>>>>>      /* XXX destination unreachable */
>>>>>>>>>>>
>>>>>>>>>>> -    /* Local router ingress table 6: ARP Resolution.
>>>>>>>>>>> +    /* Local router ingress table 7: ARP Resolution.
>>>>>>>>>>>       *
>>>>>>>>>>>       * Any packet that reaches this table is an IP packet whose next-hop IP
>>>>>>>>>>>       * address is in reg0. (ip4.dst is the final destination.) This table
>>>>>>>>>>> @@ -5555,7 +5734,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>>>>>>>>>>>                        "get_nd(outport, xxreg0); next;");
>>>>>>>>>>>      }
>>>>>>>>>>>
>>>>>>>>>>> -    /* Logical router ingress table 7: Gateway redirect.
>>>>>>>>>>> +    /* Logical router ingress table 8: Gateway redirect.
>>>>>>>>>>>       *
>>>>>>>>>>>       * For traffic with outport equal to the l3dgw_port
>>>>>>>>>>>       * on a distributed router, this table redirects a subset
>>>>>>>>>>> @@ -5595,7 +5774,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>>>>>>>>>>>          ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 0, "1", "next;");
>>>>>>>>>>>      }
>>>>>>>>>>>
>>>>>>>>>>> -    /* Local router ingress table 8: ARP request.
>>>>>>>>>>> +    /* Local router ingress table 9: ARP request.
>>>>>>>>>>>       *
>>>>>>>>>>>       * In the common case where the Ethernet destination has been resolved,
>>>>>>>>>>>       * this table outputs the packet (priority 0). Otherwise, it composes
>>>>>>>>>>> diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
>>>>>>>>>>> index a077bfb..b8bdd42 100644
>>>>>>>>>>> --- a/ovn/ovn-nb.ovsschema
>>>>>>>>>>> +++ b/ovn/ovn-nb.ovsschema
>>>>>>>>>>> @@ -1,7 +1,7 @@
>>>>>>>>>>>  {
>>>>>>>>>>>      "name": "OVN_Northbound",
>>>>>>>>>>>      "version": "5.8.0",
>>>>>>>>>>> -    "cksum": "2812300190 16766",
>>>>>>>>>>> +    "cksum": "1967092589 16903",
>>>>>>>>>>>      "tables": {
>>>>>>>>>>>          "NB_Global": {
>>>>>>>>>>>              "columns": {
>>>>>>>>>>> @@ -235,7 +235,9 @@
>>>>>>>>>>>                                                              "dst-ip"]]},
>>>>>>>>>>>                                      "min": 0, "max": 1}},
>>>>>>>>>>>                  "nexthop": {"type": "string"},
>>>>>>>>>>> -                "output_port": {"type": {"key": "string", "min": 0, "max": 1}}},
>>>>>>>>>>> +                "output_port": {"type": {"key": "string", "min": 0, "max": 1}},
>>>>>>>>>>> +                "multipath_port": {"type": {"key": "string", "min": 0,
>>>>>>>>>>> +                                            "max": "unlimited"}}},
>>>>>>>>>>>              "isRoot": false},
>>>>>>>>>>>          "NAT": {
>>>>>>>>>>>              "columns": {
>>>>>>>>>>> diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
>>>>>>>>>>> index 9869d7e..15feb97 100644
>>>>>>>>>>> --- a/ovn/ovn-nb.xml
>>>>>>>>>>> +++ b/ovn/ovn-nb.xml
>>>>>>>>>>> @@ -1487,6 +1487,15 @@
>>>>>>>>>>>          address as the one via which the <ref column="nexthop"/> is reachable.
>>>>>>>>>>>        </p>
>>>>>>>>>>>      </column>
>>>>>>>>>>> +    <column name="multipath_port">
>>>>>>>>>>> +      <p>
>>>>>>>>>>> +        The name of the <ref table="Logical_Router_Port"/> via which the packet
>>>>>>>>>>> +        needs to be sent out. When it contains two or more ports, the
>>>>>>>>>>> +        packet has multiple candidate output ports. OVN uses the packet header
>>>>>>>>>>> +        to determine which port the packet is delivered to.
>>>>>>>>>>> +        Currently, OVN uses the destination IP address to pick the port.
>>>>>>>>>>> +      </p>
>>>>>>>>>>> +    </column>
>>>>>>>>>>>    </table>
>>>>>>>>>>>
>>>>>>>>>>>    <table name="NAT" title="NAT rules">
>>>>>>>>>>> --
>>>>>>>>>>> 1.8.3.1
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> dev mailing list
>>>>>>>>>>> [email protected]
>>>>>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
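For anyone following the thread, the two-step dispatch in the quoted patch (table 5 hashes the destination IP into reg0, table 6 matches "reg0 == i" to pick multipath_port[i]) can be sketched in plain C roughly as below. This is only an illustration under assumptions: toy_hash(), select_link() and multipath_priority() are hypothetical names, the hash is a stand-in rather than the one the datapath uses, and only the "hash modulo n_links" step and the priority arithmetic are taken from the patch text.

```c
#include <assert.h>
#include <stdint.h>

/* Toy 32-bit mixer standing in for the datapath's real hash function
 * (an assumption for illustration only). */
static uint32_t
toy_hash(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x45d9f3bU;
    x ^= x >> 16;
    return x;
}

/* Table 5: "modulo_n" link selection.  The return value plays the role of
 * reg0; table 6 then matches "reg0 == i" to pick multipath_port[i]. */
static uint32_t
select_link(uint32_t ip_dst, uint32_t n_links)
{
    assert(n_links > 0);
    return toy_hash(ip_dst) % n_links;
}

/* Flow priority as computed in add_multipath_route(): twice the prefix
 * length, plus 1 for dst-ip routes, plus 10 to rank above regular routes. */
static unsigned int
multipath_priority(unsigned int plen, int is_src_ip)
{
    unsigned int priority = is_src_ip ? plen * 2 : plen * 2 + 1;
    return priority + 10;
}
```

For a /24 dst-ip route this gives 24 * 2 + 1 + 10 = 59. Which link a given address maps to depends entirely on the hash, so the sketch makes no claim about which of 172.16.1.2 and 172.16.1.4 ends up on which path.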
