I think whether the traffic is N/S or E/W is not what we should be considering right now.

The multipath implementation is built on the existing OVN workflows. If you
can use a route to dispatch traffic to a given node or logical port, then
multipath can do so as well; if it cannot, that is a bug in multipath.
Conversely, if a static route cannot dispatch traffic to some node or logical
port, then multipath cannot do it either.
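
As a rough illustration of the dispatch step (a sketch only, not the code
from the patch): the multipath flow hashes the packet's destination IP and
reduces the hash modulo the number of candidate ports, and the result picks
the output port. The names mix32 and pick_link below are hypothetical, and
mix32 is only a stand-in; the OVS multipath action uses its own hash.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the datapath hash.  The real OVS multipath
 * action uses its own hash function; this one only mixes the bits. */
static uint32_t
mix32(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

/* modulo_n-style selection: hash the destination IP, then reduce it
 * modulo the number of candidate output ports.  The returned index is
 * what the next table would match on to choose the output port. */
static uint32_t
pick_link(uint32_t ip_dst, uint32_t n_links)
{
    assert(n_links > 0);
    return mix32(ip_dst) % n_links;
}
```

Because the index depends only on the destination IP, all packets to the
same destination take the same path, while different destinations may be
spread across the candidate ports.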

I am not sure whether my understanding is right: I think that if you deploy a
router only on a specific OVN node, then traffic on the path
A (src) --- router --- B (dst) should go through that router.

Any suggestions and comments are welcome :)


Thanks
Zhenyu Gao

2017-09-21 19:07 GMT+08:00 Miguel Angel Ajo Pelayo <majop...@redhat.com>:

> Maybe I missed something, but when I tried pinning logical routers to
> specific chassis, the E/W traffic was still handled in a distributed way
> (from the original chassis to the destination chassis without going through
> the router chassis); that chassis was only used for N/S. But maybe I got
> something wrong.
>
>
> On Wed, Sep 20, 2017 at 4:48 PM, Gao Zhenyu <sysugaozhe...@gmail.com>
> wrote:
>
>> "But, if an ovn port in foo (chassis A) wants to talk to alice1 (chassis
>> B), wouldn't all that E/W routing happen virtually, with the end result
>> being just a tunneled packet between chassis A and chassis B?"
>> [ Right now the hash function is based on the dst IP. If foo1 only talks to
>> alice1, it is just the tunneled packet between chassis A and chassis B. ]
>>
>> The benefit is that if you have two OVN routers and those routers are ONLY
>> deployed on chassis C and chassis D, the traffic can be separated onto two
>> paths automatically. Otherwise you need to configure static rules one by
>> one to separate the traffic.
>> To make a long story short, you can do the same thing by configuring
>> numerous static rules to separate the traffic, but multipath does it
>> automatically.
>>
>> 2017-09-20 22:08 GMT+08:00 Miguel Angel Ajo Pelayo <majop...@redhat.com>:
>>
>>> I forgot to say thank you very much for the explanation and diagrams.
>>>
>>> On Wed, Sep 20, 2017 at 4:07 PM, Miguel Angel Ajo Pelayo <
>>> majop...@redhat.com> wrote:
>>>
>>>> But, if an ovn port in foo (chassis A) wants to talk to alice1 (chassis
>>>> B), wouldn't all that E/W routing happen virtually, with the end result
>>>> being just a tunneled packet between chassis A and chassis B?
>>>>
>>>> What's the benefit of multipath there if the possible failing link is
>>>> always the connection between chassis A and chassis B ?
>>>>
>>>> I suspect there's something I'm missing in the picture.
>>>>
>>>> On Wed, Sep 20, 2017 at 3:49 PM, Gao Zhenyu <sysugaozhe...@gmail.com>
>>>> wrote:
>>>>
>>>>> You can take a look at this patch, which implements a test case:
>>>>> https://patchwork.ozlabs.org/patch/815475/
>>>>>
>>>>> In the test case, we have R1, R2 and R3.
>>>>>
>>>>>  R1 and R2 are connected to each other via LS "join" in the
>>>>> 20.0.0.0/24 network.
>>>>>  R1 and R3 are connected to each other via LS "join2" in the
>>>>> 20.0.0.0/24 network.
>>>>>  R1 has switch foo (192.168.1.0/24) connected to it. R2 and R3 have
>>>>> alice (172.16.1.0/24) connected to them.
>>>>>  R2 and R3 are gateway routers.
>>>>>
>>>>> A packet sent from foo to alice1/alice2 has multiple paths to its
>>>>> destination:
>>>>>    1. foo-->R1-->join-->R2-->alice.
>>>>>    2. foo-->R1-->join2-->R3-->alice.
>>>>>
>>>>> The test case simulates two packets: one with destination
>>>>> 172.16.1.2, the other with destination 172.16.1.4. The multipath route
>>>>> configured on R1 splits this traffic between R2 and R3. In the end, the
>>>>> 172.16.1.2 packet takes path 2 and the 172.16.1.4 packet takes path 1.
>>>>>
>>>>>       +------+
>>>>>       | foo  |
>>>>>       +------+
>>>>>           |
>>>>>       +------+
>>>>>       |  R1  |
>>>>>       +------+
>>>>>        |    |
>>>>>    +---+    +--+
>>>>>    |           |
>>>>> +------+   +-------+
>>>>> | join |   | join2 |
>>>>> +------+   +-------+
>>>>>    |           |
>>>>> +------+   +-------+
>>>>> |  R2  |   |  R3   |
>>>>> +------+   +-------+
>>>>>    |           |
>>>>> +------------------+
>>>>> |      alice       |
>>>>> +------------------+
>>>>>    |           |
>>>>>  alice1      alice2
>>>>>
>>>>> Please let me know if you have any questions about it. :)
>>>>>
>>>>> Thanks
>>>>> Zhenyu Gao
>>>>>
>>>>> 2017-09-20 20:58 GMT+08:00 Miguel Angel Ajo Pelayo <
>>>>> majop...@redhat.com>:
>>>>>
>>>>>> Can you share an example of how this would benefit E/W routing? I'm
>>>>>> just not seeing the specific use case myself, out of ignorance.
>>>>>>
>>>>>> It'd be great if you could explain how it would work between several
>>>>>> ports in the networks and routers (maybe a diagram?); otherwise I can't
>>>>>> be really helpful reviewing :)
>>>>>>
>>>>>> Cheers, and thanks for the patience.
>>>>>>
>>>>>> On Wed, Sep 20, 2017 at 12:25 PM, Gao Zhenyu <sysugaozhe...@gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>> Thanks for the suggestions!
>>>>>>>
>>>>>>> Not every logical port has a real ofp_port connected to it, and the
>>>>>>> bundle_load/bundle actions need real OVS ports.
>>>>>>> This is especially true of OVN router ports: all router ports are
>>>>>>> virtual ports, which are just a number/reg in our OVS flows.
>>>>>>>
>>>>>>> This implementation of multipath can split OVN east-west traffic; it
>>>>>>> helps dispatch traffic to gateways and routers easily.
>>>>>>>
>>>>>>> For north-south traffic, we can use the bundle/bundle_load actions to
>>>>>>> take the remote tunnel's up/down status into account. I would like to
>>>>>>> do this step by step and implement it in my next patch series.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Zhenyu Gao
>>>>>>>
>>>>>>> 2017-09-20 17:53 GMT+08:00 Miguel Angel Ajo Pelayo <
>>>>>>> majop...@redhat.com>:
>>>>>>>
>>>>>>>> I'm not very familiar with multipath implementations,
>>>>>>>>
>>>>>>>> but would it be possible to use the bundle() output action with the hrw
>>>>>>>> algorithm instead of a multipath calculation into a register?
>>>>>>>>
>>>>>>>> I say this because, if you look at lib/multipath.c and lib/bundle.c, you
>>>>>>>> will find that bundle.c considers the up/down status
>>>>>>>> (slave_enabled check) of the links.
>>>>>>>>
>>>>>>>> That way the controller doesn't need to modify any flow based on
>>>>>>>> link status.
>>>>>>>>
>>>>>>>> On Wed, Sep 20, 2017 at 5:45 AM, Gao Zhenyu <
>>>>>>>> sysugaozhe...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Thanks for the questions.
>>>>>>>>>
>>>>>>>>> The multipath_port can be set via ovn-nbctl, like this:
>>>>>>>>> ovn-nbctl   -- --id=@lrt create Logical_Router_Static_Route
>>>>>>>>> ip_prefix=0.0.0.0/0 nexthop=10.88.77.1 multipath_port=[mp1,mp2]
>>>>>>>>> -- add Logical_Router edge1 static_routes @lrt
>>>>>>>>> This patch does not yet implement a dedicated ovn-nbctl command to
>>>>>>>>> configure multipath routing, because I am still considering reusing
>>>>>>>>> nexthop or output_port (turning them into arrays) and want to collect
>>>>>>>>> suggestions on that.
>>>>>>>>>
>>>>>>>>> As for the status of the next hop, I would like to introduce
>>>>>>>>> bundle_load and BFD to handle that later.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Zhenyu Gao
>>>>>>>>>
>>>>>>>>> 2017-09-20 11:13 GMT+08:00 <wang.qia...@zte.com.cn>:
>>>>>>>>>
>>>>>>>>>> How do you configure multipath_port in static_route? I think the
>>>>>>>>>> multipath can be figured out from the existing static_route data, so
>>>>>>>>>> this multipath_port column may not be needed.
>>>>>>>>>>
>>>>>>>>>> I also think we should add a status column to indicate the nexthop
>>>>>>>>>> state. When one of the nexthops in a multipath is down, OVN should
>>>>>>>>>> change the corresponding flows.
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Zhenyu Gao <sysugaozhe...@gmail.com>
>>>>>>>>>> Sent by: ovs-dev-boun...@openvswitch.org
>>>>>>>>>> 2017/09/19 19:37
>>>>>>>>>>
>>>>>>>>>>         To:      b...@ovn.org, majop...@redhat.com,
>>>>>>>>>> anilvenk...@redhat.com, russ...@ovn.org, d...@openvswitch.org,
>>>>>>>>>>         Cc:
>>>>>>>>>>         Subject: [ovs-dev] [PATCH v1 1/3] Add multipath static router
>>>>>>>>>> in OVN northd and north-db
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 1. ovn-nb.ovsschema was updated to add the new field multipath_port.
>>>>>>>>>> 2. Add the multipath feature to ovn-northd. northd generates multipath
>>>>>>>>>> flows that dispatch traffic based on the packet's IP dst address if the
>>>>>>>>>> user sets Logical_Router_Static_Route's multipath_port with ports.
>>>>>>>>>> 3. Add a new table (lr_in_multipath) to ovn-northd's router ingress
>>>>>>>>>> stages to dispatch traffic to ports.
>>>>>>>>>> 4. Add a multipath flow in Table 5 (lr_in_ip_routing) that stores the
>>>>>>>>>> hash result in reg0. reg9[2] is used to mark packets that need
>>>>>>>>>> dispatching.
>>>>>>>>>> 5. Add a description of the multipath feature to
>>>>>>>>>> ovn/northd/ovn-northd.8.xml and ovn/ovn-nb.xml.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Zhenyu Gao <sysugaozhe...@gmail.com>
>>>>>>>>>> ---
>>>>>>>>>>  ovn/northd/ovn-northd.8.xml |  67 +++++++++++-
>>>>>>>>>>  ovn/northd/ovn-northd.c     | 245
>>>>>>>>>> ++++++++++++++++++++++++++++++++++++++------
>>>>>>>>>>  ovn/ovn-nb.ovsschema        |   6 +-
>>>>>>>>>>  ovn/ovn-nb.xml              |   9 ++
>>>>>>>>>>  4 files changed, 289 insertions(+), 38 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/ovn/northd/ovn-northd.8.xml
>>>>>>>>>> b/ovn/northd/ovn-northd.8.xml
>>>>>>>>>> index 0d85ec0..b1ce9a9 100644
>>>>>>>>>> --- a/ovn/northd/ovn-northd.8.xml
>>>>>>>>>> +++ b/ovn/northd/ovn-northd.8.xml
>>>>>>>>>> @@ -1598,6 +1598,9 @@ icmp4 {
>>>>>>>>>>        port (ingress table <code>ARP Request</code> will generate an ARP
>>>>>>>>>>        request, if needed, with <code>reg0</code> as the target protocol
>>>>>>>>>>        address and <code>reg1</code> as the source protocol address).
>>>>>>>>>> +      An IP route can be configured with multiple paths to the next hop.
>>>>>>>>>> +      If a packet has multiple paths to its destination, OVN stores the
>>>>>>>>>> +      port index in reg0 to indicate the packet's output port in table 6.
>>>>>>>>>>      </p>
>>>>>>>>>>
>>>>>>>>>>      <p>
>>>>>>>>>> @@ -1617,6 +1620,28 @@ icmp4 {
>>>>>>>>>>
>>>>>>>>>>        <li>
>>>>>>>>>>          <p>
>>>>>>>>>> +          IPv4/IPv6 multipath routing table. For each route to IPv4/IPv6
>>>>>>>>>> +          network <var>N</var> with netmask <var>M</var>, on multipath port
>>>>>>>>>> +          <var>P</var> with IP address <var>A</var> and Ethernet
>>>>>>>>>> +          address <var>E</var>, a logical flow with match
>>>>>>>>>> +          <code>ip4.dst == <var>N</var>/<var>M</var></code>, whose priority
>>>>>>>>>> +          is the number of 1-bits in <var>M</var> plus 10,
>>>>>>>>>> +          has the following actions:
>>>>>>>>>> +        </p>
>>>>>>>>>> +
>>>>>>>>>> +        <pre>
>>>>>>>>>> +ip.ttl--;
>>>>>>>>>> +multipath (nw_dst, 0, modulo_n, <var>n_links</var>, 0, reg0);
>>>>>>>>>> +reg9[2] = 1;
>>>>>>>>>> +next;
>>>>>>>>>> +        </pre>
>>>>>>>>>> +        <p>
>>>>>>>>>> +          <var>n_links</var> is the number of multipath ports.
>>>>>>>>>> +        </p>
>>>>>>>>>> +      </li>
>>>>>>>>>> +
>>>>>>>>>> +      <li>
>>>>>>>>>> +        <p>
>>>>>>>>>>            IPv4 routing table.  For each route to IPv4 network
>>>>>>>>>> <var>N</var> with
>>>>>>>>>>            netmask <var>M</var>, on router port <var>P</var> with
>>>>>>>>>> IP
>>>>>>>>>> address
>>>>>>>>>>            <var>A</var> and Ethernet
>>>>>>>>>> @@ -1686,7 +1711,43 @@ next;
>>>>>>>>>>        </li>
>>>>>>>>>>      </ul>
>>>>>>>>>>
>>>>>>>>>> -    <h3>Ingress Table 6: ARP/ND Resolution</h3>
>>>>>>>>>> +    <h3>Ingress Table 6: Multipath</h3>
>>>>>>>>>> +    <p>
>>>>>>>>>> +      Any packet that reaches this table is an IP packet. Packets with
>>>>>>>>>> +      reg9[2]=1 are routed to the corresponding port by the following
>>>>>>>>>> +      flows. This table implements dispatching by consuming reg0.
>>>>>>>>>> +    </p>
>>>>>>>>>> +
>>>>>>>>>> +    <ul>
>>>>>>>>>> +      <li>
>>>>>>>>>> +        <p>
>>>>>>>>>> +          A packet with netmask <var>M</var>, IP address <var>A</var> and
>>>>>>>>>> +          <code>reg9[2] = 1</code>, whose priority is above 1, has the
>>>>>>>>>> +          following actions:
>>>>>>>>>> +        </p>
>>>>>>>>>> +
>>>>>>>>>> +        <pre>
>>>>>>>>>> +reg0 = <var>G</var>;
>>>>>>>>>> +reg1 = <var>A</var>;
>>>>>>>>>> +eth.src = <var>E</var>;
>>>>>>>>>> +outport = <var>P</var>;
>>>>>>>>>> +flags.loopback = 1;
>>>>>>>>>> +next;
>>>>>>>>>> +        </pre>
>>>>>>>>>> +
>>>>>>>>>> +        <p>
>>>>>>>>>> +          <var>G</var> is the gateway IP address. <var>A</var>, <var>E</var>
>>>>>>>>>> +          and <var>P</var> are the values that were described in multipath
>>>>>>>>>> +          routing in table 5.
>>>>>>>>>> +        </p>
>>>>>>>>>> +
>>>>>>>>>> +        <p>
>>>>>>>>>> +          A priority-0 logical flow with match <code>1</code> has actions
>>>>>>>>>> +          <code>next;</code>.
>>>>>>>>>> +        </p>
>>>>>>>>>> +      </li>
>>>>>>>>>> +    </ul>
>>>>>>>>>> +
>>>>>>>>>> +    <h3>Ingress Table 7: ARP/ND Resolution</h3>
>>>>>>>>>>
>>>>>>>>>>      <p>
>>>>>>>>>>        Any packet that reaches this table is an IP packet whose
>>>>>>>>>> next-hop
>>>>>>>>>> @@ -1779,7 +1840,7 @@ next;
>>>>>>>>>>        </li>
>>>>>>>>>>      </ul>
>>>>>>>>>>
>>>>>>>>>> -    <h3>Ingress Table 7: Gateway Redirect</h3>
>>>>>>>>>> +    <h3>Ingress Table 8: Gateway Redirect</h3>
>>>>>>>>>>
>>>>>>>>>>      <p>
>>>>>>>>>>        For distributed logical routers where one of the logical
>>>>>>>>>> router
>>>>>>>>>> @@ -1836,7 +1897,7 @@ next;
>>>>>>>>>>        </li>
>>>>>>>>>>      </ul>
>>>>>>>>>>
>>>>>>>>>> -    <h3>Ingress Table 8: ARP Request</h3>
>>>>>>>>>> +    <h3>Ingress Table 9: ARP Request</h3>
>>>>>>>>>>
>>>>>>>>>>      <p>
>>>>>>>>>>        In the common case where the Ethernet destination has been
>>>>>>>>>> resolved, this
>>>>>>>>>> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>>>>>>>>>> index 49e4ac3..44d1fd4 100644
>>>>>>>>>> --- a/ovn/northd/ovn-northd.c
>>>>>>>>>> +++ b/ovn/northd/ovn-northd.c
>>>>>>>>>> @@ -135,9 +135,10 @@ enum ovn_stage {
>>>>>>>>>>      PIPELINE_STAGE(ROUTER, IN,  UNSNAT,      3, "lr_in_unsnat")
>>>>>>>>>>      \
>>>>>>>>>>      PIPELINE_STAGE(ROUTER, IN,  DNAT,        4, "lr_in_dnat")
>>>>>>>>>>      \
>>>>>>>>>>      PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  5,
>>>>>>>>>> "lr_in_ip_routing")   \
>>>>>>>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 6,
>>>>>>>>>> "lr_in_arp_resolve")  \
>>>>>>>>>> -    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 7,
>>>>>>>>>> "lr_in_gw_redirect")  \
>>>>>>>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 8,
>>>>>>>>>> "lr_in_arp_request")  \
>>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  MULTIPATH,   6,
>>>>>>>>>> "lr_in_multipath")    \
>>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 7,
>>>>>>>>>> "lr_in_arp_resolve")  \
>>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 8,
>>>>>>>>>> "lr_in_gw_redirect")  \
>>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 9,
>>>>>>>>>> "lr_in_arp_request")  \
>>>>>>>>>>
>>>>>>>>>>      \
>>>>>>>>>>      /* Logical router egress stages. */
>>>>>>>>>>      \
>>>>>>>>>>      PIPELINE_STAGE(ROUTER, OUT, UNDNAT,    0, "lr_out_undnat")
>>>>>>>>>>       \
>>>>>>>>>> @@ -173,6 +174,11 @@ enum ovn_stage {
>>>>>>>>>>   * one of the logical router's own IP addresses. */
>>>>>>>>>>  #define REGBIT_EGRESS_LOOPBACK  "reg9[1]"
>>>>>>>>>>
>>>>>>>>>> +/* Indicates that the multipath action has processed this packet and
>>>>>>>>>> + * stored the hash result in another regX. The hash result should be
>>>>>>>>>> + * consumed to determine the right output port. */
>>>>>>>>>> +#define REGBIT_MULTIPATH "reg9[2]"
>>>>>>>>>> +
>>>>>>>>>>  /* Returns an "enum ovn_stage" built from the arguments. */
>>>>>>>>>>  static enum ovn_stage
>>>>>>>>>>  ovn_stage_build(enum ovn_datapath_type dp_type, enum ovn_pipeline
>>>>>>>>>> pipeline,
>>>>>>>>>> @@ -4142,72 +4148,165 @@ add_route(struct hmap *lflows, const
>>>>>>>>>> struct
>>>>>>>>>> ovn_port *op,
>>>>>>>>>>  }
>>>>>>>>>>
>>>>>>>>>>  static void
>>>>>>>>>> -build_static_route_flow(struct hmap *lflows, struct
>>>>>>>>>> ovn_datapath *od,
>>>>>>>>>> -                        struct hmap *ports,
>>>>>>>>>> -                        const struct
>>>>>>>>>> nbrec_logical_router_static_route
>>>>>>>>>> *route)
>>>>>>>>>> +add_multipath_route(struct hmap *lflows, uint32_t port_num,
>>>>>>>>>> +                    struct ovn_port **out_ports,
>>>>>>>>>> +                    const char **lrp_addr_s,
>>>>>>>>>> +                    struct ovn_datapath *od,
>>>>>>>>>> +                    const char *network_s, int plen,
>>>>>>>>>> +                    const char *gateway, const char *policy)
>>>>>>>>>> +{
>>>>>>>>>> +    bool is_ipv4 = strchr(network_s, '.') ? true : false;
>>>>>>>>>> +    struct ds match = DS_EMPTY_INITIALIZER;
>>>>>>>>>> +    const char *dir;
>>>>>>>>>> +    uint16_t priority;
>>>>>>>>>> +
>>>>>>>>>> +    if (policy && !strcmp(policy, "src-ip")) {
>>>>>>>>>> +        dir = "src";
>>>>>>>>>> +        priority = plen * 2;
>>>>>>>>>> +    } else {
>>>>>>>>>> +        dir = "dst";
>>>>>>>>>> +        priority = (plen * 2) + 1;
>>>>>>>>>> +    }
>>>>>>>>>> +
>>>>>>>>>> +    /* Set higher priority than regular route. */
>>>>>>>>>> +    priority += 10;
>>>>>>>>>> +
>>>>>>>>>> +    ds_put_format(&match, "ip%s.%s == %s/%d", is_ipv4 ? "4" :
>>>>>>>>>> "6", dir,
>>>>>>>>>> +                  network_s, plen);
>>>>>>>>>> +
>>>>>>>>>> +    struct ds actions = DS_EMPTY_INITIALIZER;
>>>>>>>>>> +
>>>>>>>>>> +    ds_put_format(&actions, "ip.ttl--; ");
>>>>>>>>>> +    ds_put_format(&actions,
>>>>>>>>>> +                  "multipath (nw_dst, 0, modulo_n, %u, 0, reg0);
>>>>>>>>>> "
>>>>>>>>>> +                  "%s = 1; "
>>>>>>>>>> +                  "next;",
>>>>>>>>>> +                  port_num, REGBIT_MULTIPATH);
>>>>>>>>>> +
>>>>>>>>>> +    /* The priority here is calculated to implement
>>>>>>>>>> longest-prefix-match
>>>>>>>>>> +     * routing. */
>>>>>>>>>> +    ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_ROUTING, priority,
>>>>>>>>>> +                  ds_cstr(&match), ds_cstr(&actions));
>>>>>>>>>> +
>>>>>>>>>> +    for (int i = 0; i < port_num; i++) {
>>>>>>>>>> +        struct ds mp_match = DS_EMPTY_INITIALIZER;
>>>>>>>>>> +        struct ds mp_actions = DS_EMPTY_INITIALIZER;
>>>>>>>>>> +
>>>>>>>>>> +        ds_put_format(&mp_match, "%s == 1 && reg0 == %d && ",
>>>>>>>>>> +                      REGBIT_MULTIPATH, i);
>>>>>>>>>> +        ds_put_format(&mp_match, "ip%s.%s == %s/%d",
>>>>>>>>>> +                      is_ipv4 ? "4" : "6", dir,
>>>>>>>>>> +                      network_s, plen);
>>>>>>>>>> +
>>>>>>>>>> +        ds_put_format(&mp_actions, "%sreg0 = ", is_ipv4 ? "" :
>>>>>>>>>> "xx");
>>>>>>>>>> +        if (gateway) {
>>>>>>>>>> +            ds_put_cstr(&mp_actions, gateway);
>>>>>>>>>> +        } else {
>>>>>>>>>> +            ds_put_format(&mp_actions, "ip%s.dst", is_ipv4 ? "4"
>>>>>>>>>> : "6");
>>>>>>>>>> +        }
>>>>>>>>>> +
>>>>>>>>>> +        ds_put_format(&mp_actions, "; "
>>>>>>>>>> +                      "%sreg1 = %s; "
>>>>>>>>>> +                      "eth.src = %s; "
>>>>>>>>>> +                      "outport = %s; "
>>>>>>>>>> +                      "flags.loopback = 1; "
>>>>>>>>>> +                      "next;",
>>>>>>>>>> +                      is_ipv4 ? "" : "xx",
>>>>>>>>>> +                      lrp_addr_s[i],
>>>>>>>>>> +                      out_ports[i]->lrp_networks.ea_s,
>>>>>>>>>> +                      out_ports[i]->json_key);
>>>>>>>>>> +
>>>>>>>>>> +        /* Add flow in table 6 to determine the right output port
>>>>>>>>>> +         * for this traffic. */
>>>>>>>>>> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_MULTIPATH,
>>>>>>>>>> priority,
>>>>>>>>>> +                      ds_cstr(&mp_match), ds_cstr(&mp_actions));
>>>>>>>>>> +        ds_destroy(&mp_match);
>>>>>>>>>> +        ds_destroy(&mp_actions);
>>>>>>>>>> +    }
>>>>>>>>>> +    ds_destroy(&match);
>>>>>>>>>> +    ds_destroy(&actions);
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +static bool
>>>>>>>>>> +verify_nexthop_prefix(const struct nbrec_logical_router_static_route *route,
>>>>>>>>>> +                      bool *is_ipv4, char **prefix_s, unsigned int *plen)
>>>>>>>>>>  {
>>>>>>>>>>      ovs_be32 nexthop;
>>>>>>>>>> -    const char *lrp_addr_s = NULL;
>>>>>>>>>> -    unsigned int plen;
>>>>>>>>>> -    bool is_ipv4;
>>>>>>>>>>
>>>>>>>>>>      /* Verify that the next hop is an IP address with an
>>>>>>>>>> all-ones mask.
>>>>>>>>>> */
>>>>>>>>>> -    char *error = ip_parse_cidr(route->nexthop, &nexthop, &plen);
>>>>>>>>>> +    char *error = ip_parse_cidr(route->nexthop, &nexthop, plen);
>>>>>>>>>>      if (!error) {
>>>>>>>>>> -        if (plen != 32) {
>>>>>>>>>> +        if (*plen != 32) {
>>>>>>>>>>              static struct vlog_rate_limit rl =
>>>>>>>>>> VLOG_RATE_LIMIT_INIT(5,
>>>>>>>>>> 1);
>>>>>>>>>>              VLOG_WARN_RL(&rl, "bad next hop mask %s",
>>>>>>>>>> route->nexthop);
>>>>>>>>>> -            return;
>>>>>>>>>> +            return false;
>>>>>>>>>>          }
>>>>>>>>>> -        is_ipv4 = true;
>>>>>>>>>> +        *is_ipv4 = true;
>>>>>>>>>>      } else {
>>>>>>>>>>          free(error);
>>>>>>>>>>
>>>>>>>>>>          struct in6_addr ip6;
>>>>>>>>>> -        error = ipv6_parse_cidr(route->nexthop, &ip6, &plen);
>>>>>>>>>> +        error = ipv6_parse_cidr(route->nexthop, &ip6, plen);
>>>>>>>>>>          if (!error) {
>>>>>>>>>> -            if (plen != 128) {
>>>>>>>>>> +            if (*plen != 128) {
>>>>>>>>>>                  static struct vlog_rate_limit rl =
>>>>>>>>>> VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>>>                  VLOG_WARN_RL(&rl, "bad next hop mask %s",
>>>>>>>>>> route->nexthop);
>>>>>>>>>> -                return;
>>>>>>>>>> +                return false;
>>>>>>>>>>              }
>>>>>>>>>> -            is_ipv4 = false;
>>>>>>>>>> +            *is_ipv4 = false;
>>>>>>>>>>          } else {
>>>>>>>>>>              static struct vlog_rate_limit rl =
>>>>>>>>>> VLOG_RATE_LIMIT_INIT(5,
>>>>>>>>>> 1);
>>>>>>>>>>              VLOG_WARN_RL(&rl, "bad next hop ip address %s",
>>>>>>>>>> route->nexthop);
>>>>>>>>>>              free(error);
>>>>>>>>>> -            return;
>>>>>>>>>> +            return false;
>>>>>>>>>>          }
>>>>>>>>>>      }
>>>>>>>>>>
>>>>>>>>>> -    char *prefix_s;
>>>>>>>>>> -    if (is_ipv4) {
>>>>>>>>>> +    if (*is_ipv4) {
>>>>>>>>>>          ovs_be32 prefix;
>>>>>>>>>>          /* Verify that ip prefix is a valid IPv4 address. */
>>>>>>>>>> -        error = ip_parse_cidr(route->ip_prefix, &prefix, &plen);
>>>>>>>>>> +        error = ip_parse_cidr(route->ip_prefix, &prefix, plen);
>>>>>>>>>>          if (error) {
>>>>>>>>>>              static struct vlog_rate_limit rl =
>>>>>>>>>> VLOG_RATE_LIMIT_INIT(5,
>>>>>>>>>> 1);
>>>>>>>>>>              VLOG_WARN_RL(&rl, "bad 'ip_prefix' in static routes
>>>>>>>>>> %s",
>>>>>>>>>>                           route->ip_prefix);
>>>>>>>>>>              free(error);
>>>>>>>>>> -            return;
>>>>>>>>>> +            return false;
>>>>>>>>>>          }
>>>>>>>>>> -        prefix_s = xasprintf(IP_FMT, IP_ARGS(prefix &
>>>>>>>>>> be32_prefix_mask(plen)));
>>>>>>>>>> +        *prefix_s = xasprintf(IP_FMT, IP_ARGS(prefix
>>>>>>>>>> +                                              &
>>>>>>>>>> be32_prefix_mask(*plen)));
>>>>>>>>>>      } else {
>>>>>>>>>>          /* Verify that ip prefix is a valid IPv6 address. */
>>>>>>>>>>          struct in6_addr prefix;
>>>>>>>>>> -        error = ipv6_parse_cidr(route->ip_prefix, &prefix,
>>>>>>>>>> &plen);
>>>>>>>>>> +        error = ipv6_parse_cidr(route->ip_prefix, &prefix,
>>>>>>>>>> plen);
>>>>>>>>>>          if (error) {
>>>>>>>>>>              static struct vlog_rate_limit rl =
>>>>>>>>>> VLOG_RATE_LIMIT_INIT(5,
>>>>>>>>>> 1);
>>>>>>>>>>              VLOG_WARN_RL(&rl, "bad 'ip_prefix' in static routes
>>>>>>>>>> %s",
>>>>>>>>>>                           route->ip_prefix);
>>>>>>>>>>              free(error);
>>>>>>>>>> -            return;
>>>>>>>>>> +            return false;
>>>>>>>>>>          }
>>>>>>>>>> -        struct in6_addr mask = ipv6_create_mask(plen);
>>>>>>>>>> +        struct in6_addr mask = ipv6_create_mask(*plen);
>>>>>>>>>>          struct in6_addr network = ipv6_addr_bitand(&prefix,
>>>>>>>>>> &mask);
>>>>>>>>>> -        prefix_s = xmalloc(INET6_ADDRSTRLEN);
>>>>>>>>>> -        inet_ntop(AF_INET6, &network, prefix_s,
>>>>>>>>>> INET6_ADDRSTRLEN);
>>>>>>>>>> +        *prefix_s = xmalloc(INET6_ADDRSTRLEN);
>>>>>>>>>> +        inet_ntop(AF_INET6, &network, *prefix_s,
>>>>>>>>>> INET6_ADDRSTRLEN);
>>>>>>>>>> +    }
>>>>>>>>>> +
>>>>>>>>>> +    return true;
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +static void
>>>>>>>>>> +build_static_route_flow(struct hmap *lflows, struct
>>>>>>>>>> ovn_datapath *od,
>>>>>>>>>> +                        struct hmap *ports,
>>>>>>>>>> +                        const struct
>>>>>>>>>> nbrec_logical_router_static_route
>>>>>>>>>> *route)
>>>>>>>>>> +{
>>>>>>>>>> +    const char *lrp_addr_s = NULL;
>>>>>>>>>> +    unsigned int plen;
>>>>>>>>>> +    bool is_ipv4;
>>>>>>>>>> +    char *prefix_s = NULL;
>>>>>>>>>> +
>>>>>>>>>> +    if (!verify_nexthop_prefix(route, &is_ipv4, &prefix_s,
>>>>>>>>>> &plen)) {
>>>>>>>>>> +        return;
>>>>>>>>>>      }
>>>>>>>>>>
>>>>>>>>>>      /* Find the outgoing port. */
>>>>>>>>>> @@ -4270,7 +4369,75 @@ build_static_route_flow(struct hmap
>>>>>>>>>> *lflows, struct
>>>>>>>>>> ovn_datapath *od,
>>>>>>>>>>                policy);
>>>>>>>>>>
>>>>>>>>>>  free_prefix_s:
>>>>>>>>>> -    free(prefix_s);
>>>>>>>>>> +    if (prefix_s) {
>>>>>>>>>> +        free(prefix_s);
>>>>>>>>>> +    }
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +static void
>>>>>>>>>> +build_multipath_flow(struct hmap *lflows, struct ovn_datapath *od,
>>>>>>>>>> +                     struct hmap *ports,
>>>>>>>>>> +                     const struct nbrec_logical_router_static_route *route)
>>>>>>>>>> +{
>>>>>>>>>> +    unsigned int plen;
>>>>>>>>>> +    bool is_ipv4;
>>>>>>>>>> +    char *prefix_s = NULL;
>>>>>>>>>> +
>>>>>>>>>> +    if (!verify_nexthop_prefix(route, &is_ipv4, &prefix_s,
>>>>>>>>>> &plen)) {
>>>>>>>>>> +        return;
>>>>>>>>>> +    }
>>>>>>>>>> +
>>>>>>>>>> +    /* Find the outgoing port. */
>>>>>>>>>> +    struct ovn_port **out_ports = xmalloc(route->n_multipath_port
>>>>>>>>>> *
>>>>>>>>>> +                                             sizeof(struct
>>>>>>>>>> ovn_port *));
>>>>>>>>>> +    const char **lrp_addr_s = xmalloc(route->n_multipath_port *
>>>>>>>>>> +                                         sizeof(const char *));
>>>>>>>>>> +    for (int i = 0; i < route->n_multipath_port; i++) {
>>>>>>>>>> +        // TODO May need to consider some ports are not found?
>>>>>>>>>> +        out_ports[i] = ovn_port_find(ports,
>>>>>>>>>> route->multipath_port[i]);
>>>>>>>>>> +        if (!out_ports[i]) {
>>>>>>>>>> +            static struct vlog_rate_limit rl =
>>>>>>>>>> VLOG_RATE_LIMIT_INIT(5,
>>>>>>>>>> 1);
>>>>>>>>>> +            VLOG_WARN_RL(&rl, "Bad out port %s for static route
>>>>>>>>>> %s",
>>>>>>>>>> +                         route->multipath_port[i],
>>>>>>>>>> route->ip_prefix);
>>>>>>>>>> +            goto free_ports_lrp_addr;
>>>>>>>>>> +        }
>>>>>>>>>> +
>>>>>>>>>> +        lrp_addr_s[i] = find_lrp_member_ip(out_ports[i],
>>>>>>>>>> route->nexthop);
>>>>>>>>>> +        if (!lrp_addr_s[i]) {
>>>>>>>>>> +            if (is_ipv4) {
>>>>>>>>>> +                if (out_ports[i]->lrp_networks.n_ipv4_addrs) {
>>>>>>>>>> +                    lrp_addr_s[i] = out_ports[i]->
>>>>>>>>>> +                        lrp_networks.ipv4_addrs[0].addr_s;
>>>>>>>>>> +                }
>>>>>>>>>> +            } else {
>>>>>>>>>> +                if (out_ports[i]->lrp_networks.n_ipv6_addrs) {
>>>>>>>>>> +                    lrp_addr_s[i] = out_ports[i]->
>>>>>>>>>> +                        lrp_networks.ipv6_addrs[0].addr_s;
>>>>>>>>>> +                }
>>>>>>>>>> +            }
>>>>>>>>>> +        }
>>>>>>>>>> +        if (!lrp_addr_s[i]) {
>>>>>>>>>> +            static struct vlog_rate_limit rl =
>>>>>>>>>> VLOG_RATE_LIMIT_INIT(5,
>>>>>>>>>> 1);
>>>>>>>>>> +            VLOG_WARN_RL(&rl,
>>>>>>>>>> +                         "%s has no path for static route %s;
>>>>>>>>>> next hop
>>>>>>>>>> %s",
>>>>>>>>>> +                         route->multipath_port[i],
>>>>>>>>>> route->ip_prefix,
>>>>>>>>>> +                         route->nexthop);
>>>>>>>>>> +            goto free_ports_lrp_addr;
>>>>>>>>>> +        }
>>>>>>>>>> +    }
>>>>>>>>>> +
>>>>>>>>>> +
>>>>>>>>>> +    char *policy = route->policy ? route->policy : "dst-ip";
>>>>>>>>>> +    add_multipath_route(lflows, route->n_multipath_port,
>>>>>>>>>> +                        out_ports, lrp_addr_s, od,
>>>>>>>>>> +                        prefix_s, plen, route->nexthop, policy);
>>>>>>>>>> +
>>>>>>>>>> +free_ports_lrp_addr:
>>>>>>>>>> +    free(out_ports);
>>>>>>>>>> +    free(lrp_addr_s);
>>>>>>>>>> +    if (prefix_s) {
>>>>>>>>>> +        free(prefix_s);
>>>>>>>>>> +    }
>>>>>>>>>>  }
>>>>>>>>>>
>>>>>>>>>>  static void
>>>>>>>>>> @@ -5344,7 +5511,7 @@ build_lrouter_flows(struct hmap *datapaths,
>>>>>>>>>> struct
>>>>>>>>>> hmap *ports,
>>>>>>>>>>          }
>>>>>>>>>>      }
>>>>>>>>>>
>>>>>>>>>> -    /* Convert the static routes to flows. */
>>>>>>>>>> +    /* Convert the static routes and multipath route to flows. */
>>>>>>>>>>      HMAP_FOR_EACH (od, key_node, datapaths) {
>>>>>>>>>>          if (!od->nbr) {
>>>>>>>>>>              continue;
>>>>>>>>>> @@ -5355,12 +5522,24 @@ build_lrouter_flows(struct hmap
>>>>>>>>>> *datapaths, struct
>>>>>>>>>> hmap *ports,
>>>>>>>>>>
>>>>>>>>>>              route = od->nbr->static_routes[i];
>>>>>>>>>>              build_static_route_flow(lflows, od, ports, route);
>>>>>>>>>> +            /* Logical router ingress table 5-6: Multipath Routing.
>>>>>>>>>> +             *
>>>>>>>>>> +             * If the router is configured so that traffic has multiple
>>>>>>>>>> +             * paths to its destination, the right output port should be
>>>>>>>>>> +             * figured out by hashing the IP packet's header. */
>>>>>>>>>> +            if (route->n_multipath_port > 1) {
>>>>>>>>>> +                /* Generate multipath routes in table 5,6 for
>>>>>>>>>> +                 * dedicated traffic */
>>>>>>>>>> +                build_multipath_flow(lflows, od, ports, route);
>>>>>>>>>> +            }
>>>>>>>>>>          }
>>>>>>>>>> +        /* Packets are allowed by default in table 6. */
>>>>>>>>>> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_MULTIPATH, 0, "1",
>>>>>>>>>> "next;");
>>>>>>>>>>      }
>>>>>>>>>>
>>>>>>>>>>      /* XXX destination unreachable */
>>>>>>>>>>
>>>>>>>>>> -    /* Local router ingress table 6: ARP Resolution.
>>>>>>>>>> +    /* Local router ingress table 7: ARP Resolution.
>>>>>>>>>>       *
>>>>>>>>>>       * Any packet that reaches this table is an IP packet whose
>>>>>>>>>> next-hop
>>>>>>>>>> IP
>>>>>>>>>>       * address is in reg0. (ip4.dst is the final destination.)
>>>>>>>>>> This table
>>>>>>>>>> @@ -5555,7 +5734,7 @@ build_lrouter_flows(struct hmap *datapaths,
>>>>>>>>>> struct
>>>>>>>>>> hmap *ports,
>>>>>>>>>>                        "get_nd(outport, xxreg0); next;");
>>>>>>>>>>      }
>>>>>>>>>>
>>>>>>>>>> -    /* Logical router ingress table 7: Gateway redirect.
>>>>>>>>>> +    /* Logical router ingress table 8: Gateway redirect.
>>>>>>>>>>       *
>>>>>>>>>>       * For traffic with outport equal to the l3dgw_port
>>>>>>>>>>       * on a distributed router, this table redirects a subset
>>>>>>>>>> @@ -5595,7 +5774,7 @@ build_lrouter_flows(struct hmap *datapaths,
>>>>>>>>>> struct
>>>>>>>>>> hmap *ports,
>>>>>>>>>>          ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 0,
>>>>>>>>>> "1",
>>>>>>>>>> "next;");
>>>>>>>>>>      }
>>>>>>>>>>
>>>>>>>>>> -    /* Local router ingress table 8: ARP request.
>>>>>>>>>> +    /* Local router ingress table 9: ARP request.
>>>>>>>>>>       *
>>>>>>>>>>       * In the common case where the Ethernet destination has been
>>>>>>>>>> resolved,
>>>>>>>>>>       * this table outputs the packet (priority 0).  Otherwise, it
>>>>>>>>>> composes
>>>>>>>>>> diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
>>>>>>>>>> index a077bfb..b8bdd42 100644
>>>>>>>>>> --- a/ovn/ovn-nb.ovsschema
>>>>>>>>>> +++ b/ovn/ovn-nb.ovsschema
>>>>>>>>>> @@ -1,7 +1,7 @@
>>>>>>>>>>  {
>>>>>>>>>>      "name": "OVN_Northbound",
>>>>>>>>>>      "version": "5.8.0",
>>>>>>>>>> -    "cksum": "2812300190 16766",
>>>>>>>>>> +    "cksum": "1967092589 16903",
>>>>>>>>>>      "tables": {
>>>>>>>>>>          "NB_Global": {
>>>>>>>>>>              "columns": {
>>>>>>>>>> @@ -235,7 +235,9 @@
>>>>>>>>>>
>>>>>>>>>> "dst-ip"]]},
>>>>>>>>>>                                      "min": 0, "max": 1}},
>>>>>>>>>>                  "nexthop": {"type": "string"},
>>>>>>>>>> -                "output_port": {"type": {"key": "string", "min":
>>>>>>>>>> 0,
>>>>>>>>>> "max": 1}}},
>>>>>>>>>> +                "output_port": {"type": {"key": "string", "min":
>>>>>>>>>> 0,
>>>>>>>>>> "max": 1}},
>>>>>>>>>> +                "multipath_port": {"type": {"key": "string",
>>>>>>>>>> "min": 0,
>>>>>>>>>> +                                            "max":
>>>>>>>>>> "unlimited"}}},
>>>>>>>>>>              "isRoot": false},
>>>>>>>>>>          "NAT": {
>>>>>>>>>>              "columns": {
>>>>>>>>>> diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
>>>>>>>>>> index 9869d7e..15feb97 100644
>>>>>>>>>> --- a/ovn/ovn-nb.xml
>>>>>>>>>> +++ b/ovn/ovn-nb.xml
>>>>>>>>>> @@ -1487,6 +1487,15 @@
>>>>>>>>>>          address as the one via which the <ref column="nexthop"/>
>>>>>>>>>> is
>>>>>>>>>> reachable.
>>>>>>>>>>        </p>
>>>>>>>>>>      </column>
>>>>>>>>>> +    <column name="multipath_port">
>>>>>>>>>> +      <p>
>>>>>>>>>> +        The name of the <ref table="Logical_Router_Port"/> via which the
>>>>>>>>>> +        packet needs to be sent out. When it contains more than one port,
>>>>>>>>>> +        the packet has multiple candidate output ports. OVN uses the packet
>>>>>>>>>> +        header to determine which port the packet is delivered to.
>>>>>>>>>> +        Currently, OVN uses the destination IP address to pick the port.
>>>>>>>>>> +      </p>
>>>>>>>>>> +    </column>
>>>>>>>>>>    </table>
>>>>>>>>>>
>>>>>>>>>>    <table name="NAT" title="NAT rules">
>>>>>>>>>> --
>>>>>>>>>> 1.8.3.1
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> dev mailing list
>>>>>>>>>> d...@openvswitch.org
>>>>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>