Re: [ovs-discuss] [OVN] logical flow explosion in lr_in_ip_input table for dnat_and_snat IPs
Sorry Girish, I can't promise for now. I will see if I have time in the next
couple of weeks, but anyone is welcome to volunteer on this if it is urgent.

On Mon, Jun 15, 2020 at 10:56 AM Girish Moodalbail wrote:
> Hello Han,
>
> On Wed, Jun 3, 2020 at 9:39 PM Han Zhou wrote:
>> On Wed, Jun 3, 2020 at 7:16 PM Girish Moodalbail wrote:
>>> Hello all,
>>>
>>> While working on an extension (see the diagram below) to the existing
>>> OVN logical topology for the ovn-kubernetes project, I am seeing an
>>> explosion of the "Reply to ARP requests" logical flows in the
>>> `lr_in_ip_input` table for the distributed router (ovn_cluster_router)
>>> configured with a gateway port (rtol-LS).
>>>
>>>               internet
>>>          ---------+-------->
>>>                   |
>>>                   |
>>> +-----------localnet-port------------+
>>> |                 LS                 |
>>> +--------------ltor-LS---------------+
>>>                   |
>>>                   |
>>> +--------------rtol-LS---------------+
>>> |         ovn_cluster_router         |
>>> |        (Distributed Router)        |
>>> +--rtos-ls0----rtos-ls1----rtos-ls2--+
>>>       |           |           |
>>>       |           |           |
>>>     +-+-+       +-+-+       +-+-+
>>>     |LS0|       |LS1|       |LS2|
>>>     +-+-+       +-+-+       +-+-+
>>>       |           |           |
>>>       p0          p1          p2
>>>      IA0         IA1         IA2
>>>      EA0         EA1         EA2
>>>    (Node0)     (Node1)     (Node2)
>>>
>>> In the topology above, each of the three logical switch ports has an
>>> internal address IAx and an external address EAx (dnat_and_snat IP).
>>> They are all bound to their respective nodes (Nodex). A packet from `p0`
>>> heading towards the internet will be SNAT'ed to EA0 on the local
>>> hypervisor and then sent out through the LS's localnet-port on that
>>> hypervisor. Basically, they are configured for distributed NATing.
>>>
>>> I am seeing interesting "Reply to ARP requests" flows for arp.tpa set to
>>> "EAx". Flows are like this:
>>>
>>> For EA0
>>> priority=90, match=(inport == "rtos-ls0" && arp.tpa == EA0 && arp.op == 1), action=(/* ARP reply */)
>>> priority=90, match=(inport == "rtos-ls1" && arp.tpa == EA0 && arp.op == 1), action=(/* ARP reply */)
>>> priority=90, match=(inport == "rtos-ls2" && arp.tpa == EA0 && arp.op == 1), action=(/* ARP reply */)
>>>
>>> For EA1
>>> priority=90, match=(inport == "rtos-ls0" && arp.tpa == EA1 && arp.op == 1), action=(/* ARP reply */)
>>> priority=90, match=(inport == "rtos-ls1" && arp.tpa == EA1 && arp.op == 1), action=(/* ARP reply */)
>>> priority=90, match=(inport == "rtos-ls2" && arp.tpa == EA1 && arp.op == 1), action=(/* ARP reply */)
>>>
>>> Similarly, for EA2.
>>>
>>> So, we have N * N "Reply to ARP requests" flows for N nodes, each with 1
>>> dnat_and_snat IP. This is causing scale issues.
>>>
>>> If you look at the flows for `EA0`, I am confused as to why they are
>>> needed:
>>>
>>> 1. When would one see an ARP request for EA0 from any of the
>>>    LS{0,1,2} logical switch ports?
>>> 2. If it is needed at all, can't we just remove the `inport` match
>>>    altogether, since the flow is configured for every logical router
>>>    port except the distributed gateway port rtol-LS? For this port, we
>>>    could add a higher priority rule with action set to `next`.
>>> 3. Say we don't need east-west NAT connectivity. Is there a way to
>>>    make these ARPs be learnt dynamically, like we are doing for the
>>>    join and external logical switches (the other thread [1])?
>>>
>>> Regards,
>>> ~Girish
>>>
>>> [1] https://mail.openvswitch.org/pipermail/ovs-discuss/2020-May/049994.html
>>
>> In general, these flows should be per router instead of per router port,
>> since the NAT addresses are not attached to any router port. For
>> distributed gateway ports, there will need to be per-port flows to match
>> is_chassis_resident(gateway-chassis). I think this can be handled by:
>>
>> - priority X + 20 flows for each distributed gateway port with
>>   is_chassis_resident(), reply ARP
>> - priority X + 10 flows for each distributed gateway port without
>>   is_chassis_resident(), drop
>> - priority X flows for each router (no need to match inport), reply ARP
>>
>> This way, there are N * (2D + 1) flows per router. N = number of NAT IPs,
>> D = number of distributed gateway ports. This would optimize the above
>> scenario where there is only 1 distributed gateway port but many regular
>> router ports. Thoughts?
>
> We went ahead and added support for this topology in the ovn-kubernetes
> project in this commit:
>
> https://github.com/ovn-org/ovn-kubernetes/commit/edb24e6a71142f2e835b67b29c11e1688c645683
>
> Han, I was curious to know whether the above fix is on your radar. Thanks.
>
> The number of OpenFlow flows in each of the hypervisors is insanely high
> and is consuming a lot of memory.
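
As a rough illustration of the three-tier priority scheme proposed in this thread, the sketch below builds the "Reply to ARP requests" flows for one router and checks that the total matches N * (2D + 1). It is not ovn-northd code. The names `LFlow` and `arp_reply_flows` are hypothetical, X = 90 is assumed from the priority shown in the quoted flows, the `cr-` chassis-resident port name is an assumed naming, and the action strings are placeholders.

```python
from typing import NamedTuple


class LFlow(NamedTuple):
    """Hypothetical stand-in for one logical flow row."""
    priority: int
    match: str
    action: str


def arp_reply_flows(nat_ips, gw_ports, base_priority=90):
    """Build the three-tier ARP flow set for a single router.

    gw_ports maps each distributed gateway port to the port name given
    to is_chassis_resident() (the naming used here is an assumption).
    """
    flows = []
    for ip in nat_ips:
        arp = f"arp.tpa == {ip} && arp.op == 1"
        for port, resident in gw_ports.items():
            on_port = f'inport == "{port}" && {arp}'
            # X + 20: reply only on the chassis where the gateway port resides.
            flows.append(LFlow(base_priority + 20,
                               f'{on_port} && is_chassis_resident("{resident}")',
                               "/* ARP reply */"))
            # X + 10: same gateway port seen on any other chassis, drop.
            flows.append(LFlow(base_priority + 10, on_port, "drop;"))
        # X: one flow per NAT IP for the whole router, no inport match.
        flows.append(LFlow(base_priority, arp, "/* ARP reply */"))
    return flows


flows = arp_reply_flows(["EA0", "EA1", "EA2"], {"rtol-LS": "cr-rtol-LS"})
for f in flows:
    print(f"priority={f.priority}, match=({f.match}), action=({f.action})")
```

With three NAT IPs and one distributed gateway port this yields 3 * (2 * 1 + 1) = 9 flows, and the count no longer depends on the number of regular router ports.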
[ovs-discuss] problems sending to OFPP_NORMAL
Does anyone know of a way to send dp_packets through the OFPP_NORMAL port? I do not have the corresponding struct xlate_ctx pointer because, to my knowledge, there is no way of storing it for later use, so I can't use the compose_output_action() function. I tried using ofproto_dpif_send_packet(); it doesn't work with OFPP_NORMAL. It does work with all other ports connected to the given switch, but I'm trying to avoid flooding the packet.

Luca
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
Re: [ovs-discuss] [OVN] logical flow explosion in lr_in_ip_input table for dnat_and_snat IPs
Hello Han,

On Wed, Jun 3, 2020 at 9:39 PM Han Zhou wrote:
> In general, these flows should be per router instead of per router port,
> since the NAT addresses are not attached to any router port. For
> distributed gateway ports, there will need to be per-port flows to match
> is_chassis_resident(gateway-chassis). I think this can be handled by:
>
> - priority X + 20 flows for each distributed gateway port with
>   is_chassis_resident(), reply ARP
> - priority X + 10 flows for each distributed gateway port without
>   is_chassis_resident(), drop
> - priority X flows for each router (no need to match inport), reply ARP
>
> This way, there are N * (2D + 1) flows per router. N = number of NAT IPs,
> D = number of distributed gateway ports. This would optimize the above
> scenario where there is only 1 distributed gateway port but many regular
> router ports. Thoughts?

We went ahead and added support for this topology in the ovn-kubernetes
project in this commit:

https://github.com/ovn-org/ovn-kubernetes/commit/edb24e6a71142f2e835b67b29c11e1688c645683

Han, I was curious to know whether the above fix is on your radar. Thanks.

The number of OpenFlow flows in each of the hypervisors is insanely high and
is consuming a lot of memory.

Regards,
~Girish

> Thanks,
> Han
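
To put rough numbers on the scale concern, the sketch below compares the per-port flow count reported in this thread (one flow per NAT IP per regular router port, N * N in the three-node example) with the proposed N * (2D + 1). The function names are hypothetical, and the figures count logical flows for this one ARP-reply stage only. They are not the OpenFlow flow counts actually installed on a hypervisor.

```python
def per_port_flows(nat_ips: int, regular_ports: int) -> int:
    """Per-port behaviour described in the thread: one flow per (NAT IP, port)."""
    return nat_ips * regular_ports


def per_router_flows(nat_ips: int, gw_ports: int) -> int:
    """Proposed scheme: N * (2D + 1) flows per router."""
    return nat_ips * (2 * gw_ports + 1)


# One dnat_and_snat IP and one regular router port per node,
# and a single distributed gateway port (D = 1).
for nodes in (3, 100, 1000):
    current = per_port_flows(nodes, nodes)
    proposed = per_router_flows(nodes, 1)
    print(f"nodes={nodes}: per-port={current}, per-router={proposed}")
```

The per-port count grows quadratically with the number of nodes, while the proposed count grows linearly when D stays fixed.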
[ovs-discuss] [OVN] running bfd on ecmp routes?
Hi All,

While looking into using ECMP routes for an OVN router, I noticed there is no support for BFD on these routes. Would it be possible to add this capability? I would like a next hop to be removed from the OpenFlow group if BFD detects that the next hop is down. My routes in this case would be on a GR, with an external N/S next hop, and would not cross a tunnel on egress.

Thanks,
Tim Rozet
Red Hat CTO Networking Team
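
The requested behaviour can be pictured with a small conceptual model: an ECMP route keeps a list of next hops, and only those whose BFD session is up remain eligible when a flow is hashed onto one of them. This is only a sketch of the idea and not how OVN or Open vSwitch implements groups. The function names, the example addresses, and the CRC32 hash are all assumptions made for illustration.

```python
import zlib


def live_next_hops(next_hops, bfd_up):
    """Keep only the next hops whose BFD session is reported up."""
    return [nh for nh in next_hops if bfd_up.get(nh, False)]


def pick_next_hop(flow_key, next_hops, bfd_up):
    """Hash a flow onto one live next hop; None when every next hop is down."""
    live = live_next_hops(next_hops, bfd_up)
    if not live:
        return None
    # Stable hash so the same flow keeps the same next hop while the
    # set of live next hops is unchanged.
    return live[zlib.crc32(flow_key.encode()) % len(live)]


# Hypothetical ECMP next hops (documentation address range) and BFD state.
next_hops = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
bfd_up = {"192.0.2.1": True, "192.0.2.2": False, "192.0.2.3": True}

print(live_next_hops(next_hops, bfd_up))
print(pick_next_hop("10.0.0.5->198.51.100.7", next_hops, bfd_up))
```

In this model the next hop whose BFD session is down is never selected, which mirrors removing its bucket from the group.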