On Thu, Oct 31, 2019 at 9:12 AM Dumitru Ceara <[email protected]> wrote:
>
> On Thu, Oct 31, 2019 at 2:05 AM Han Zhou <[email protected]> wrote:
> >
> > On Wed, Oct 23, 2019 at 12:11 AM Dumitru Ceara <[email protected]> wrote:
> > >
> > > ARP request and ND NS packets for router owned IPs were being
> > > flooded in the complete L2 domain (using the MC_FLOOD multicast group).
> > > However this creates a scaling issue in scenarios where aggregation
> > > logical switches are connected to more logical routers (~350). The
> > > logical pipelines of all routers would have to be executed before the
> > > packet is finally replied to by a single router, the owner of the IP
> > > address.
> > >
> > > This commit limits the broadcast domain by bypassing the L2 Lookup stage
> > > for ARP requests that will be replied by a single router. The packets
> > > are still flooded in the L2 domain but not on any of the other patch
> > > ports towards other routers connected to the switch. This restricted
> > > flooding is done by using a new multicast group (MC_ARP_ND_FLOOD).
> > >
> > > IPs that are owned by the routers and for which this fix applies are:
> > > - IP addresses configured on the router ports.
> > > - VIPs.
> > > - NAT IPs.
> > >
> > > Reported-at: https://bugzilla.redhat.com/1756945
> > > Reported-by: Anil Venkata <[email protected]>
> > > Signed-off-by: Dumitru Ceara <[email protected]>
> > >
> >
> > Thanks Dumitru for addressing the issue. I have only one concern, but I am
> > not sure if it would cause real issue. The concern is, this patch changes
> > the behavior that originally if there is any ARP request broadcasted by an
> > external routers, all OVN routers will learn the MAC-IP bindings from the
> > ARP request, but with this change only the one with the requested IP would
> > learn it. At the same time, what if an ARP response (or GARP) is coming?
> > Would it still trigger the same problem since it has to go through all
> > router pipelines?
>
> Hi Han,
>
> Indeed, without the patch all connected OVN routers would learn the
> MAC-IP binding from an ARP request. However, even with a reasonably
> sized topology we can easily end up going over the 4K resubmit limit
> for an ARP request broadcast packet because we run all the router
> pipelines. I think that's a bigger problem because there's no way to
> make sure that the ARP request reaches at least the router that owns
> the IP address so we might end up unable to reach part of the network.
> With the fix other routers will now have to resolve the host that they
> would've previously known from the ARP request but in the end we
> should have proper connectivity in the whole network.
>
> The patch changes the behavior only for ARP requests because replies
> are usually (not always) unicast. GARP requests from external hosts
> are not affected by the fix because we do a host match on the target
> IP and make sure it matches the IPs owned by OVN. In case of an
> external GARP the packet will get flooded to all router pipelines.
> While we might hit the 4K resubmit issue, the reasoning is that at
> least some routers get the packet and learn the mac binding.
>
> I'll be sending a v3 soon as I need to rebase and add autotests and
> also fix an issue I had for packets coming from VXLAN tunnels.
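
To make the behaviour discussed above concrete, here is a rough Python
sketch of the flooding decision. It is only an illustration, not the
ovn-northd code: the function name, the dict-based port model and the
sample addresses are hypothetical. It also assumes that the owning
router's patch port is addressed directly while MC_ARP_ND_FLOOD covers
the remaining non-router ports.

```python
# Illustrative model only; names and data layout are assumptions.
MC_FLOOD = "MC_FLOOD"                # existing group: every port in the L2 domain
MC_ARP_ND_FLOOD = "MC_ARP_ND_FLOOD"  # new group: excludes other router patch ports


def arp_request_outputs(target_ip, router_owned_ips):
    """Pick the outputs for a broadcast ARP request / ND NS.

    router_owned_ips maps a router patch port name to the set of IPs
    that router owns (port IPs, VIPs, NAT IPs).
    """
    for patch_port, owned in router_owned_ips.items():
        if target_ip in owned:
            # Only the owner runs its router pipeline; the rest of the
            # L2 domain still sees the request via the restricted group.
            return [patch_port, MC_ARP_ND_FLOOD]
    # Target not owned by OVN (e.g. a GARP from an external host):
    # behaviour is unchanged and all router pipelines are executed.
    return [MC_FLOOD]


routers = {
    "lrp-r1": {"10.0.0.1", "172.16.0.10"},  # port IP + NAT IP
    "lrp-r2": {"10.0.0.2"},
}
print(arp_request_outputs("172.16.0.10", routers))
print(arp_request_outputs("10.0.0.99", routers))
```

With ~350 routers on the switch, the first case runs one router
pipeline instead of all of them, which is what keeps the packet under
the resubmit limit.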

Here is v3, which should cover the newly added IPv6 NAT case and
addresses Numan's comments:
https://patchwork.ozlabs.org/patch/1187378/

Thanks,
Dumitru
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
