On Thu, Oct 31, 2019 at 9:12 AM Dumitru Ceara <[email protected]> wrote:
>
> On Thu, Oct 31, 2019 at 2:05 AM Han Zhou <[email protected]> wrote:
> >
> > On Wed, Oct 23, 2019 at 12:11 AM Dumitru Ceara <[email protected]> wrote:
> > >
> > > ARP request and ND NS packets for router owned IPs were being
> > > flooded in the complete L2 domain (using the MC_FLOOD multicast group).
> > > However, this creates a scaling issue in scenarios where aggregation
> > > logical switches are connected to many logical routers (~350). The
> > > logical pipelines of all routers would have to be executed before the
> > > packet is finally replied to by a single router, the owner of the IP
> > > address.
> > >
> > > This commit limits the broadcast domain by bypassing the L2 Lookup stage
> > > for ARP requests that will be replied to by a single router. The packets
> > > are still flooded in the L2 domain but not on any of the other patch
> > > ports towards other routers connected to the switch. This restricted
> > > flooding is done by using a new multicast group (MC_ARP_ND_FLOOD).
> > >
> > > IPs that are owned by the routers and for which this fix applies are:
> > > - IP addresses configured on the router ports.
> > > - VIPs.
> > > - NAT IPs.
> > >
> > > Reported-at: https://bugzilla.redhat.com/1756945
> > > Reported-by: Anil Venkata <[email protected]>
> > > Signed-off-by: Dumitru Ceara <[email protected]>
> > >
> >
> > Thanks Dumitru for addressing the issue. I have only one concern, but I am
> > not sure if it would cause a real issue. The concern is that this patch
> > changes the behavior: originally, if an ARP request was broadcast by an
> > external router, all OVN routers would learn the MAC-IP binding from the
> > ARP request, but with this change only the one with the requested IP would
> > learn it. At the same time, what if an ARP response (or GARP) is coming?
> > Would it still trigger the same problem since it has to go through all
> > router pipelines?
>
> Hi Han,
>
> Indeed, without the patch all connected OVN routers would learn the
> MAC-IP binding from an ARP request. However, even with a reasonably
> sized topology we can easily end up going over the 4K resubmit limit
> for an ARP request broadcast packet because we run all the router
> pipelines. I think that's a bigger problem because there's no way to
> make sure that the ARP request reaches at least the router that owns
> the IP address, so we might end up unable to reach part of the network.
> With the fix, other routers will now have to resolve the host that they
> would previously have learned from the ARP request, but in the end we
> should have proper connectivity in the whole network.
>
> The patch changes the behavior only for ARP requests because replies
> are usually (not always) unicast. GARP requests from external hosts
> are not affected by the fix because we do a host match on the target
> IP and make sure it matches the IPs owned by OVN. In case of an
> external GARP the packet will get flooded to all router pipelines.
> While we might still hit the 4K resubmit issue, the reasoning is that at
> least some routers get the packet and learn the MAC binding.
>
> I'll be sending a v3 soon as I need to rebase and add autotests and
> also fix an issue I had for packets coming from VXLAN tunnels.

Here is v3, which should cover the newly added IPv6 NAT case and address
Numan's comments:
https://patchwork.ozlabs.org/patch/1187378/
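
For illustration only, here is a minimal Python sketch of the restricted
flooding idea discussed above. It is not the ovn-northd implementation: the
names Port and arp_request_outputs and the example addresses are hypothetical,
and it ignores details such as ingress-port exclusion and ND NS handling. It
only shows how many router patch ports (and therefore router pipelines) an ARP
request reaches in each case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical model of a logical switch port."""
    name: str
    is_router_patch: bool = False
    owned_ips: frozenset = frozenset()  # IPs owned by the attached router


def arp_request_outputs(ports, target_ip):
    """Return the ports an ARP request for target_ip is delivered to."""
    owners = [p for p in ports
              if p.is_router_patch and target_ip in p.owned_ips]
    if owners:
        # Target IP is owned by a router: deliver to the owning router port
        # and flood only on non-router ports (the MC_ARP_ND_FLOOD idea).
        return owners + [p for p in ports if not p.is_router_patch]
    # Otherwise keep the old behavior: flood on every port (MC_FLOOD).
    return list(ports)


def router_ports_reached(ports, target_ip):
    """Count the router patch ports that receive the request."""
    return sum(p.is_router_patch
               for p in arp_request_outputs(ports, target_ip))


def build_example_switch(n_routers=350):
    """Aggregation switch with n_routers router ports and two VIF ports."""
    routers = [
        Port(f"rp-{i}", True, frozenset({f"10.0.{i // 256}.{i % 256}"}))
        for i in range(n_routers)
    ]
    return routers + [Port("vif-1"), Port("vif-2")]
```

With this model, a request for an address owned by one of the routers reaches
a single router port, while a request for an address that no router owns (for
example an external GARP) still reaches all of them.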

Thanks,
Dumitru
