On Fri, Feb 7, 2025 at 1:09 PM Ilya Maximets <[email protected]> wrote:
>
> On 2/7/25 21:45, Ilya Maximets wrote:
> > On 2/7/25 15:01, Ilya Maximets wrote:
> >> On 2/7/25 05:31, Numan Siddique wrote:
> >>> On Thu, Feb 6, 2025 at 3:57 AM Ilya Maximets <[email protected]> wrote:
> >>>>
> >>>> This patch set introduces ability to directly connect switches,
> >>>> including transit switches, in order to achieve higher total port count
> >>>> and locality of changes in L2 topologies spread across multiple
> >>>> availability zones. And while tailored for this use case, the changes
> >>>> do not impose any limitations and should allow for all kinds of other
> >>>> different topologies.
> >>>>
> >>>> Amount of the logic code changes is relatively small, most of the diff
> >>>> are new tests for the introduced functionality.
> >>>>
> >>>>
> >>>> Version 2:
> >>>>
> >>>>   * Rebased on top of latest changes on the main branch.
> >>>>   * Improved validation of the peer in northd. [Mark]
> >>>>   * Added a test for a switch port with an address set. [Mark]
> >>>>
> >>>>
> >>>> Ilya Maximets (2):
> >>>>   northd: Add support for spine-leaf logical switch topology.
> >>>>   ic: Add support for spine-leaf topology for transit switches.
> >>>
> >>> Hi Ilya,
> >>>
> >>> Thanks for adding this feature. I applied both the patches to the
> >>> main. I had to do a minor rebase.
> >>
> >> Thanks, Numan and Mark!
> >>
> >>>
> >>> I see one small issue with the spine switch. Since all the spine
> >>> switch ports have unknown address,
> >>> the packet will be cloned to N - 1 logical switches if N switches connect to it.
> >>>
> >>> I think we should enable "fdb learn" in the spine switch with the below changes
> >>>
> >>> -----------------------------------------------
> >>> diff --git a/northd/northd.c b/northd/northd.c
> >>> index 880112c3b9..e77936fbe9 100644
> >>> --- a/northd/northd.c
> >>> +++ b/northd/northd.c
> >>> @@ -5682,6 +5682,7 @@ build_lswitch_learn_fdb_op(
> >>>      ovs_assert(op->nbsp);
> >>>
> >>>      if (!op->n_ps_addrs && op->has_unknown && (!strcmp(op->nbsp->type, "") ||
> >>> +        !strcmp(op->nbsp->type, "switch") ||
> >>>          (lsp_is_localnet(op->nbsp) && localnet_can_learn_mac(op->nbsp)))) {
> >>>          ds_clear(match);
> >>>          ds_clear(actions);
> >>> -----------------------------------------------
> >>>
> >>> What do you think? Any concerns or objections? If not, can you
> >>> submit a follow up patch to enable this?
> >>
> >> This seems reasonable. I missed the part that we enable FDB automatically
> >> for some ports with "unknown". Will look into that and submit a fix.
> >
> > So, I gave this some more thought and I'm not sure if we actually need
> > FDB for "switch" ports in a common case. In a non-IC setup OVN actually
> > knows the destination, if it's part of OVN network. The whole processing
> > of the spine switch will happen on the source node and only the egress
> > pipeline of the leaf switch will be executed on the destination node.
> > If we have actual VM ports with "unknown", then those will have the FDB
> > enabled, but we should not need to consult FDB for the spine switch itself
> > otherwise. Does that make sense?
>
> Thinking more about this again, there is still an issue if we have "unknown"
> ports in leaf switches. Since those switches are separate from the spine,
> we will broadcast in the spine and then every leaf that has "unknown" will also
> broadcast within itself. And so, "unknown" VM ports will receive traffic
> destined to ports on other leaf switches. So, yes, we still need FDB for
> this case.
>
> >
> > OTOH, what we may actually need is FDB learning for remote ports in IC
> > setup. In this case, OVN doesn't actually know the whole topology and
> > can't figure out to which of the remote ports the packet should go, so it
> > will broadcast. The problem here, however, is that FDB stages are in the
> > ingress pipeline, and they will not be executed on the remote node that
> > actually needs them. So, we may need to create FDB stages in the egress
> > pipeline for packets coming from remote ports. Seems like a generic
> > issue with "unknown" ports on a transit switch.
> >
> > Mark, Numan, Dumitru, what do you think?
>
> But this issue also still stands as we need a way to learn MAC addresses
> from packets arriving from a remote port, so we either need FDB in the egress
> pipeline of the transit switch for remote ports, or we need an external
> service like ovn-ic synchronizing FDB entries between AZs, which is not good.
>

Hi Ilya,

I agree with you that synchronizing FDB entries between AZs is a really
bad idea. FDB in the egress pipeline (and learning the MAC for the inport
of the spine switch) sounds reasonable.

Thanks,
Han

> >
> > Best regards, Ilya Maximets
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
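
For readers following the thread, the difference between flooding to every "unknown" port and forwarding after FDB learning can be illustrated with a small standalone sketch. This is not OVN code: OVN expresses this behaviour through logical flows built in northd, and every name below (struct fdb, fdb_learn, fdb_lookup, copies_sent, PORT_FLOOD, FDB_SIZE) is hypothetical and chosen only for illustration. It assumes a single switch whose ports all have "unknown" addresses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FDB_SIZE 64          /* Hypothetical table size for this sketch. */
#define PORT_FLOOD (-1)      /* Hypothetical marker: destination not learned. */

struct fdb_entry {
    uint8_t mac[6];
    int port;
    int used;
};

struct fdb {
    struct fdb_entry entries[FDB_SIZE];
};

void fdb_init(struct fdb *fdb)
{
    assert(fdb != NULL);
    memset(fdb, 0, sizeof *fdb);
}

/* Returns the entry for 'mac', or NULL if the MAC was never learned. */
struct fdb_entry *fdb_find(struct fdb *fdb, const uint8_t mac[6])
{
    for (size_t i = 0; i < FDB_SIZE; i++) {
        struct fdb_entry *e = &fdb->entries[i];
        if (e->used && !memcmp(e->mac, mac, 6)) {
            return e;
        }
    }
    return NULL;
}

/* Records that 'src' was seen on 'inport'.  An existing entry is updated
 * (the MAC moved).  Returns 0 on success, -1 if the table is full. */
int fdb_learn(struct fdb *fdb, const uint8_t src[6], int inport)
{
    struct fdb_entry *e = fdb_find(fdb, src);
    if (e) {
        e->port = inport;
        return 0;
    }
    for (size_t i = 0; i < FDB_SIZE; i++) {
        e = &fdb->entries[i];
        if (!e->used) {
            memcpy(e->mac, src, 6);
            e->port = inport;
            e->used = 1;
            return 0;
        }
    }
    return -1;
}

/* Returns the learned output port for 'dst', or PORT_FLOOD. */
int fdb_lookup(struct fdb *fdb, const uint8_t dst[6])
{
    struct fdb_entry *e = fdb_find(fdb, dst);
    return e ? e->port : PORT_FLOOD;
}

/* Number of packet copies sent on a switch with 'n_ports' "unknown" ports:
 * N - 1 when the destination is not learned, a single copy otherwise. */
int copies_sent(struct fdb *fdb, const uint8_t dst[6], int n_ports)
{
    return fdb_lookup(fdb, dst) == PORT_FLOOD ? n_ports - 1 : 1;
}
```

In this toy model, a spine with four connected leaf switches sends three copies of a packet until the destination MAC has been learned from traffic in the opposite direction, and one copy afterwards. Where that learning step has to run (ingress versus egress pipeline, local versus remote node) is exactly the open question in the thread, and the sketch does not model it.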
