On 5/20/26 6:53 PM, Rukomoinikova Aleksandra wrote:
> On 20.05.2026 19:34, Dumitru Ceara wrote:
>>
>>
>> On Wednesday, May 20, 2026, Rukomoinikova Aleksandra 
>> <[email protected]> wrote:
>>
>>     On 20.05.2026 17:20, Dumitru Ceara wrote:
>>     > On 5/20/26 3:47 PM, Rukomoinikova Aleksandra wrote:
>>     >> On 13.05.2026 12:51, Dumitru Ceara wrote:
>>     >>> Hi Alexandra,
>>     >>>
>>     >>> Thanks for the fix!
>>     >>>
>>     >>> On 5/8/26 10:00 PM, Alexandra Rukomoinikova via dev wrote:
>>     >>>> ARP requests sent through localnet port from physical network are
>>     >>>> processed only on chassis hosting the HA group, while
>>     requests for
>>     >>>> distributed NAT are also distributed across chassis.
>>     >>>>
>>     >>>> Previously, the case where a VIF port exists on the same
>>     logical switch
>>     >>>> connected to the physical network was not handled. Now, for
>>     each VIF port
>>     >>>> on such an external switch, ARP requests are processed on the
>>     chassis that
>>     >>>> hosts that port.
>>     >>>>
>>     >>>> Code simply iterates over all ports within the logical
>>     switch, which is
>>     >>>> considered acceptable since there are typically not a lot
>>     other non-VIF
>>     >>>> ports on an external switch.
>>     >>>>
>>     >>>> Fixed I-P: vif ports are processed incrementally, so if
>>     switch with connection
>>     >>>> to the physical network was created and then a vif port was
>>     added,
>>     >>>> ls_arp processing node would not work process this update.
>>     >>>>
>>     >>> Maybe "would not process this update" instead of "would not
>>     work..."?
>>     >>>> Fixes: 1b4058b ("northd: Process external arps on ha chassis.")
>>     >>> The Fixes format is not exactly correct, the correct way to
>>     format it is:
>>     >>>
>>     >>> git log -1 --pretty=format:"CC: %an <%ae>%nFixes: %h (\"%s\")"
>>     >>> --abbrev=12 COMMIT_REF
>>     >>>
>>     >>> As documented in the submitting-patches.rst file.
>>     >>>
>>     >>>> Reported-at: https://redhat.atlassian.net/browse/FDP-3767
>>     <https://redhat.atlassian.net/browse/FDP-3767>
>>     >>> Our internal automation is "special".  This should actually be:
>>     >>> https://redhat.atlassian.net/browse/FDP-3774
>>     <https://redhat.atlassian.net/browse/FDP-3774>
>>     >>>
>>     >>>> Signed-off-by: Alexandra Rukomoinikova <[email protected]>
>>     >>>> ---
>>     >>>>    northd/northd.c     | 25 ++++++++++++++++++
>>     >>>>    northd/northd.h     |  6 +++++
>>     >>>>    tests/ovn-northd.at <http://ovn-northd.at> | 17 +++++++-----
>>     >>>>    tests/system-ovn.at <http://system-ovn.at> | 64
>>     +++++++++++++++++++++++++++++++++++++++++++++
>>     >>>>    4 files changed, 106 insertions(+), 6 deletions(-)
>>     >>>>
>>     >>>> diff --git a/northd/northd.c b/northd/northd.c
>>     >>>> index 6ff505ca1..f55dbc1d5 100644
>>     >>>> --- a/northd/northd.c
>>     >>>> +++ b/northd/northd.c
>>     >>>> @@ -4812,6 +4812,10 @@ ls_handle_lsp_changes(struct
>>     ovsdb_idl_txn *ovnsb_idl_txn,
>>     >>>>                struct nbrec_logical_switch_port *new_nbsp =
>>     changed_ls->ports[j];
>>     >>>>                op = ovn_port_find_in_datapath(od, new_nbsp);
>>     >>>>
>>     >>>> +            if (!hmapx_is_empty(&od->phys_ports)) {
>>     >>>> +                goto fail;
>>     >>>> +            }
>>     >>>> +
>>     >>> This could go outside the loop, it doesn't depend on ports[j[.
>>     >>>
>>     >>> However, this is a bit worrying, it will fail incremental
>>     processing for
>>     >>> any LSP addition if it's part of a localnet logical switch
>>     which will
>>     >>> have performance implications at scale I guess.
>>     >>>
>>     >>> Can't we handle the case of "changed lsp" (or new/deleted lsp) in
>>     >>> ls_arp_northd_handler() too?  And then incrementally in
>>     >>> lflow_handle_ls_arp_changes().
>>     >>>
>>     >>>>                if (!op) {
>>     >>>>                    if (!lsp_can_be_inc_processed(new_nbsp)) {
>>     >>>>                        goto fail;
>>     >>>> @@ -10066,6 +10070,8 @@
>>     build_drop_arp_nd_flows_for_unbound_router_ports(struct ovn_port *op,
>>     >>>>     * router connection ports that requires chassis residence.
>>     >>>>     * ARP requests coming from localnet/l2gateway ports
>>     >>>>     * allowed for processing on resident chassis only.
>>     >>>> + * If logical switch has VIF port, ARP requests are allowed
>>     >>>> + * to be processed on the chassis hosting this VIF port.
>>     >>>>     */
>>     >>>>    static void
>>     >>>>    build_lswitch_arp_chassis_resident(const struct
>>     ovn_datapath *od,
>>     >>>> @@ -10133,6 +10139,25 @@
>>     build_lswitch_arp_chassis_resident(const struct ovn_datapath *od,
>>     >>>>                }
>>     >>>>            }
>>     >>>>
>>     >>>> +        HMAP_FOR_EACH (op, dp_node, &od->ports) {
>>     >>>> +            if (!port_is_vif(op)) {
>>     >>>> +                continue;
>>     >>>> +            }
>>     >>>> +
>>     >>>> +            for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>>     >>>> +                for (size_t j = 0; j <
>>     op->lsp_addrs[i].n_ipv4_addrs; j++) {
>>     >>>> +                    ds_clear(&match);
>>     >>>> +                    ds_put_format(&match,
>>     >>>> + REGBIT_EXT_ARP " == 1 && arp.tpa == %s "
>>     >>>> +                                  "&& is_chassis_resident(%s)",
>>     >>>> + op->lsp_addrs[i].ipv4_addrs[j].addr_s,
>>     >>>> + op->json_key);
>>     >>>> +                    ovn_lflow_add(lflows, od,
>>     S_SWITCH_IN_APPLY_PORT_SEC, 85,
>>     >>>> + ds_cstr(&match), "next;", ar->lflow_ref);
>>     >>>> +                }
>>     >>>> +            }
>>     >>> I'm a bit worried about these new flows.  Also about the flows
>>     we added
>>     >>> for NATs in 16b79a66d2c3 ("northd: Restrict external ARP
>>     request to
>>     >>> logical_ip for dnat_and_snat.") (CC Ales).
>>     >>>
>>     >>> We already add ARP responder flows for VIFs in table
>>     S_SWITCH_IN_ARP_ND_RSP.
>>     >>>
>>     >>> Shouldn't we rework the original 1b4058b ("northd: Process
>>     external arps
>>     >>> on ha chassis.") and do the drop for ARPs with REGBIT_EXT_ARP
>>     == 1 in
>>     >>> the switch arp responder stage?
>>     >>>
>>     >>> Then we wouldn't need to add another set of logical flows for
>>     each LSP
>>     >>> IP.  There are a few cases we need to carefully account for
>>     then but it
>>     >>> would be more scalable in my opinion.
>>     >>>
>>     >>> What do you think?
>>     >>>
>>     >>>> +        }
>>     >>>> +
>>     >>>>            ovn_lflow_add(lflows, od,
>>     S_SWITCH_IN_APPLY_PORT_SEC, 70,
>>     >>>>                          REGBIT_EXT_ARP" == 1", "drop;",
>>     ar->lflow_ref);
>>     >>>>        }
>>     >>>> diff --git a/northd/northd.h b/northd/northd.h
>>     >>>> index 81ab07600..7092f6001 100644
>>     >>>> --- a/northd/northd.h
>>     >>>> +++ b/northd/northd.h
>>     >>>> @@ -1178,6 +1178,12 @@ od_is_centralized(const struct
>>     ovn_datapath *od)
>>     >>>>        return !od->is_distributed;
>>     >>>>    }
>>     >>>>
>>     >>>> +static inline bool
>>     >>>> +port_is_vif(const struct ovn_port *op)
>>     >>>> +{
>>     >>>> +    return op->sb ? !strcmp(op->sb->type, "") : 0;
>>     >>> Should be ": false" instead of ": 0".
>>     >>>
>>     >>> Also, I'd rely on the NB type instead of the port_binding type
>>     here.
>>     >>>
>>     >>> We could use this helper in a bunch of other places in
>>     northd.c where we
>>     >>> check LSP type.
>>     >>>
>>     >>>> +}
>>     >>>> +
>>     >>>>    struct ovn_port *ovn_port_find(const struct hmap *ports,
>>     const char *name);
>>     >>>>
>>     >>>>    void build_igmp_lflows(struct hmap *igmp_groups,
>>     >>>> diff --git a/tests/ovn-northd.at <http://ovn-northd.at>
>>     b/tests/ovn-northd.at <http://ovn-northd.at>
>>     >>>> index 3f237b076..9d1d00380 100644
>>     >>>> --- a/tests/ovn-northd.at <http://ovn-northd.at>
>>     >>>> +++ b/tests/ovn-northd.at <http://ovn-northd.at>
>>     >>>> @@ -19702,15 +19702,10 @@ check ovn-nbctl lr-add lr1
>>     >>>>    check ovn-nbctl lrp-add lr1 down_link f0:00:00:00:00:f1
>>     192.168.1.1/24 <http://192.168.1.1/24>
>>     >>>>
>>     >>>>    check ovn-nbctl ls-add ls1
>>     >>>> -check ovn-nbctl lsp-add ls1 up_link
>>     >>>>    check ovn-nbctl lsp-add ls1 down_vif1
>>     >>>>    check ovn-nbctl lsp-add ls1 down_vif2
>>     >>>>    check ovn-nbctl lsp-add ls1 down_ext
>>     >>>> -
>>     >>>> -check ovn-nbctl set Logical_Switch_Port up_link \
>>     >>>> -    type=router \
>>     >>>> -    options:router-port=down_link \
>>     >>>> -    addresses=router
>>     >>>> +check ovn-nbctl lsp-add-router-port ls1 up_link down_link
>>     >>>>
>>     >>>>    check ovn-nbctl lsp-set-addresses down_vif1
>>     'f0:00:00:00:00:01 192.168.1.101'
>>     >>>>    check ovn-nbctl lsp-set-addresses down_vif2
>>     'f0:00:00:00:00:02 192.168.1.102'
>>     >>>> @@ -19728,6 +19723,8 @@ AT_CHECK([ovn-sbctl lflow-list ls1 |
>>     grep ls_in_apply_port_sec | ovn_strip_lflow
>>     >>>>      table=??(ls_in_apply_port_sec), priority=50   ,
>>     match=(reg0[[15]] == 1), action=(drop;)
>>     >>>>      table=??(ls_in_apply_port_sec), priority=70   ,
>>     match=(reg0[[22]] == 1), action=(drop;)
>>     >>>>      table=??(ls_in_apply_port_sec), priority=75   ,
>>     match=(reg0[[22]] == 1 && is_chassis_resident("cr-down_link")),
>>     action=(next;)
>>     >>>> +  table=??(ls_in_apply_port_sec), priority=85   ,
>>     match=(reg0[[22]] == 1 && arp.tpa == 192.168.1.101 &&
>>     is_chassis_resident("down_vif1")), action=(next;)
>>     >>>> +  table=??(ls_in_apply_port_sec), priority=85   ,
>>     match=(reg0[[22]] == 1 && arp.tpa == 192.168.1.102 &&
>>     is_chassis_resident("down_vif2")), action=(next;)
>>     >>>>    ])
>>     >>>>
>>     >>>>    # Check nat adding to dgr attached to logical switch
>>     trigger ls-arp flow.
>>     >>>> @@ -19740,6 +19737,8 @@ AT_CHECK([ovn-sbctl lflow-list ls1 |
>>     grep ls_in_apply_port_sec | ovn_strip_lflow
>>     >>>>      table=??(ls_in_apply_port_sec), priority=70   ,
>>     match=(reg0[[22]] == 1), action=(drop;)
>>     >>>>      table=??(ls_in_apply_port_sec), priority=75   ,
>>     match=(reg0[[22]] == 1 && is_chassis_resident("cr-down_link")),
>>     action=(next;)
>>     >>>>      table=??(ls_in_apply_port_sec), priority=85   ,
>>     match=(reg0[[22]] == 1 && arp.tpa == 192.168.0.3 &&
>>     is_chassis_resident("down_vif1")), action=(next;)
>>     >>>> +  table=??(ls_in_apply_port_sec), priority=85   ,
>>     match=(reg0[[22]] == 1 && arp.tpa == 192.168.1.101 &&
>>     is_chassis_resident("down_vif1")), action=(next;)
>>     >>>> +  table=??(ls_in_apply_port_sec), priority=85   ,
>>     match=(reg0[[22]] == 1 && arp.tpa == 192.168.1.102 &&
>>     is_chassis_resident("down_vif2")), action=(next;)
>>     >>>>    ])
>>     >>>>
>>     >>>>    check ovn-nbctl --wait=sb lr-nat-del lr1 dnat_and_snat
>>     192.168.0.3
>>     >>>> @@ -19748,6 +19747,8 @@ AT_CHECK([ovn-sbctl lflow-list ls1 |
>>     grep ls_in_apply_port_sec | ovn_strip_lflow
>>     >>>>      table=??(ls_in_apply_port_sec), priority=50   ,
>>     match=(reg0[[15]] == 1), action=(drop;)
>>     >>>>      table=??(ls_in_apply_port_sec), priority=70   ,
>>     match=(reg0[[22]] == 1), action=(drop;)
>>     >>>>      table=??(ls_in_apply_port_sec), priority=75   ,
>>     match=(reg0[[22]] == 1 && is_chassis_resident("cr-down_link")),
>>     action=(next;)
>>     >>>> +  table=??(ls_in_apply_port_sec), priority=85   ,
>>     match=(reg0[[22]] == 1 && arp.tpa == 192.168.1.101 &&
>>     is_chassis_resident("down_vif1")), action=(next;)
>>     >>>> +  table=??(ls_in_apply_port_sec), priority=85   ,
>>     match=(reg0[[22]] == 1 && arp.tpa == 192.168.1.102 &&
>>     is_chassis_resident("down_vif2")), action=(next;)
>>     >>>>    ])
>>     >>>>
>>     >>>>    # Check changing logical port type to l2gateway.
>>     >>>> @@ -19757,6 +19758,8 @@ AT_CHECK([ovn-sbctl lflow-list ls1 |
>>     grep ls_in_apply_port_sec | ovn_strip_lflow
>>     >>>>      table=??(ls_in_apply_port_sec), priority=50   ,
>>     match=(reg0[[15]] == 1), action=(drop;)
>>     >>>>      table=??(ls_in_apply_port_sec), priority=70   ,
>>     match=(reg0[[22]] == 1), action=(drop;)
>>     >>>>      table=??(ls_in_apply_port_sec), priority=75   ,
>>     match=(reg0[[22]] == 1 && is_chassis_resident("cr-down_link")),
>>     action=(next;)
>>     >>>> +  table=??(ls_in_apply_port_sec), priority=85   ,
>>     match=(reg0[[22]] == 1 && arp.tpa == 192.168.1.101 &&
>>     is_chassis_resident("down_vif1")), action=(next;)
>>     >>>> +  table=??(ls_in_apply_port_sec), priority=85   ,
>>     match=(reg0[[22]] == 1 && arp.tpa == 192.168.1.102 &&
>>     is_chassis_resident("down_vif2")), action=(next;)
>>     >>>>    ])
>>     >>>>
>>     >>>>    # Check changing logical port type to vif.
>>     >>>> @@ -19773,6 +19776,8 @@ AT_CHECK([ovn-sbctl lflow-list ls1 |
>>     grep ls_in_apply_port_sec | ovn_strip_lflow
>>     >>>>      table=??(ls_in_apply_port_sec), priority=50   ,
>>     match=(reg0[[15]] == 1), action=(drop;)
>>     >>>>      table=??(ls_in_apply_port_sec), priority=70   ,
>>     match=(reg0[[22]] == 1), action=(drop;)
>>     >>>>      table=??(ls_in_apply_port_sec), priority=75   ,
>>     match=(reg0[[22]] == 1 && is_chassis_resident("cr-down_link")),
>>     action=(next;)
>>     >>>> +  table=??(ls_in_apply_port_sec), priority=85   ,
>>     match=(reg0[[22]] == 1 && arp.tpa == 192.168.1.101 &&
>>     is_chassis_resident("down_vif1")), action=(next;)
>>     >>>> +  table=??(ls_in_apply_port_sec), priority=85   ,
>>     match=(reg0[[22]] == 1 && arp.tpa == 192.168.1.102 &&
>>     is_chassis_resident("down_vif2")), action=(next;)
>>     >>>>    ])
>>     >>>>
>>     >>>>    # Check changing removing logical port.
>>     >>>> diff --git a/tests/system-ovn.at <http://system-ovn.at>
>>     b/tests/system-ovn.at <http://system-ovn.at>
>>     >>>> index 582ed194b..0648c44b6 100644
>>     >>>> --- a/tests/system-ovn.at <http://system-ovn.at>
>>     >>>> +++ b/tests/system-ovn.at <http://system-ovn.at>
>>     >>>> @@ -21814,3 +21814,67 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/failed
>>     to query port patch-.*/d
>>     >>>>
>>     >>>>    AT_CLEANUP
>>     >>>>    ])
>>     >>>> +
>>     >>>> +OVN_FOR_EACH_NORTHD([
>>     >>>> +AT_SETUP([VIF port connected to localnet network])
>>     >>>> +#
>>     >>>> +# Topology:
>>     >>>> +#    (fabric) -- localnet-port -- LS --- DGP(chassis2) -- LR
>>     >>>> +#                                 |
>>     >>>> +#                                 |
>>     >>>> +#                               VM (chassis1)
>>     >>>> +#
>>     >>>> +It is expected that ARP requests to this port are allowed on
>>     the chesis that hosts this port.
>>     >>> Missing "#".
>>     >>>
>>     >>> Typo: chesis
>>     >>>
>>     >>>> +
>>     >>>> +ovn_start
>>     >>>> +OVS_TRAFFIC_VSWITCHD_START()
>>     >>>> +ADD_BR([br-int])
>>     >>>> +ADD_BR([br-ext])
>>     >>>> +
>>     >>>> +ovs-ofctl add-flow br-ext action=normal
>>     >>>> +# Set external-ids in br-int needed for ovn-controller
>>     >>>> +ovs-vsctl \
>>     >>> Missing "check".
>>     >>>
>>     >>>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
>>     >>>> +        -- set Open_vSwitch .
>>     external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>>     >>>> +        -- set Open_vSwitch .
>>     external-ids:ovn-encap-type=geneve \
>>     >>>> +        -- set Open_vSwitch .
>>     external-ids:ovn-encap-ip=169.0.0.1 \
>>     >>>> +        -- set Open_vSwitch .
>>     external-ids:ovn-bridge-mappings=phynet:br-ext
>>     >>> Missing "\" at end of line which means we essentially won't
>>     configure
>>     >>> the line below.
>>     >>>
>>     >>>> +        -- set bridge br-int fail-mode=secure
>>     other-config:disable-in-band=true
>>     >>>> +
>>     >>>> +# Start ovn-controller
>>     >>>> +start_daemon ovn-controller
>>     >>>> +
>>     >>>> +check ovn-nbctl lr-add lr1
>>     >>>> +check ovn-nbctl ls-add public
>>     >>>> +
>>     >>>> +check ovn-nbctl lrp-add lr1 rp-public 00:00:02:01:02:03
>>     172.31.1.1/24 <http://172.31.1.1/24>
>>     >>>> +check ovn-nbctl lsp-add-router-port public public-rp rp-public
>>     >>>> +check ovn-nbctl lsp-add-localnet-port public localnet phynet
>>     >>>> +check ovn-nbctl lrp-set-gateway-chassis rp-public hv2
>>     >>>> +check ovn-nbctl --wait=hv sync
>>     >>>> +
>>     >>>> +ADD_NAMESPACES(ext)
>>     >>>> +ADD_VETH(ext, ext, br-ext, "172.31.1.3/24
>>     <http://172.31.1.3/24>", "f0:00:00:01:02:04", \
>>     >>>> +         "172.31.1.1")
>>     >>>> +
>>     >>>> +ADD_NAMESPACES(lsp)
>>     >>>> +ADD_VETH(lsp, lsp, br-int, "172.31.1.2/24
>>     <http://172.31.1.2/24>", "f0:00:00:01:02:03", \
>>     >>>> +         "172.31.1.1")
>>     >>>> +
>>     >>>> +check ovn-nbctl lsp-add public lsp
>>     >>>> +check ovn-nbctl lsp-set-addresses lsp "f0:00:00:01:02:03
>>     172.31.1.2"
>>     >>>> +check ovn-nbctl --wait=hv sync
>>     >>>> +
>>     >>>> +NS_CHECK_EXEC([ext], [ping -q -c 3 -i 0.3 -w 2 172.31.1.2 |
>>     FORMAT_PING], \
>>     >>>> +[0], [dnl
>>     >>>> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
>>     >>>> +])
>>     >>>> +
>>     >>>> +OVN_CLEANUP_CONTROLLER([hv1])
>>     >>>> +OVN_CLEANUP_NORTHD
>>     >>>> +as
>>     >>>> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>>     >>>> +/connection dropped.*/d"])
>>     >>>> +
>>     >>>> +AT_CLEANUP
>>     >>>> +])
>>     >>> Regards,
>>     >>> Dumitru
>>     >>>
>>     >> Hi Dumitru and Ales! I've been working on the implementation
>>     and wanted
>>     >> to discuss the details before moving forward — I want to make
>>     sure I'm
>>     >> not missing anything and that we're on the same page. I've looked
>>     >> through the existing patches, structured all the ARP request
>>     cases, and
>>     >> drafted an implementation plan. Before going further, I wanted to
>>     >> clarify a couple of things. I'm afraid that I got confused and
>>     might be
>>     >> doing something wrong.
>>     >>
>>     > Hi Alexandra,
>>     >
>>     > Thanks again for working on this!
>>     >
>>     >> I was thinking about it like this — the types of ARP requests
>>     we have:
>>     >> Three categories: to router ports, NAT (dnat_and_snat — both
>>     distributed
>>     >> and regular), and proxy ARP — the last one we leave alone, it's
>>     only
>>     >> relevant for internal requests, i think. I also did not single
>>     out the
>>     >> processing of GARPs in separate cases.
>>     >> Where they need to be reachable from: from the external network —
>>     >> handled only on the HA chassis (for dnat_and_snat — on the
>>     chassis where
>>     >> the port is hosted); if from inside — then on all chassis. This
>>     second
>>     >> case is the one I'm worried about losing.
>>     >>
>>     >>
>>     >> Here's how I see the solution:
>>     >> 1. Packet marking. In
>>     >>
>>     
>> https://github.com/ovn-org/ovn/commit/e6ffc4919c388534bf10067ad563d71f8fbc2fa3#diff-ad86f2621d498f3f310f111c1a92fe47589f0c2d51529fa9ba98371b32c00a9e
>>     
>> <https://github.com/ovn-org/ovn/commit/e6ffc4919c388534bf10067ad563d71f8fbc2fa3#diff-ad86f2621d498f3f310f111c1a92fe47589f0c2d51529fa9ba98371b32c00a9e>
>>     >> a flags.localnet flag was added — it is set for all packets
>>     coming from
>>     >> a localnet port, but in practice is only used for arp.op == 1.
>>     So I can
>>     >> drop my own logic with the REGBIT_EXT_ARP register and reuse
>>     your flag,
>>     >> adding a match on arp.op == 1. My original patch also covered l2gw
>>     >> ports, not just localnet. Your patch adds flags.localnet but
>>     only covers
>>     >> localnet ports — should we extend it so that the same marking
>>     is applied
>>     >> to packets coming through l2gw ports as well? and in general
>>     expand your
>>     >> patch to l2gw ports too?
>>     > One thing that might need care is the fact that we need to
>>     backport this
>>     > to 26.03 but that should be doable.
>>     >
>>     >> 2. Remove all flows from ls_in_check_port_sec and
>>     ls_in_apply_port_sec.
>>     > +1, in hindsight, they don' tbelong there.
>>     >
>>     >> 3. ls_in_arp_rsp table: Already works with the existing patch —
>>     match
>>     >> (flags.localnet == 1 && is_chassis_resident("vm-port")) ||
>>     >> flags.localnet == 0) - so VIF ports are covered. This table
>>     also handles
>>     >> Virtual IP, proxy ARP and healthcheck flows — They are not
>>     interesting
>>     >> in my patch.
>>     > ack
>>     >
>>     >> 4. The trickiest part is ls_in_l2_lkup.
It's worth separating
>>     NAT and
>>     >> ARP to router ports here. The patch
>>     >>
>>     
>> https://github.com/ovn-org/ovn/commit/8d13579bf5b390c1dcf1e737f918e05407f8692c
>>     
>> <https://github.com/ovn-org/ovn/commit/8d13579bf5b390c1dcf1e737f918e05407f8692c>
>>     >> already implements the correct handling of internal and external
>>     >> requests in this table.

>>     >> For example, in case of 169.254.48.2 - dgp port:
>>     >> table=32(ls_in_l2_lkup ), priority=80 , match=(flags[1] == 0 &&
>>     arp.op
>>     >> == 1 && arp.tpa == 169.254.48.2 &&
>>     >> !is_chassis_resident("cr-vpc-B5481A26-outside")), action=(clone
>>     {outport
>>     >> = "cr-vpc-B5481A26-outside"; output; }; outport =
>>     "_MC_flood_l2"; output;)

>>     >>
>>     >> table=32(ls_in_l2_lkup ), priority=80 , match=(flags[1] == 0 &&
>>     arp.op
>>     >> == 1 && arp.tpa == 169.254.48.2 &&
>>     >> is_chassis_resident("cr-vpc-B5481A26-outside")), action=(clone
>>     {outport
>>     >> = "vpc-B5481A26-outside"; output; }; outport = "_MC_flood_l2";
>>     output;)
>>     >>
>>     >>
>>     >> But this logic with this patch only works in the case when
>>     there is no
>>     >> localnet port, I believe that this condition can be completely
>>     removed.
>>     >> I want to do the following, even if this is localnet port in
>>     switch:
>>     >> Router port:
>>     >> external request:
>>     >> table=32(ls_in_l2_lkup ), priority=80 , match=(flags.localnet
>>     == 1 &&
>>     >> flags[1] == 0 && arp.op == 1 && arp.tpa == 169.254.48.2 &&
>>     >> is_chassis_resident("cr-vpc-B5481A26-outside")), action=(clone
>>     {outport
>>     >> = "vpc-B5481A26-outside"; output; }; outport = "_MC_flood_l2";
>>     output;)
>>     >> - so i want to add flags.localnet to this match
>>     >> internal request:
>>     >> table=32(ls_in_l2_lkup ), priority=80 , match=(flags.localnet
>>     == 0 &&
>>     >> flags[1] == 0 && arp.op == 1 && arp.tpa == 169.254.48.2 &&
>>     >> !is_chassis_resident("cr-vpc-B5481A26-outside")), action=(clone
>>     {outport
>>     >> = "cr-vpc-B5481A26-outside"; output; }; outport = "_MC_flood_l2";
>>     >> output;) - Here it seems to me that the is_chassis_resident
>>     check needs
>>     >> to be removed - it is meaningless and does not take into
>>     account the
>>     >> case when the virtual machine is on the same node where the
>>     cr-port is?
>>     >> Although while I was writing this, I realized that if the
>>     packet is on
>>     >> the host where the CR port is located, but it is not coming from a
>>     >> localnet port, then this also needs to be taken into account.
>>     > Right, so we'd need a third flow (or a more complex match with
>>     "OR")?
>>     >
>>     >>
>>     >> If it's a distributed NAT, then the first flow will be
>>     >> is_chassis_resident("nat-port")
>>     >> All other packets with the flags.localnet == 1 flag set must be
>>     dropped.
>>     >> Therefore, in ls_in_l2_lkup, a rule should be added:
>>     flags.localnet == 1
>>     >> && !is_chassis_resident("cr-port") → drop. All remaining
>>     packets on the
>>     >> chassis where the cr-port resides will be forwarded to unknown
>>     flooding.
>>     >>
>>     > I didn't dig too much here but on a quick read that seems ok.
>>     >
>>     >> I’d be happy to hear any thoughts if something already seems
>>     like a bad
>>     >> or strange idea. I’ll continue working on the implementation. Also,
>>     >> sorry that fixing this issue has taken longer than expected — I
>>     hope
>>     >> this is not critical for you.
>>     > Actually, this is a critical issue for us because it breaks some
>>     of our
>>     > product use cases.
>>     >
>>     > An alternative would be to revert the original patch that
>>     introduced the
>>     > bug (adding the test that you including in this patch) until you
>>     finish
>>     > your new implementation.
>>     >
>>     > What do you think?
>>     >
>>     >> Thanks!
>>     >>
>>     Hi again, i reverted patch and all fixes, so i will wait until all
>>     tests
>>     pass and I'll send the patch.
>>
>>     https://github.com/Sashhkaa/ovn/actions/runs/26175523112
>>     <https://github.com/Sashhkaa/ovn/actions/runs/26175523112>
>>
>>
>> Hi Alexandra,
>>
>> Would it be possible to also include the system test you had in this 
>> patch? Like that we’ll makw sure we don’t break this by accident again 
>> in the future.
>>
>> Thanks,
>> Dumitru
>>
>>
> Do you mean the system test that is in this patch with the fix? i will 
> add it
> 

Yes, that is what I meant, thank you!

>>
>>     > Thanks,
>>     > Dumitru
>>     >
>>
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to