On 6/18/26 5:17 PM, Dumitru Ceara wrote:
> On 6/16/26 11:54 AM, Ales Musil via dev wrote:
>> Add a new pipeline stage ls_in_arp_nd_pre_lookup (table 26)
>> and EVPN suppression response flows in ls_in_arp_rsp for
>> logical switches with EVPN enabled (dynamic-routing-vni set).
>>
>> The pre-lookup stage calls chk_evpn_arp(arp.tpa) for broadcast
>> ARP requests and chk_evpn_arp(nd.target) for multicast ND
>> solicitations. When the EVPN ARP side table has a matching
>> entry, the resolved MAC is stored in eth.dst and reg9[5] is
>> set.
>>
>> The response flows in ls_in_arp_rsp at priority 40 match when
>> reg9[5] == 1 and generate proxy ARP replies or ND NA replies
>> using the MAC from eth.dst. This prevents unnecessary
>> flooding of ARP/ND requests to remote VTEPs for EVPN-learned
>> addresses.
>>
>> Key design points:
>> - MAC stored in eth.dst (loaded by the EVPN ARP side table).
>> On a miss, eth.dst is left unchanged.
>> - ARP response uses eth.dst <-> eth.src swap to put the
>> resolved MAC into eth.src and the original sender's MAC
>> into eth.dst for the reply.
>> - ND response relies on nd_na{} reading eth.dst for the
>> NA source MAC (nd.tll).
>> - ARP match: arp.op == 1 && reg9[5] == 1.
>> - ND match: nd_ns && reg9[5] == 1.
>> - COPP_ND_NA meter on the ND NA response flow.
>>
>> Reported-at: https://redhat.atlassian.net/browse/FDP-3429
>> Assisted-by: Claude Opus 4.6, Claude Code
>> Signed-off-by: Ales Musil <[email protected]>
>> ---
>
> Hi Ales,
>
> Thanks for the patch!
>
>> Documentation/ref/ovn-logical-flows.7.rst | 72 ++++++++++++++------
>> NEWS | 6 ++
>> lib/ovn-util.c | 4 +-
>> lib/ovn-util.h | 2 +-
>> northd/northd.c | 83 +++++++++++++++++++++++
>> northd/northd.h | 18 ++---
>> ovn-sb.ovsschema | 6 +-
>> tests/ovn-northd.at | 39 +++++++++++
>> tests/ovn.at | 4 +-
>> tests/system-ovn.at | 77 +++++++++++++++++++++
>> 10 files changed, 275 insertions(+), 36 deletions(-)
>>
>> diff --git a/Documentation/ref/ovn-logical-flows.7.rst b/Documentation/ref/ovn-logical-flows.7.rst
>> index ce4dd5355..2c13478a7 100644
>> --- a/Documentation/ref/ovn-logical-flows.7.rst
>> +++ b/Documentation/ref/ovn-logical-flows.7.rst
>> @@ -717,7 +717,7 @@ Ingress Table 19: Hairpin
>> - If logical switch has attached logical switch port of *vtep* type, then a
>> priority-1000 flow that matches on ``reg0[14]`` register bit for the
>> traffic
>> received from HW VTEP (ramp) ports. This traffic is passed to ingress
>> table
>> - :ref:`Destination Lookup <ls-in-32>`.
>> + :ref:`Destination Lookup <ls-in-33>`.
>>
>> - A priority-1 flow that hairpins traffic matched by non-default flows in
>> the
>> :ref:`Pre-Hairpin <ls-in-17>` table. Hairpinning is done at L2, Ethernet
>> @@ -978,7 +978,28 @@ refer to either the parent or child ports as applicable
>> to this logical switch.
>>
>> .. _ls-in-26:
>>
>> -Ingress Table 26: ARP/ND responder
>> +Ingress Table 26: ARP/ND Pre-Lookup
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +For logical switches with EVPN enabled (``dynamic-routing-vni`` is set),
>> +this table performs a pre-lookup in the EVPN ARP side table using the
>> +``chk_evpn_arp()`` action. If the target IP address matches an
>> +EVPN-learned entry, the resolved MAC is loaded into ``eth.dst``
>> +and a regbit is set so that the ARP/ND responder table can generate a
>> +proxy reply.
>> +
>> +- Priority-5 flows match broadcast ARP requests
>> + (``arp.op == 1 && eth.bcast``) and multicast ND
>> + solicitations (``nd_ns_mcast``), and call ``chk_evpn_arp(arp.tpa)``
>> + or ``chk_evpn_arp(nd.target)`` respectively.
>> +
>> +- A priority-0 fallback flow advances to the next table.
>> +
>> +For switches without EVPN, only the priority-0 fallback flow is present.
>> +
>> +.. _ls-in-27:
>> +
>> +Ingress Table 27: ARP/ND responder
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> This table implements ARP/ND responder in a logical switch for known IPs.
>> The
>> @@ -1208,12 +1229,23 @@ proxy ARP/ND behavior. It contains these logical
>> flows:
>> These flows are required to respond to an ARP request if an ARP request is
>> sent for the IP *vip*.
>>
>> +- For logical switches with EVPN enabled, priority-40 flows provide ARP/ND
>> + suppression for EVPN-learned addresses. These flows match when the EVPN
>> + ARP pre-lookup (table 26) found a hit (``reg9[5] == 1``):
>> +
>> + - An ARP suppression flow matches ``arp.op == 1 && reg9[5] == 1`` and
>> + generates an ARP reply using the MAC from ``eth.dst`` (loaded by
>> + ``chk_evpn_arp()`` in the pre-lookup stage).
>> +
>> + - An ND suppression flow matches ``nd_ns && reg9[5] == 1`` and
>> + generates an ND NA reply using the MAC from ``eth.dst``.
>> +
>> - One priority-0 fallback flow that matches all packets and advances to the
>> next
>> table.
>>
>> -.. _ls-in-27:
>> +.. _ls-in-28:
>>
>> -Ingress Table 27: DHCP option processing
>> +Ingress Table 28: DHCP option processing
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> This table adds the DHCPv4 options to a DHCPv4 packet from the logical ports
>> @@ -1246,9 +1278,9 @@ options. This table also adds flows for the logical
>> ports of type ``external``.
>>
>> - A priority-0 flow that matches all packets to advances to table 16.
>>
>> -.. _ls-in-28:
>> +.. _ls-in-29:
>>
>> -Ingress Table 28: DHCP responses
>> +Ingress Table 29: DHCP responses
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> This table implements DHCP responder for the DHCP replies generated by the
>> @@ -1301,9 +1333,9 @@ previous table.
>>
>> - A priority-0 flow that matches all packets to advances to table 17.
>>
>> -.. _ls-in-29:
>> +.. _ls-in-30:
>>
>> -Ingress Table 29 DNS Lookup
>> +Ingress Table 30 DNS Lookup
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> This table looks up and resolves the DNS names to the corresponding
>> configured
>> @@ -1321,9 +1353,9 @@ IP address(es).
>> other kinds of packets, it just stores 0 into reg0[4]. Either way, it
>> continues to the next table.
>>
>> -.. _ls-in-30:
>> +.. _ls-in-31:
>>
>> -Ingress Table 30 DNS Responses
>> +Ingress Table 31 DNS Responses
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> This table implements DNS responder for the DNS replies generated by the
>> @@ -1346,9 +1378,9 @@ previous table.
>> (This terminates ingress packet processing; the packet does not go to the
>> next
>> ingress table.)
>>
>> -.. _ls-in-31:
>> +.. _ls-in-32:
>>
>> -Ingress table 31 External ports
>> +Ingress table 32 External ports
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> Traffic from the ``external`` logical ports enter the ingress datapath
>> pipeline
>> @@ -1373,9 +1405,9 @@ traffic from these ports.
>>
>> - A priority-0 flow that matches all packets to advances to table 20.
>>
>> -.. _ls-in-32:
>> +.. _ls-in-33:
>>
>> -Ingress Table 32 Destination Lookup
>> +Ingress Table 33 Destination Lookup
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> This table implements switching behavior. It contains these logical flows:
>> @@ -1532,9 +1564,9 @@ This table implements switching behavior. It contains
>> these logical flows:
>> If there is no entry for ``eth.dst`` in the MAC learning table, then it
>> stores
>> ``none`` in the ``outport``.
>>
>> -.. _ls-in-33:
>> +.. _ls-in-34:
>>
>> -Ingress Table 33 Destination unknown
>> +Ingress Table 34 Destination unknown
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> This table handles the packets whose destination was not found or and
>> looked up
>> @@ -1695,12 +1727,12 @@ In addition, the following flows are added.
>>
>> - A priority 34000 logical flow is added for each logical port which has
>> DHCPv4
>> options defined to allow the DHCPv4 reply packet and which has DHCPv6
>> options
>> - defined to allow the DHCPv6 reply packet from :ref:`Ingress Table 28: DHCP
>> - responses <ls-in-28>`. This is indicated by setting the allow bit.
>> + defined to allow the DHCPv6 reply packet from :ref:`Ingress Table 29: DHCP
>> + responses <ls-in-29>`. This is indicated by setting the allow bit.
>>
>> - A priority 34000 logical flow is added for each logical switch datapath
>> configured with DNS records with the match ``udp.dst = 53`` to allow the
>> DNS
>> - reply packet from :ref:`Ingress Table 30: DNS responses <ls-in-30>`. This
>> is
>> + reply packet from :ref:`Ingress Table 31: DNS responses <ls-in-31>`. This
>> is
>> indicated by setting the allow bit.
>>
>> - A priority 34000 logical flow is added for each logical switch datapath
>> with
>> @@ -1843,7 +1875,7 @@ in ``ct_label.nf_id`` during request processing.
>> function group, a priority-99 flow matches ``reg8[21] == 1 && reg8[22] ==
>> 1 &&
>> reg0[22..29] == id`` and sets ``outport=P; reg8[23] = 1;
>> next(pipeline=ingress, table=T)`` where *P* is the ``outport`` of that
>> network
>> - function and *T* is the ingress table :ref:`Destination Lookup
>> <ls-in-32>`.
>> + function and *T* is the ingress table :ref:`Destination Lookup
>> <ls-in-33>`.
>> This redirects request packets matching ``to-lport`` ACLs with
>> network_function_group to the specific network function selected by the
>> Pre
>> Network Function stage. The packets are injected back to the ingress
>> pipeline
>> diff --git a/NEWS b/NEWS
>> index 748ae30eb..d426d671a 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -15,6 +15,12 @@ Post v26.03.0
>> * Add ECMP/multi-homing support for EVPN FDB entries. FDB entries
>> backed by a kernel nexthop group are load-balanced via OpenFlow
>> select groups with weighted buckets.
>> + * Add EVPN ARP/ND suppression for logical switches. When a
>> + broadcast ARP request or multicast ND solicitation targets
>> + an IP address that was learned via EVPN, ovn-northd now
>> + generates proxy-reply flows using a dedicated side table
>> + and the new chk_evpn_arp() action, preventing unnecessary
>> + flooding to remote VTEPs.
>> - Added "override-connected" option to Logical Router Static Routes to
>> mark
>> static routes as higher-priority than connected routes, which in turn
>> led
>> to changes in administrative distance for specific route types. Please
>> see
>> diff --git a/lib/ovn-util.c b/lib/ovn-util.c
>> index cc5431a11..90ab27fc6 100644
>> --- a/lib/ovn-util.c
>> +++ b/lib/ovn-util.c
>> @@ -1007,8 +1007,8 @@ ip_address_and_port_from_lb_key(const char *key, char **ip_address,
>> *
>> * NOTE: If OVN_NORTHD_PIPELINE_CSUM is updated make sure to double check
>> * whether an update of OVN_INTERNAL_MINOR_VER is required. */
>> -#define OVN_NORTHD_PIPELINE_CSUM "951247664 11305"
>> -#define OVN_INTERNAL_MINOR_VER 14
>> +#define OVN_NORTHD_PIPELINE_CSUM "3951531131 11381"
>> +#define OVN_INTERNAL_MINOR_VER 15
>>
>> /* Returns the OVN version. The caller must free the returned value. */
>> char *
>> diff --git a/lib/ovn-util.h b/lib/ovn-util.h
>> index bfca178e4..4d1761dc4 100644
>> --- a/lib/ovn-util.h
>> +++ b/lib/ovn-util.h
>> @@ -340,7 +340,7 @@ BUILD_ASSERT_DECL(
>> #define SCTP_ABORT_CHUNK_FLAG_T (1 << 0)
>>
>> /* The number of tables for the ingress and egress pipelines. */
>> -#define LOG_PIPELINE_INGRESS_LEN 34
>> +#define LOG_PIPELINE_INGRESS_LEN 35
>> #define LOG_PIPELINE_EGRESS_LEN 16
>>
>> static inline uint32_t
>> diff --git a/northd/northd.c b/northd/northd.c
>> index f5aa5cca3..a5534e89c 100644
>> --- a/northd/northd.c
>> +++ b/northd/northd.c
>> @@ -204,6 +204,7 @@ BUILD_ASSERT_DECL(ACL_OBS_STAGE_MAX < (1 << 2));
>> #define REGBIT_LOOKUP_NEIGHBOR_RESULT "reg9[2]"
>> #define REGBIT_LOOKUP_NEIGHBOR_IP_RESULT "reg9[3]"
>> #define REGBIT_DST_NAT_IP_LOCAL "reg9[4]"
>> +#define REGBIT_EVPN_LOOKUP_MAC "reg9[5]"
>> #define REGBIT_KNOWN_LB_SESSION "reg9[6]"
>> #define REGBIT_DHCP_RELAY_REQ_CHK "reg9[7]"
>> #define REGBIT_DHCP_RELAY_RESP_CHK "reg9[8]"
>> @@ -10754,6 +10755,85 @@ build_lswitch_arp_nd_responder_default(struct ovn_datapath *od,
>> lflow_ref);
>> }
>>
>> +/* Ingress table ls_in_arp_nd_pre_lookup: EVPN ARP/ND pre-lookup.
>> + *
>> + * For EVPN-enabled switches, calls chk_evpn_arp() to look up the
>> + * IP in the EVPN ARP side table. On a hit, the resolved MAC is
>> + * stored in eth.dst and REGBIT_EVPN_LOOKUP_MAC is set. The
>> + * response flow in ls_in_arp_rsp reads the MAC from eth.dst.
>> + */
>> +static void
>> +build_lswitch_arp_nd_evpn_lookup(struct ovn_datapath *od,
>> + struct lflow_table *lflows,
>> + struct lflow_ref *lflow_ref)
>> +{
>> + ovs_assert(od->nbs);
>> +
>> + /* Default: pass through. */
>> + ovn_lflow_add(lflows, od, S_SWITCH_IN_ARP_ND_PRE_LOOKUP, 0, "1",
>> + "next;", lflow_ref);
>> +
>> + if (!od->has_evpn_vni) {
>> + return;
>> + }
>> +
>> + /* IPv4: broadcast ARP requests only. */
>> + ovn_lflow_add(lflows, od, S_SWITCH_IN_ARP_ND_PRE_LOOKUP, 5,
>> + "arp.op == 1 && eth.bcast",
Actually, I think we should also match on "from_evpn_vtep == 0" here to
avoid replying to GARP requests coming from the fabric.
>> + REGBIT_EVPN_LOOKUP_MAC " = chk_evpn_arp(arp.tpa); next;",
>> + lflow_ref);
>> +
>> + /* IPv6: multicast ND solicitations only. */
>> + ovn_lflow_add(lflows, od, S_SWITCH_IN_ARP_ND_PRE_LOOKUP, 5,
>> + "nd_ns_mcast",
>> + REGBIT_EVPN_LOOKUP_MAC " = chk_evpn_arp(nd.target); next;",
>> + lflow_ref);
Same here, this flow should also match on "from_evpn_vtep == 0".
>> +}
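
To make the two comments above concrete, here is a rough, untested
sketch of how the extra condition could be appended to both matches.
The field name "from_evpn_vtep" is taken from my comment above and I
have not checked it against the tree; the macro and the helper
evpn_pre_lookup_match() are hypothetical names used only for
illustration and are not part of the patch.

```c
#include <stdio.h>

/* Assumption: field name and value are copied from the review comment
 * above and are not verified against the tree. */
#define EVPN_NOT_FROM_FABRIC "from_evpn_vtep == 0"

/* Hypothetical helper, not part of the patch.  Writes
 * "<base> && from_evpn_vtep == 0" into 'buf'.  Returns 0 on success,
 * -1 if the result does not fit in 'len' bytes. */
static int
evpn_pre_lookup_match(char *buf, size_t len, const char *base)
{
    int n = snprintf(buf, len, "%s && %s", base, EVPN_NOT_FROM_FABRIC);

    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

With something like this, the two priority-5 matches would read
"arp.op == 1 && eth.bcast && from_evpn_vtep == 0" and
"nd_ns_mcast && from_evpn_vtep == 0". Writing the strings out directly
in the two ovn_lflow_add() calls would of course work just as well.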
>> +