On 17 Jun 2026, at 12:25, Ilya Maximets wrote:
> TCA_ACT_MAX_NUM limits the actions number from the OVS perspective,
> i.e., the size of the actions array in tc_flower structure. However,
> these actions can be represented by multiple actual TC actions in the
> netlink message. For example, output to internal ports requires extra
> "skbedit" action to change the packet type before sending it to the
> "mirred" action. We also emit extra "csum" actions and put tunnel
> attributes, etc.
>
> In the end, the number of actual TC actions is frequently larger
> than the number of actions in tc_flower structure. Meaning that the
> number of priorities in use can be larger than TCA_ACT_MAX_NUM.
>
> If that happens, we fail to parse the full ECHO reply for the netlink
> transaction and report a mismatch between requested and acknowledged
> configuration. The traffic should work fine, but we have the annoying
> warning in the logs and the revalidators will try updating that one
> datapath flow triggering more warnings.
>
> We need to always expect the full range of priorities while parsing
> kernel replies. And we need to check that the number of used
> priorities doesn't actually exceed the TCA_ACT_MAX_PRIO before sending
> requests to the kernel, otherwise they can be "silently" truncated in
> the absence of the kernel commit:
> 5f5f92912b43 ("tc: Return an error if filters try to attach too many
> actions")
>
> Also adding some comments around the indexing, as it is confusing
> that priorities for TC flower actions are counted from 1, but the
> action table indexes for the police action are starting from zero.
> Both using the same constant for the upper bound, but it is inclusive
> in one case and not in the other.
>
> Notes:
>
> Technically, there should be no need for the artificial limit imposed
> by TCA_ACT_MAX_NUM, since the issue it was trying to work around was
> fixed in the kernel commit:
> 369609fc6272 ("tc: Ensure we have enough buffer space when sending filter
> netlink notifications")
> But that fix is not available in 5.x kernels at the moment.
>
> TCA_ACT_MAX_NUM by itself is a wrong concept. It supposed to limit
> the number of actual TC actions, but it is applied in two different
> places limiting tc_flower actions array instead, while there is
> not a 1 to 1 mapping between them and the actual TC actions generated.
> It sort of works for the most part, but fails when the mapping is
> significantly imbalanced or the number of actions is on the edge of
> the message size. To be fair, it was not meant as a precise limit.
> And we may actually want some limit at offload-tc-netdev level for
> performance reasons. But the concept is flawed.
>
> Fixing the logic around TCA_ACT_MAX_NUM or removing it outright is
> a little orthogonal to this patch though, as we'd still need both the
> TCA_ACT_MAX_PRIO check to avoid sending more than 32 actions to the
> kernel, and the full range of parsing for actions in replies.
>
> Fixes: d0fbb09f907b ("tc: Limit the max action number to 16")
> Fixes: 0c70132cd288 ("tc: Make the actions order consistent")
> Reported-at: https://redhat.atlassian.net/browse/FDP-4005
> Signed-off-by: Ilya Maximets <[email protected]>
Thanks for the quick fix, the change looks good to me. One small nit
about not using the ovs-p1 naming convention, see below.
Acked-by: Eelco Chaudron <[email protected]>
> ---
> lib/tc.c | 9 +++++--
> tests/system-offloads-traffic.at | 43 ++++++++++++++++++++++++++++++++
> 2 files changed, 50 insertions(+), 2 deletions(-)
>
> diff --git a/lib/tc.c b/lib/tc.c
> index 4a9c6c267..1ae026b0e 100644
> --- a/lib/tc.c
> +++ b/lib/tc.c
> @@ -2133,7 +2133,8 @@ nl_parse_flower_actions(struct nlattr **attrs, struct
> tc_flower *flower,
> bool terse)
> {
> const struct nlattr *actions = attrs[TCA_FLOWER_ACT];
> - static struct nl_policy actions_orders_policy[TCA_ACT_MAX_NUM + 1] = {};
> + /* Priorities of flower actions start at one: [1, TCA_ACT_MAX_PRIO]. */
> + static struct nl_policy actions_orders_policy[TCA_ACT_MAX_PRIO + 1] = {};
> struct nlattr *actions_orders[ARRAY_SIZE(actions_orders_policy)];
> const int max_size = ARRAY_SIZE(actions_orders_policy);
> int previous_action_count = 0;
> @@ -2402,6 +2403,7 @@ tc_dump_tc_action_start(char *name, struct nl_dump
> *dump)
> int
> parse_netlink_to_tc_policer(struct ofpbuf *reply, uint32_t police_idx[])
> {
> + /* Action tables are indexed from zero: [0, TCA_ACT_MAX_PRIO - 1]. */
> static struct nl_policy actions_orders_policy[TCA_ACT_MAX_PRIO] = {};
> struct ofpbuf b = ofpbuf_const_initializer(reply->data, reply->size);
> struct nlattr *actions_orders[ARRAY_SIZE(actions_orders_policy)];
> @@ -3540,7 +3542,10 @@ nl_msg_put_flower_acts(struct ofpbuf *request, struct
> tc_flower *flower)
> }
> nl_msg_end_nested(request, offset);
>
> - return 0;
> + /* Some tc_actions generate more than one actual TC action. Make sure
> + * we're within the kernel's limit. 'act_index - 1' is the last used
> + * priority / index. */
> + return act_index - 1 <= TCA_ACT_MAX_PRIO ? 0 : -EOPNOTSUPP;
> }
>
> static void
> diff --git a/tests/system-offloads-traffic.at
> b/tests/system-offloads-traffic.at
> index 13a8054a4..55c26a0ce 100644
> --- a/tests/system-offloads-traffic.at
> +++ b/tests/system-offloads-traffic.at
> @@ -1245,3 +1245,46 @@ AT_CHECK([ovs-appctl --format json --pretty
> dpif/offload/show \
>
> OVS_TRAFFIC_VSWITCHD_STOP
> AT_CLEANUP
> +
> +AT_SETUP([offloads - many output ports - actions limit])
> +OVS_TRAFFIC_VSWITCHD_START([], [],
> + [-- set Open_vSwitch . other_config:hw-offload=true])
> +
> +AT_CHECK([ovs-appctl vlog/set tc:file:dbg dpif_netlink:file:dbg
> vconn:file:info])
> +
> +m4_for([id], [1], [17], [1], [
> + ADD_NAMESPACES(atns-id)
> + ADD_VETH(p-id, atns-id, br0, "10.1.1.id/24")
> +])
> +
> +AT_CHECK([ovs-ofctl add-flow br0 "arp,actions=NORMAL"])
> +dnl Forward all non-matching IP traffic towards ovs-p-1 (ICMP replies).
> +AT_CHECK([ovs-ofctl add-flow br0 "priority=0,ip,actions=output:ovs-p-1"])
All other tests use the ovs-p1 naming convention, it would be nice to
do the same here (not a blocker). The m4 empty-quote trick makes the
macros look a little odd, but it works correctly. Here is an example
diff:
diff --git a/tests/system-offloads-traffic.at b/tests/system-offloads-traffic.at
index 55c26a0ce..d3abe12df 100644
--- a/tests/system-offloads-traffic.at
+++ b/tests/system-offloads-traffic.at
@@ -1253,18 +1253,18 @@ OVS_TRAFFIC_VSWITCHD_START([], [],
AT_CHECK([ovs-appctl vlog/set tc:file:dbg dpif_netlink:file:dbg
vconn:file:info])
m4_for([id], [1], [17], [1], [
- ADD_NAMESPACES(atns-id)
- ADD_VETH(p-id, atns-id, br0, "10.1.1.id/24")
+ ADD_NAMESPACES(atns[]id)
+ ADD_VETH(p[]id, atns[]id, br0, "10.1.1.id/24")
])
AT_CHECK([ovs-ofctl add-flow br0 "arp,actions=NORMAL"])
-dnl Forward all non-matching IP traffic towards ovs-p-1 (ICMP replies).
-AT_CHECK([ovs-ofctl add-flow br0 "priority=0,ip,actions=output:ovs-p-1"])
+dnl Forward all non-matching IP traffic towards ovs-p1 (ICMP replies).
+AT_CHECK([ovs-ofctl add-flow br0 "priority=0,ip,actions=output:ovs-p1"])
m4_define([OUTPUT_ACTIONS],
- [m4_for([n], [2], [$1], [1], [,output:ovs-p-n])])
+ [m4_for([n], [2], [$1], [1], [,output:ovs-p[]n])])
-m4_define([DP_FLOW], ["in_port(ovs-p-1),eth(),eth_type(0x0800)"])
+m4_define([DP_FLOW], ["in_port(ovs-p1),eth(),eth_type(0x0800)"])
dnl Gradually checking broadcast to more and more ports. The last one should
dnl hit the actions limit and fall back to kernel datapath. Actions include
@@ -1272,9 +1272,9 @@ dnl LOCAL output to check the mismatch between the number
of OVS and TC flower
dnl actions (output to internal port requires additional skbedit).
m4_for([id], [2], [17], [1], [
AT_CHECK([ovs-ofctl add-flow br0 \
- "priority=id,in_port=ovs-p-1,ip,actions=LOCAL[]OUTPUT_ACTIONS(id)"])
+ "priority=id,in_port=ovs-p1,ip,actions=LOCAL[]OUTPUT_ACTIONS(id)"])
AT_CHECK([ovs-appctl revalidator/wait])
- NS_CHECK_EXEC([atns-1], [ping -q -c 3 -i 0.1 -W 2 10.1.1.id | FORMAT_PING],
[0], [dnl
+ NS_CHECK_EXEC([atns1], [ping -q -c 3 -i 0.1 -W 2 10.1.1.id | FORMAT_PING],
[0], [dnl
3 packets transmitted, 3 received, 0% packet loss, time 0ms
])
m4_if(id, [17], [
> +
> +m4_define([OUTPUT_ACTIONS],
> + [m4_for([n], [2], [$1], [1], [,output:ovs-p-n])])
> +
> +m4_define([DP_FLOW], ["in_port(ovs-p-1),eth(),eth_type(0x0800)"])
> +
> +dnl Gradually checking broadcast to more and more ports. The last one should
> +dnl hit the actions limit and fall back to kernel datapath. Actions include
> +dnl LOCAL output to check the mismatch between the number of OVS and TC
> flower
> +dnl actions (output to internal port requires additional skbedit).
> +m4_for([id], [2], [17], [1], [
> + AT_CHECK([ovs-ofctl add-flow br0 \
> + "priority=id,in_port=ovs-p-1,ip,actions=LOCAL[]OUTPUT_ACTIONS(id)"])
> + AT_CHECK([ovs-appctl revalidator/wait])
> + NS_CHECK_EXEC([atns-1], [ping -q -c 3 -i 0.1 -W 2 10.1.1.id |
> FORMAT_PING], [0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> + m4_if(id, [17], [
> + AT_CHECK([ovs-appctl dpctl/dump-flows --names type=tc | grep DP_FLOW],
> [1])
> + AT_CHECK([ovs-appctl dpctl/dump-flows --names type=ovs | grep -q
> DP_FLOW])
> + ], [
> + AT_CHECK([ovs-appctl dpctl/dump-flows --names type=tc | grep -q
> DP_FLOW])
> + AT_CHECK([ovs-appctl dpctl/dump-flows --names type=ovs | grep DP_FLOW],
> [1])
> + ])
> +])
> +
> +OVS_TRAFFIC_VSWITCHD_STOP
> +AT_CLEANUP
> --
> 2.54.0
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev