On 11/1/25 7:23 AM, Numan Siddique wrote:
> Hello OVS folks,
> 
> In our deployments we are seeing a lot of datapath flow offload issues
> with tc resulting in packets getting handled in the host and packet
> drops.
> 
> We recently observed such an issue and only a restart of ovs-vswitchd fixed it.
> 
> I debugged a bit and found that all the datapath flows offloaded by
> ovs-vswitchd to tc fail if the recirculation id is greater than
> 268,435,455 (which is 0x0fffffff).
> 
> We see the below error messages:
> 
> --------------------------------------------------
> 2025-11-01T03:12:18.415Z|93221|netlink_socket(handler53)|DBG|nl_sock_recv__
> (Success): nl(len:692, type=2(error), flags=200[MATCH], seq=7af,
> pid=3613179965 error(-22(Invalid argument), in-reply-to(nl(len:624,
> type=44(family-defined), flags=409[REQUEST][ECHO][ATOMIC], seq=7af,
> pid=3613179965))
> 2025-11-01T03:12:18.415Z|93222|netlink_socket(handler53)|DBG|received
> NAK error=22 - Specified chain index exceeds upper limit
> 2025-11-01T03:12:18.415Z|93223|dpif_netlink(handler53)|ERR|failed to
> offload flow: Invalid argument: ovn-f3902a-0
> 2025-11-01T03:12:18.415Z|93224|dpif_netlink(handler53)|DBG|system@ovs-system:
> put[create] ufid:e287c507-e111-44be-90dd-469c242cb873
> recirc_id(0x2660dc6d),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x915,src=10.32.35.9,dst=10.32.5.25,ttl=59/0,tp_src=34744/0,tp_dst=6081/0,geneve({class=0x102/0,type=0x80/0,len=4/0,0x79a041a/0}),flags(-df+csum+key)),in_port(6),skb_mark(0/0),ct_state(0x21/0x3f),ct_zone(0x17f5/0),ct_mark(0/0x1),ct_label(0/0),ct_tuple4(src=172.27.61.139/0.0.0.0,dst=172.27.58.113/0.0.0.0,proto=6/0,tp_src=49588/0,tp_dst=4240/0),eth(src=be:28:87:5d:2e:28,dst=fe:6c:ee:aa:33:be),eth_type(0x0800),ipv4(src=172.27.61.139,dst=172.27.58.113,proto=6,tos=0/0,ttl=64/0,frag=no),tcp(src=49588/0x8000,dst=4240/0xf800),tcp_flags(0/0),
> actions:ct(commit,zone=6133,mark=0/0x1,nat(src)),20
> -------------------------------------------------------------------
> 
> I was able to reproduce the issue locally with OVS main and Fedora
> kernel 6.16.10-200.fc42.  I had to hack the code though.
> 
> ----
> diff --git a/ofproto/ofproto-dpif-rid.c b/ofproto/ofproto-dpif-rid.c
> index f01468025..1d577d73b 100644
> --- a/ofproto/ofproto-dpif-rid.c
> +++ b/ofproto/ofproto-dpif-rid.c
> @@ -34,8 +34,7 @@ static struct ovs_list expiring OVS_GUARDED_BY(mutex)
>  static struct ovs_list expired OVS_GUARDED_BY(mutex)
>      = OVS_LIST_INITIALIZER(&expired);
> 
> -static uint32_t next_id OVS_GUARDED_BY(mutex) = 1; /* Possible next free id. */
> -
> +static uint32_t next_id OVS_GUARDED_BY(mutex) = 0x0fffffff; /* Possible next free id. */
>  #define RECIRC_POOL_STATIC_IDS 1024
> 
>  static void recirc_id_node_free(struct recirc_id_node *);
> -----
> 
> Looks like the kernel expects the tc flower chain id to fit within
> the low 28 bits [1], whereas ovs-vswitchd uses the value of
> recirc_id as the chain id, so the issue is seen once the recirc_id
> overflows 28 bits.
> 
> Is my analysis correct?  I'm not too familiar with the classifier and
> the offload code base.  Hope the experts can take a look at it.

Hi, Numan.  Yes, your analysis seems correct.  The GOTO_CHAIN action
is an "extended action", where the upper 4 bits are reserved for the
action type and the remaining 28 bits carry a value:
  
https://elixir.bootlin.com/linux/v6.17.6/source/tools/include/uapi/linux/pkt_cls.h#L50-L64

This means we can't offload recirculations to chains whose id doesn't
fit into 28 bits, i.e. anything above 0x0fffffff.
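
To illustrate the layout, here is a minimal sketch in C.  The SKETCH_*
macros and sketch_* helpers are hypothetical names that only mirror the
shift/mask scheme from the header linked above; they are not OVS or
kernel code.  It shows how the opcode and the value share one 32-bit
word and how to tell whether a chain id fits into the value field:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local, illustrative copies of the extended-action layout:
 * upper 4 bits hold the action type, lower 28 bits hold the value. */
#define SKETCH_ACT_EXT_SHIFT    28
#define SKETCH_ACT_EXT_VAL_MASK ((UINT32_C(1) << SKETCH_ACT_EXT_SHIFT) - 1)
#define SKETCH_ACT_GOTO_CHAIN   (UINT32_C(2) << SKETCH_ACT_EXT_SHIFT)

static_assert(SKETCH_ACT_EXT_VAL_MASK == 0x0fffffff,
              "value field is 28 bits wide");

/* Returns true if 'chain' fits into the 28-bit value field. */
static bool
sketch_chain_fits(uint32_t chain)
{
    return chain <= SKETCH_ACT_EXT_VAL_MASK;
}

/* Extracts the action type (still shifted into the upper 4 bits). */
static uint32_t
sketch_act_opcode(uint32_t action)
{
    return action & ~SKETCH_ACT_EXT_VAL_MASK;
}

/* Extracts the 28-bit value, e.g. the chain index for GOTO_CHAIN. */
static uint32_t
sketch_act_value(uint32_t action)
{
    return action & SKETCH_ACT_EXT_VAL_MASK;
}
```

For example, blindly ORing 0x1fffffff into the GOTO_CHAIN opcode gives
0x3fffffff, whose upper 4 bits no longer decode as GOTO_CHAIN.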

There are two things here that need fixing:

1. OVS doesn't seem to check that the chain id fits into the action,
   blindly ORing it in.  That should be fixed, so we are not trying to
   send such flows into the kernel in the first place.

2. Somehow limit the recirculation id space to 28 bits when HW
   offload is enabled.  I don't like this, as we'll be just adding yet
   another hack for HW offload to work, but I'm not sure what a
   different solution would be here.  Note: id-pool would solve the
   problem by allocating densely packed IDs, but that may cause
   collisions, as the whole process of retiring old IDs is a bit racy
   and we rely on time to guess when we can actually stop using them.
   Needs more investigation.

Best regards, Ilya Maximets.

> 
> 
> [1] -  
> https://github.com/torvalds/linux/blob/v6.18-rc3/net/sched/cls_api.c#L3137
> 
> Thanks
> Numan

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev