On Mon, Nov 3, 2025 at 2:12 PM Numan Siddique <[email protected]> wrote:
>
> On Mon, Nov 3, 2025 at 2:02 PM Ilya Maximets <[email protected]> wrote:
> >
> > On 11/3/25 7:29 PM, Numan Siddique wrote:
> > > On Mon, Nov 3, 2025 at 5:55 AM Ilya Maximets <[email protected]> wrote:
> > >>
> > >> On 11/1/25 7:23 AM, Numan Siddique wrote:
> > >>> Hello OVS folks,
> > >>>
> > >>> In our deployments we are seeing a lot of datapath flow offload issues
> > >>> with tc resulting in packets getting handled in the host and packet
> > >>> drops.
> > >>>
> > >>> We recently observed such an issue and only a restart of ovs-vswitchd
> > >>> fixed it.
> > >>>
> > >>> I debugged a bit and found that all the datapath flows offloaded by
> > >>> ovs-vswitchd to tc fail if the recirculation id is greater than
> > >>> 268,435,455 (which is 0x0fffffff).
> > >>>
> > >>> We see the below error messages:
> > >>>
> > >>> --------------------------------------------------
> > >>> 2025-11-01T03:12:18.415Z|93221|netlink_socket(handler53)|DBG|nl_sock_recv__
> > >>> (Success): nl(len:692, type=2(error), flags=200[MATCH], seq=7af,
> > >>> pid=3613179965 error(-22(Invalid argument), in-reply-to(nl(len:624,
> > >>> type=44(family-defined), flags=409[REQUEST][ECHO][ATOMIC], seq=7af,
> > >>> pid=3613179965))
> > >>> 2025-11-01T03:12:18.415Z|93222|netlink_socket(handler53)|DBG|received
> > >>> NAK error=22 - Specified chain index exceeds upper limit
> > >>> 2025-11-01T03:12:18.415Z|93223|dpif_netlink(handler53)|ERR|failed to
> > >>> offload flow: Invalid argument: ovn-f3902a-0
> > >>> 2025-11-01T03:12:18.415Z|93224|dpif_netlink(handler53)|DBG|system@ovs-system:
> > >>> put[create] ufid:e287c507-e111-44be-90dd-469c242cb873
> > >>> recirc_id(0x2660dc6d),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x915,src=10.32.35.9,dst=10.32.5.25,ttl=59/0,tp_src=34744/0,tp_dst=6081/0,geneve({class=0x102/0,type=0x80/0,len=4/0,0x79a041a/0}),flags(-df+csum+key)),in_port(6),skb_mark(0/0),ct_state(0x21/0x3f),ct_zone(0x17f5/0),ct_mark(0/0x1),ct_label(0/0),ct_tuple4(src=172.27.61.139/0.0.0.0,dst=172.27.58.113/0.0.0.0,proto=6/0,tp_src=49588/0,tp_dst=4240/0),eth(src=be:28:87:5d:2e:28,dst=fe:6c:ee:aa:33:be),eth_type(0x0800),ipv4(src=172.27.61.139,dst=172.27.58.113,proto=6,tos=0/0,ttl=64/0,frag=no),tcp(src=49588/0x8000,dst=4240/0xf800),tcp_flags(0/0),
> > >>> actions:ct(commit,zone=6133,mark=0/0x1,nat(src)),20
> > >>> -------------------------------------------------------------------
> > >>>
> > >>> I was able to reproduce the issue locally with OVS main and Fedora
> > >>> kernel 6.16.10-200.fc42.  I had to hack the code though.
> > >>>
> > >>> ----
> > >>> diff --git a/ofproto/ofproto-dpif-rid.c b/ofproto/ofproto-dpif-rid.c
> > >>> index f01468025..1d577d73b 100644
> > >>> --- a/ofproto/ofproto-dpif-rid.c
> > >>> +++ b/ofproto/ofproto-dpif-rid.c
> > >>> @@ -34,8 +34,7 @@ static struct ovs_list expiring OVS_GUARDED_BY(mutex)
> > >>>  static struct ovs_list expired OVS_GUARDED_BY(mutex)
> > >>>      = OVS_LIST_INITIALIZER(&expired);
> > >>>
> > >>> -static uint32_t next_id OVS_GUARDED_BY(mutex) = 1; /* Possible next free id. */
> > >>> -
> > >>> +static uint32_t next_id OVS_GUARDED_BY(mutex) = 0x0fffffff; /* Possible next free id. */
> > >>>  #define RECIRC_POOL_STATIC_IDS 1024
> > >>>
> > >>>  static void recirc_id_node_free(struct recirc_id_node *);
> > >>> -----
> > >>>
> > >>> Looks like the kernel expects the tc flower chain id to be encoded within
> > >>> the first 28 bits [1], whereas ovs-vswitchd uses the value of
> > >>> recirc_id as the chain id, and if the recirc_id overflows 28 bits, the
> > >>> issue is seen.
> > >>>
> > >>> Is my analysis correct?  I'm not too familiar with the classifier and
> > >>> the offload code base.  Hope the experts can take a look at it.
> > >>
> > >> Hi, Numan.  Yes, your analysis seems correct.  The GOTO_CHAIN action
> > >> is an "extended action", where the top 4 bits are reserved for the action
> > >> type and the remaining 28 bits carry a value:
> > >>   
> > >> https://elixir.bootlin.com/linux/v6.17.6/source/tools/include/uapi/linux/pkt_cls.h#L50-L64
> > >>
> > >> This means we can't offload recirculations to chain ids above 28 bits.
> > >>
> > >> There are two things here that need fixing:
> > >>
> > >> 1. OVS doesn't seem to check that chain id fits into the action, blindly
> > >>    ORing it in.  That should be fixed, so we are not trying to send such
> > >>    flows into kernel in the first place.
> > >>
> > >> 2. Somehow limit the recirculation id space to 28 bits when the HW
> > >>    offload is enabled.  I don't like this, as we'll be just adding yet
> > >>    another hack for HW offload to work, but I'm not sure what would be
> > >>    a different solution here.  Note: id-pool would solve the problem
> > >>    by allocating densely packed IDs, but that may cause collisions as
> > >>    the whole process of retiring old IDs is a bit racy and we rely on
> > >>    time to guess when we can actually stop using them.  Needs more
> > >>    investigation.
> > >>
> > >> Best regards, Ilya Maximets.

I'll take a look into the fix based on your suggestions.  If you or
someone else has already started looking into it, please let me know.
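
For reference, the check from the first suggestion (make sure the chain id
fits into the action before ORing it in) could look roughly like the sketch
below.  This is only an illustration under stated assumptions, not the actual
OVS or kernel code: the EX_* macros are local stand-ins for the 4-bit opcode /
28-bit value layout described in linux/pkt_cls.h, and the ex_* helper names
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins (assumption) for the extended-action layout:
 * top 4 bits carry the opcode, low 28 bits carry the value. */
#define EX_ACT_EXT_SHIFT    28
#define EX_ACT_EXT_VAL_MASK ((UINT32_C(1) << EX_ACT_EXT_SHIFT) - 1)
#define EX_ACT_GOTO_CHAIN   (UINT32_C(2) << EX_ACT_EXT_SHIFT)

/* Returns true if 'chain' can be carried in the 28-bit value field. */
static bool
ex_chain_fits(uint32_t chain)
{
    return (chain & ~EX_ACT_EXT_VAL_MASK) == 0;
}

/* Encodes a goto-chain action into '*action'.  Returns false and leaves
 * '*action' untouched if the chain id does not fit, so that the caller
 * can skip the offload instead of sending the flow to the kernel. */
static bool
ex_encode_goto_chain(uint32_t chain, uint32_t *action)
{
    if (!ex_chain_fits(chain)) {
        return false;
    }
    *action = EX_ACT_GOTO_CHAIN | chain;
    return true;
}

/* Self-check using the two recirc ids mentioned in this thread. */
static void
ex_self_check(void)
{
    uint32_t action = 0;

    /* 0x0fffffff is the largest id that still fits. */
    assert(ex_encode_goto_chain(UINT32_C(0x0fffffff), &action));
    assert(action == UINT32_C(0x2fffffff));

    /* 0x2660dc6d (from the failing flow) needs more than 28 bits. */
    assert(!ex_encode_goto_chain(UINT32_C(0x2660dc6d), &action));
}
```

With a check like this, a flow whose recirc_id is above 0x0fffffff would be
rejected in userspace rather than failing with EINVAL from the kernel.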

Numan

> > >
> > >
> > > Thanks for the reply, Ilya.
> > >
> > > In one of our deployments, which uses OVS 3.2.0, we see the below logs
> > > and packet drops to the VM:
> > >
> > > ---
> > >
> > > 2025-10-30T04:26:28.474Z|78613|tc(handler25)|WARN|Kernel flower
> > > acknowledgment does not match request!  Set dpif_netlink to dbg to see
> > > which rule caused this error.
> > >
> > > 2025-10-30T04:26:29.113Z|78614|tc(handler25)|WARN|Kernel flower
> > > acknowledgment does not match request!  Set dpif_netlink to dbg to see
> > > which rule caused this error.
> > > ------
> > >
> > > Any pointers on why we are seeing the above WARN message?  OVS 3.2.0
> > > is missing the below backport:
> > > https://github.com/openvswitch/ovs/commit/1857c569ee9a6432ac46d31a31f882402c215437
> > > Could it be because of this?
> >
> > It's likely.  To confirm, you'll need to enable debug logs for the tc module,
> > and maybe also enable dbg for dpif_netlink, as the logs suggest.
> >
> > > Do we need to move to at least 3.2.2 for successful offloads?
> >
> > If the issue above is indeed your issue, then the update will remove the
> > warning.  However, this flow will not be offloaded, as it requires
> > modification of the tp_src of the outer tunnel header, which TC doesn't
> > support.  Such flows should not be common though, so I'm not sure if you
> > actually need them offloaded.  It depends on the setup.  But also, you need
> > to confirm that it is your issue first.
>
> Got it.  Thanks.
>
> >
> > FWIW, 3.2.0 is very old and is missing a lot of fixes, so I'd suggest
> > updating it anyway.  Also, 3.2 is EoL, so going to at least 3.3 is
> > recommended.
>
> Ack.
>
>
> >
> > > The kernel version is - 5.14.0-162.6.1.el9
> > >
> > >
> > > In the below datapath flow dump, we see that there is a flow for the
> > > first packet and the final action of this dp flow is -
> > > recirc(0x94691).
> > >
> > > ------------
> > > recirc_id(0),in_port(18),ct_state(-new-est-rpl-trk),ct_mark(0/0x2),eth(src=b0:cf:0e:b1:5f:ff,dst=5e:8e:4a:f0:44:25),eth_type(0x8100),vlan(vid=120,pcp=0),encap(eth_type(0x0800),ipv4(src=192.0.0.0/224.0.0.0,dst=160.211.64.157,proto=1,ttl=47,frag=no)),
> > > packets:2217, bytes:186228, used:0.330s,
> > > actions:pop_vlan,ct(zone=24,nat),recirc(0x94691)
> > > recirc_id(0),in_port(18),ct_state(-new-est-rpl-trk),ct_mark(0/0x2),eth(src=b0:cf:0e:b1:5f:ff,dst=5e:8e:4a:f0:44:25),eth_type(0x8100),vlan(vid=120,pcp=0),encap(eth_type(0x0800),ipv4(src=96.0.0.0/252.0.0.0,dst=160.211.64.157,proto=1,ttl=44,frag=no)),
> > > packets:1005, bytes:84420, used:0.850s,
> > > actions:pop_vlan,ct(zone=24,nat),recirc(0x94691)
> > > ----------
> > >
> > > But in the dp flows, we never found a flow with recirc_id(0x94691).
> > > After a few minutes,  we took the dump of dp flows and we noticed that
> > > there was a flow matching recirc(0x94691), but it was totally
> > > unrelated to the packet in question.
> > >
> > >
> > > We also saw the below message in the ovs logs.
> > >
> > > -----------
> > > 2025-10-30T02:54:42.331Z|41074|ofproto_dpif_upcall(handler25)|INFO|received
> > > packet on unassociated datapath port 18 (no recirculation data for
> > > recirc_id 0x94691)
> > >
> > > 2025-10-30T03:15:42.380Z|43176|ofproto_dpif_upcall(handler25)|INFO|received
> > > packet on unassociated datapath port 18 (no recirculation data for
> > > recirc_id 0x94691)
> > > ----------
> > >
> > > IMO the packet drops were due to the missing dp flow for the
> > > recirc_id 0x94691.
> > >
> > > Do you have any pointers on what could be going wrong?
> >
> > Was OVS recently restarted when this was observed?
>
> No it was not restarted.
>
> > That may explain the missing records for the recirculation ID in
> > userspace.  But otherwise it's hard to guess what could've gone wrong here.
>
> Got it.  It's hard to guess as there are a lot of other factors.
>
> >
> > Since you also have the recirc_id overflow issue, it might be possible that
> > the ID got truncated somewhere and hence it's incorrect.
> >
>
> I see.  I'll try to dig further.
>
> Numan
>
> > >
> > > Thanks for your time
> > >
> > > Numan
> > >
> > >
> > >
> > >>
> > >>>
> > >>>
> > >>> [1] -  
> > >>> https://github.com/torvalds/linux/blob/v6.18-rc3/net/sched/cls_api.c#L3137
> > >>>
> > >>> Thanks
> > >>> Numan
> > >>
> >
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
