> -----Original Message-----
> From: Jakub Kicinski <[email protected]>
> Sent: Wednesday, January 14, 2026 9:55 PM
> To: Haiyang Zhang <[email protected]>
> Cc: Haiyang Zhang <[email protected]>; linux-
> [email protected]; [email protected]; KY Srinivasan
> <[email protected]>; Wei Liu <[email protected]>; Dexuan Cui
> <[email protected]>; Long Li <[email protected]>; Andrew Lunn
> <[email protected]>; David S. Miller <[email protected]>; Eric
> Dumazet <[email protected]>; Paolo Abeni <[email protected]>; Konstantin
> Taranov <[email protected]>; Simon Horman <[email protected]>; Erni
> Sri Satya Vennela <[email protected]>; Shradha Gupta
> <[email protected]>; Saurabh Sengar
> <[email protected]>; Aditya Garg
> <[email protected]>; Dipayaan Roy
> <[email protected]>; Shiraz Saleem
> <[email protected]>; [email protected]; linux-
> [email protected]; Paul Rosswurm <[email protected]>
> Subject: Re: [EXTERNAL] Re: [PATCH V2,net-next, 1/2] net: mana: Add
> support for coalesced RX packets on CQE
> 
> On Wed, 14 Jan 2026 18:27:50 +0000 Haiyang Zhang wrote:
> > > > And, the coalescing can add up to 2 microseconds into one-way
> > > > latency.
> > >
> > > I am asking you how the _device_ (hypervisor?) decides when to
> > > coalesce and when to send a partial CQE (<4 packets in 4 pkt CQE).
> > > You are using the coalescing uAPI, so I'm trying to make sure this
> > > is the correct API. CQE configuration can also be done via ringparam.
> >
> > When coalescing is enabled, the device waits for packets which can
> > have the CQE coalesced with previous packet(s). That coalescing process
> > is finished (and a CQE written to the appropriate CQ) when the CQE is
> > filled with 4 pkts, or time expired, or other device specific logic is
> > satisfied.
> 
> See, what I'm afraid is happening here is that you are enabling
> completion coalescing (how long the device keeps the CQE pending).
> Which is _not_ what rx_max_coalesced_frames controls for most NICs.
> For most NICs rx_max_coalesced_frames controls IRQ generation logic.
> 
> The NIC first buffers up CQEs for typically single digit usecs, and
> then once CQE timer expired and writeback happened it starts an IRQ
> coalescing timer. Once the IRQ coalescing timer expires IRQ is
> triggered, which schedules NAPI. (broad strokes, obviously many
> differences and optimizations exist)
> 
> Is my guess correct? Are you controlling CQE coalescing?
> 
> Can you control the timeout instead of the frame count?

Our NIC's timeout value cannot be controlled by the driver. Also, the
timeout may change in future NIC HW.

So I use ethtool's rx-frames setting, which can only be 1 or 4 on our
NIC, to switch the CQE coalescing feature on/off.
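
To illustrate the rx-frames switch described above, here is a minimal
sketch of the validation it implies. This is not the actual mana driver
code: the function and macro names are hypothetical, and it assumes that
1 means coalescing off and 4 means coalescing on (matching the 4-packet
CQE mentioned earlier in the thread).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical constants: the only rx-frames values the NIC accepts. */
#define RX_FRAMES_OFF 1U	/* one packet per CQE, coalescing disabled */
#define RX_FRAMES_ON  4U	/* up to four packets per CQE, coalescing enabled */

/*
 * Map a requested rx-frames value to the CQE coalescing switch.
 * Returns 0 and sets *enable on success, or -EINVAL for any value
 * other than 1 or 4 (assumed mapping, see the note above).
 */
static int rx_frames_to_cqe_coalescing(unsigned int rx_frames, bool *enable)
{
	switch (rx_frames) {
	case RX_FRAMES_OFF:
		*enable = false;
		return 0;
	case RX_FRAMES_ON:
		*enable = true;
		return 0;
	default:
		return -EINVAL;
	}
}
```

From user space this would correspond to "ethtool -C <dev> rx-frames 4"
to request coalescing and "ethtool -C <dev> rx-frames 1" to turn it off;
any other value would be rejected.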

Thanks,
- Haiyang

