On Mon, Jun 30, 2025 at 06:22:11PM +0200, Paul Menzel wrote:
> Dear Josh,
> 
> 
> On 30.06.25 at 18:08, Hay, Joshua A wrote:
> 
> > > On 25.06.25 at 18:11, Joshua Hay wrote:
> > > > This series fixes a stability issue in the flow scheduling Tx send/clean
> > > > path that results in a Tx timeout.
> > > > 
> > > > The existing guardrails in the Tx path were not sufficient to prevent
> > > > the driver from reusing completion tags that were still in flight (held
> > > > by the HW).  This collision would cause the driver to erroneously clean
> > > > the wrong packet thus leaving the descriptor ring in a bad state.
> > > > 
> > > > The main point of this refactor is replace the flow scheduling buffer
> > > 
> > > … to replace …?
> > 
> > Thanks, will fix in v2
> > 
> > > > ring with a large pool/array of buffers.  The completion tag then simply
> > > > is the index into this array.  The driver tracks the free tags and pulls
> > > > the next free one from a refillq.  The cleaning routines simply use the
> > > > completion tag from the completion descriptor to index into the array to
> > > > quickly find the buffers to clean.
> > > > 
> > > > All of the code to support the refactor is added first to ensure traffic
> > > > still passes with each patch.  The final patch then removes all of the
> > > > obsolete stashing code.
> > > 
> > > Do you have reproducers for the issue?
> > 
> > This issue cannot be reproduced without the customer specific device
> > configuration, but it can impact any traffic once in place.
> 
> Interesting. Then it’d be great if you could describe that setup in more
> detail.
> 

Hey Paul,

The hardware can process packets and return completions out of order;
this depends on HW configuration that is difficult to replicate.

To match completions with packets, each packet with pending completions
must be associated with a unique ID.  The previous code would occasionally
reassign the same ID to multiple pending packets, resulting in
resource leaks and eventually panics.

The new code uses a much simpler data structure to assign IDs, one that
is immune to duplicate assignment and also much more efficient at
runtime.
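
As a rough illustration only (this is not the idpf code; every name
below is hypothetical and the pool is kept tiny on purpose), the idea of
using the completion tag as an index into a buffer array, with free tags
handed out from a refill ring, could be sketched in C like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 8u            /* hypothetical; a real pool is far larger */
#define TAG_NONE  UINT16_MAX    /* returned when no tag is free */

struct tx_buf {                 /* stand-in for a per-packet Tx buffer */
	bool in_flight;
	uint32_t pkt_id;
};

struct tag_pool {
	struct tx_buf bufs[POOL_SIZE];  /* completion tag == index here */
	uint16_t free_tags[POOL_SIZE];  /* ring of tags that are free */
	size_t head;                    /* next free tag to hand out */
	size_t count;                   /* number of free tags in the ring */
};

static void tag_pool_init(struct tag_pool *p)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		p->bufs[i].in_flight = false;
		p->bufs[i].pkt_id = 0;
		p->free_tags[i] = (uint16_t)i;
	}
	p->head = 0;
	p->count = POOL_SIZE;
}

/* Send path: take the next free tag, or TAG_NONE so the caller can stop Tx. */
static uint16_t tag_pool_get(struct tag_pool *p, uint32_t pkt_id)
{
	uint16_t tag;

	if (p->count == 0)
		return TAG_NONE;

	tag = p->free_tags[p->head];
	p->head = (p->head + 1) % POOL_SIZE;
	p->count--;

	p->bufs[tag].in_flight = true;
	p->bufs[tag].pkt_id = pkt_id;
	return tag;
}

/* Clean path: O(1) lookup by tag, then return the tag to the free ring. */
static uint32_t tag_pool_put(struct tag_pool *p, uint16_t tag)
{
	size_t tail = (p->head + p->count) % POOL_SIZE;

	/* A tag can only be completed while it is actually in flight. */
	assert(tag < POOL_SIZE && p->bufs[tag].in_flight);
	p->bufs[tag].in_flight = false;
	p->free_tags[tail] = tag;
	p->count++;
	return p->bufs[tag].pkt_id;
}
```

In this sketch a tag is only in the free ring while no packet owns it,
so two pending packets cannot share a tag regardless of the order in
which completions arrive, and running out of tags shows up as TAG_NONE
rather than as a collision.
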
> > > > Joshua Hay (5):
> > > >     idpf: add support for Tx refillqs in flow scheduling mode
> > > >     idpf: improve when to set RE bit logic
> > > >     idpf: replace flow scheduling buffer ring with buffer pool
> > > >     idpf: stop Tx if there are insufficient buffer resources
> > > >     idpf: remove obsolete stashing code
> > > > 
> > > >    .../ethernet/intel/idpf/idpf_singleq_txrx.c   |   6 +-
> > > >    drivers/net/ethernet/intel/idpf/idpf_txrx.c   | 626 ++++++------------
> > > >    drivers/net/ethernet/intel/idpf/idpf_txrx.h   |  76 +--
> > > >    3 files changed, 239 insertions(+), 469 deletions(-)
> 
> 
> Kind regards,
> 
> Paul
