On Tue, Feb 10, 2026 at 10:28:10AM +0100, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:[email protected]]
> > Sent: Tuesday, 10 February 2026 10.04
> >
> > On Tue, Feb 10, 2026 at 12:08:44AM +0100, Morten Brørup wrote:
> > > > +static inline void
> > > > +write_txd(volatile void *txd, uint64_t qw0, uint64_t qw1)
> > > > +{
> > > > +	uint64_t *txd_qw = __rte_assume_aligned(RTE_CAST_PTR(void *, txd), 16);
> > > > +
> > > > +	txd_qw[0] = rte_cpu_to_le_64(qw0);
> > > > +	txd_qw[1] = rte_cpu_to_le_64(qw1);
> > > > +}
> > >
> > > How about using __rte_aligned() instead, something like this (untested):
> > >
> > > struct __rte_aligned(16) txd_t {
> > > 	uint64_t qw0;
> > > 	uint64_t qw1;
> > > };
> >
> > I can see if this works for us...
> >
> > > *RTE_CAST_PTR(volatile struct txd_t *, txd) = { rte_cpu_to_le_64(qw0),
> > > 	rte_cpu_to_le_64(qw1) };
> > >
> > > And why strip the "volatile"?
> >
> > For the descriptor writes, the order in which the descriptors and the
> > descriptor fields are actually written doesn't matter, since the NIC
> > relies upon the tail pointer update - which includes a fence - to inform
> > it of when the descriptors are ready. The volatile is necessary for
> > reads, though, which is why the ring is marked as such, but for Tx it
> > only prevents the compiler from opportunistically converting e.g. two
> > 64-bit writes into a single 128-bit write.
>
> Makes sense.
> Suggest that you spread out a few comments about this at the relevant
> locations in the source code.

I'll add an explanation as part of the write_txd function, which is where the
volatile gets cast away.
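
Roughly what I have in mind - a sketch only, the exact comment wording may
differ in the final patch:

static inline void
write_txd(volatile void *txd, uint64_t qw0, uint64_t qw1)
{
	/*
	 * The volatile qualifier is deliberately cast away here. Write
	 * ordering of Tx descriptors does not matter, since the NIC only
	 * reads them after the tail pointer update, which includes a fence.
	 * Dropping volatile lets the compiler e.g. merge the two 64-bit
	 * stores into a single 128-bit store.
	 */
	uint64_t *txd_qw = __rte_assume_aligned(RTE_CAST_PTR(void *, txd), 16);

	txd_qw[0] = rte_cpu_to_le_64(qw0);
	txd_qw[1] = rte_cpu_to_le_64(qw1);
}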
/Bruce

