> From: Bruce Richardson [mailto:[email protected]]
> Sent: Tuesday, 10 February 2026 10.04
>
> On Tue, Feb 10, 2026 at 12:08:44AM +0100, Morten Brørup wrote:
> > > +static inline void
> > > +write_txd(volatile void *txd, uint64_t qw0, uint64_t qw1)
> > > +{
> > > +	uint64_t *txd_qw = __rte_assume_aligned(RTE_CAST_PTR(void *, txd), 16);
> > > +
> > > +	txd_qw[0] = rte_cpu_to_le_64(qw0);
> > > +	txd_qw[1] = rte_cpu_to_le_64(qw1);
> > > +}
> >
> > How about using __rte_aligned() instead, something like this (untested):
> >
> > struct __rte_aligned(16) txd_t {
> > 	uint64_t qw0;
> > 	uint64_t qw1;
> > };
>
> I can see if this works for us...
>
> >
> > *RTE_CAST_PTR(volatile struct txd_t *, txd) = { rte_cpu_to_le_64(qw0), rte_cpu_to_le_64(qw1) };
> >
> >
> > And why strip the "volatile"?
> >
>
> For the descriptor writes, it doesn't matter in which order the
> descriptors and the descriptor fields are actually written, since the NIC
> relies upon the tail pointer update - which includes a fence - to inform it
> of when the descriptors are ready. The volatile is necessary for reads,
> though, which is why the ring is marked as such, but for Tx it prevents the
> compiler from opportunistically e.g. converting two 64-bit writes into a
> 128-bit write.

Makes sense. I suggest adding a few comments explaining this at the relevant locations in the source code.
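
Something along these lines, perhaps. This is only a standalone sketch of how such a comment might read, not the actual patch: cpu_to_le_64() is a hypothetical stand-in for rte_cpu_to_le_64(), the GCC/Clang builtin __builtin_assume_aligned() stands in for __rte_assume_aligned(), and the assert() is added here purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_cpu_to_le_64(); swaps only on big-endian hosts. */
static inline uint64_t
cpu_to_le_64(uint64_t v)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	return __builtin_bswap64(v);
#else
	return v;
#endif
}

static inline void
write_txd(volatile void *txd, uint64_t qw0, uint64_t qw1)
{
	/*
	 * The ring is declared volatile because descriptor reads must not
	 * be cached or reordered by the compiler. For Tx descriptor writes
	 * the order does not matter: the NIC only looks at the descriptors
	 * after the tail pointer update, which includes a fence. Dropping
	 * the volatile qualifier here lets the compiler merge the two
	 * 64-bit stores into one 128-bit store where that is profitable.
	 */
	uint64_t *txd_qw;

	/* Illustration only: the alignment hint below is a promise, so check it. */
	assert(((uintptr_t)txd & 15) == 0);

	txd_qw = __builtin_assume_aligned((void *)(uintptr_t)txd, 16);

	txd_qw[0] = cpu_to_le_64(qw0);
	txd_qw[1] = cpu_to_le_64(qw1);
}
```

A similar one-line note at the tail pointer update, saying that its fence is what publishes the preceding descriptor writes, would tie the two places together.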

