On Fri, 2025-12-05 at 09:35 +0900, Alexandre Courbot wrote:
>
> With one caveat: `new` now returns a 4K object on the stack, which we
> definitely want to avoid. So maybe we can have a wrapper for things we
> want to align the 4K:
>
> #[repr(C)]
> pub(crate) struct GspPageAligned<T> {
> pub(crate) inner: T,
> padding: [u8; GSP_PAGE_SIZE - core::mem::size_of::<T>()],
> }
>
> We would then allocate the CoherentAllocation using a
> `GspPageAligned<GspArgumentsCached>`, and initialize its useful data
> with:
>
> dma_write!(rmargs[0].inner = fw::GspArgumentsCached::new(&cmdq))?;
I'm confused. Aren't we already avoiding the stack? This is the code today:
let rmargs = CoherentAllocation::<GspArgumentsCached>::alloc_coherent(
dev,
1,
GFP_KERNEL | __GFP_ZERO,
)?;
dma_write!(rmargs[0] = fw::GspArgumentsCached::new(&cmdq))?;
The only difference with what's there today vs what you suggest is the
".inner", and I think I can
avoid even that if I make GspPageAligned a tuple instead of a named struct.
>
> > I had to remove the #[repr(transparent)]. Is that
> > okay? The code compiles and seems to work.
>
> As long as the struct is `repr(C)`, the layout will be what we expect.
> Actually this made me realize that `repr(C)` is technically what we want for
> our
> bindings abstractions, not `repr(transparent)` - both happen to have the
> same effect since the wrapper struct is `repr(C)` anyway, but the latter
> is more restrictive than we need.
>
> Glad we found an elegant way to address this!
Actually, I think a more elegant solution would be a new variant of
CoherentAllocation::alloc_coherent() that takes a size to allocate instead of
using size_of::<T>.
In fact, I wonder if it makes sense to always grow the size of the allocation
to the nearest page,
since dma_alloc_attrs() always allocates whole pages anyway. Perhaps
CoherentAllocation<T> needs an
allocated() method in addition to a size() method. size() returns count*sizeof
as today, and
allocated() returns that value rounded up to the nearest PAGE_SIZE.