On Fri Dec 5, 2025 at 6:45 AM JST, Timur Tabi wrote:
> On Thu, 2025-12-04 at 21:18 +0000, Timur Tabi wrote:
>> If you can tell me how to fix the CoherentAllocation allocation call so that
>> it allocates 4K, that
>> would be a better fix for this problem. It's only necessary for
>> Turing/GA100, however.
>
> Actually, I think I figured it out.
>
> /// Arguments for GSP startup.
> #[repr(C)]
> pub(crate) struct GspArgumentsCached(
> bindings::GSP_ARGUMENTS_CACHED,
> [u8; GSP_PAGE_SIZE -
> core::mem::size_of::<bindings::GSP_ARGUMENTS_CACHED>()], // Padding to make
> it 4K
> );
>
> impl GspArgumentsCached {
> /// Creates the arguments for starting the GSP up using `cmdq` as its
> command queue.
> pub(crate) fn new(cmdq: &Cmdq) -> Self {
> Self(
> bindings::GSP_ARGUMENTS_CACHED {
> messageQueueInitArguments:
> MessageQueueInitArguments::new(cmdq).0,
> bDmemStack: 1,
> ..Default::default()
> },
> [0u8; GSP_PAGE_SIZE -
> core::mem::size_of::<bindings::GSP_ARGUMENTS_CACHED>()],
> )
> }
> }
>
> If you think this is okay, I'll put it in v3.
Yes, that looks like a good way to address cases like this (please also
make sure to detail in a comment that RM is requiring that padding).
Having the constraint encoded into the type guarantees that we cannot
miss it anywhere.
With one caveat: `new` now returns a 4K object on the stack, which we
definitely want to avoid. So maybe we can have a wrapper for things we
want to align the 4K:
#[repr(C)]
pub(crate) struct GspPageAligned<T> {
pub(crate) inner: T,
padding: [u8; GSP_PAGE_SIZE - core::mem::size_of::<T>()],
}
We would then allocate the CoherentAllocation using a
`GspPageAligned<GspArgumentsCached>`, and initialize its useful data
with:
dma_write!(rmargs[0].inner = fw::GspArgumentsCached::new(&cmdq))?;
> I had to remove the #[repr(transparent)]. Is that
> okay? The code compiles and seems to work.
As long as the struct is `repr(C)`, the layout will be what we expect.
Actually this made me realize that `repr(C)` is technically what we want for our
bindings abstractions, not `repr(transparent)` - both happen to have the
same effect since the wrapper struct is `repr(C)` anyway, but the latter
is more restrictive than we need.
Glad we found an elegant way to address this!