On Fri Dec 5, 2025 at 6:45 AM JST, Timur Tabi wrote:
> On Thu, 2025-12-04 at 21:18 +0000, Timur Tabi wrote:
>> If you can tell me how to fix the CoherentAllocation allocation call so that 
>> it allocates 4K, that
>> would be a better fix for this problem.  It's only necessary for 
>> Turing/GA100, however.
>
> Actually, I think I figured it out.  
>
> /// Arguments for GSP startup.
> #[repr(C)]
> pub(crate) struct GspArgumentsCached(
>     bindings::GSP_ARGUMENTS_CACHED,
>     [u8; GSP_PAGE_SIZE - 
> core::mem::size_of::<bindings::GSP_ARGUMENTS_CACHED>()], // Padding to make
> it 4K
> );
>
> impl GspArgumentsCached {
>     /// Creates the arguments for starting the GSP up using `cmdq` as its 
> command queue.
>     pub(crate) fn new(cmdq: &Cmdq) -> Self {
>         Self(
>             bindings::GSP_ARGUMENTS_CACHED {
>                 messageQueueInitArguments: 
> MessageQueueInitArguments::new(cmdq).0,
>                 bDmemStack: 1,
>                 ..Default::default()
>             },
>             [0u8; GSP_PAGE_SIZE - 
> core::mem::size_of::<bindings::GSP_ARGUMENTS_CACHED>()],
>         )
>     }
> }
>
> If you think this is okay, I'll put it in v3.

Yes, that looks like a good way to address cases like this (please also
make sure to detail in a comment that RM is requiring that padding).
Having the constraint encoded into the type guarantees that we cannot
miss it anywhere.

With one caveat: `new` now returns a 4K object on the stack, which we
definitely want to avoid. So maybe we can have a wrapper for things we
want to align the 4K:

  #[repr(C)]
  pub(crate) struct GspPageAligned<T> {
    pub(crate) inner: T,
    padding: [u8; GSP_PAGE_SIZE - core::mem::size_of::<T>()],
  }

We would then allocate the CoherentAllocation using a
`GspPageAligned<GspArgumentsCached>`, and initialize its useful data
with:

  dma_write!(rmargs[0].inner = fw::GspArgumentsCached::new(&cmdq))?;

> I had to remove the #[repr(transparent)].  Is that
> okay?  The code compiles and seems to work.

As long as the struct is `repr(C)`, the layout will be what we expect.
Actually this made me realize that `repr(C)` is technically what we want for our
bindings abstractions, not `repr(transparent)` - both happen to have the
same effect since the wrapper struct is `repr(C)` anyway, but the latter
is more restrictive than we need.

Glad we found an elegant way to address this!

Reply via email to