On Fri Feb 13, 2026 at 8:40 PM CET, Tim Kovalenko via B4 Relay wrote:
> @@ -159,7 +158,7 @@ struct Msgq {
> #[repr(C)]
> struct GspMem {
> /// Self-mapping page table entries.
> - ptes: PteArray<{ GSP_PAGE_SIZE / size_of::<u64>() }>,
> + ptes: [u64; GSP_PAGE_SIZE / size_of::<u64>()],
> /// CPU queue: the driver writes commands here, and the GSP reads them.
> /// It also contains the write and read pointers that the CPU updates.
> ///
> @@ -201,7 +200,29 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>
> let gsp_mem =
> CoherentAllocation::<GspMem>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
> - dma_write!(gsp_mem[0].ptes = PteArray::new(gsp_mem.dma_handle())?)?;
> + const NUM_PAGES: usize = GSP_PAGE_SIZE / size_of::<u64>();
> +
> + // Write the PTEs one GSP page at a time to avoid a stack
> + // overflow from allocating the whole array at once.
> + let item = gsp_mem.item_from_index(0)?;
> + for i in 0..NUM_PAGES {
> + let pte_value = gsp_mem
> + .dma_handle()
> + .checked_add(num::usize_as_u64(i) << GSP_PAGE_SHIFT)
> + .ok_or(EOVERFLOW)?;
> +
> + // SAFETY: `item_from_index` ensures that `item` is always a valid
> + // pointer and can be dereferenced. The compiler also further
> + // validates, when the macro is expanded, that `field` is a member
> + // of `item`.
> + //
> + // Further, this is the dma_write! macro expanded by hand and
> + // modified to allow for individual page writes.
> + unsafe {
Both of these statements are unsafe and should each be wrapped in its own
unsafe block with its corresponding SAFETY justification.
> + let ptr_field = core::ptr::addr_of_mut!((*item).ptes[i]);
This should use &raw mut instead.
> + gsp_mem.field_write(ptr_field, pte_value);
Formally, we won't be able to justify the safety requirement of this method. :)
The good thing is, we don't have to:
I understand it seems like the problem here is that dma_write!() / dma_read!()
do not support index projections. That actually is a problem, and I think it
will be resolved by Gary's work. However, I think the real problem here is a
different one:
This code does not need volatile writes in the first place: we just allocated
the DMA memory and haven't published the corresponding address to the device
yet.

So, for such initialization code we shouldn't have to use dma_write!() /
dma_read!() at all.
I think the proper solution for this is to provide an API that allows for
initialization with a "normal" reference / slice.
For instance, we could provide an alloc_coherent_init() function that takes a
closure with a &mut [T] argument, such that the closure can do the
initialization before dma::CoherentAllocation::alloc_coherent() even returns.
Another option would be a new type, e.g. dma::InitCoherentAllocation which has
safe as_slice() and as_slice_mut() methods, but does *not* provide a method to
get the DMA address. Subsequently, it can be converted to a "real"
dma::CoherentAllocation.
With this, I would also keep the PteArray type and change PteArray::new() to
PteArray::init() taking a &mut self.
This way the PteArray init logic remains nicely structured and isolated.
Thanks,
Danilo
> + }
> + }
> +
> dma_write!(gsp_mem[0].cpuq.tx = MsgqTxHeader::new(MSGQ_SIZE, RX_HDR_OFF, MSGQ_NUM_PAGES))?;
> dma_write!(gsp_mem[0].cpuq.rx = MsgqRxHeader::new())?;