Thank you for your contribution! Sashiko AI review found 2 potential issue(s) 
to consider:

Pre-existing issues:
- [High] Early returns in `receive_msg` skip advancing the GSP read pointer, 
leading to a permanently stalled command queue.
- [High] CPU->GSP messaging path relies on a fragile control dependency for 
memory ordering instead of an explicit DMA barrier.
--

commit 4410cc89c9104f4a90dab2676e5da7ce4fd1a204
Author: Gary Guo <[email protected]>

gpu: nova-core: fix barrier usage in GSP->CPU messaging path

This commit replaces the Rust SeqCst barrier with explicit dma_mb(Read) and
dma_mb(Full) barriers in the GSP->CPU messaging path. The SeqCst barrier does
not correctly order DMA operations on architectures like ARM, so using the
proper dma_mb() ensures correct LOAD->LOAD and LOAD->STORE ordering.

> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs 
> b/drivers/gpu/nova-core/gsp/cmdq.rs
> index 94c2790d943d0..229d49a670779 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs

[ ... ]

> @@ -339,6 +341,9 @@ fn driver_write_area_size(&self) -> usize {
>              (MSGQ_NUM_PAGES, tx)
>          };
>  
> +        // ORDERING: Ensure data load is ordered after load of GSP write 
> pointer.
> +        dma_mb(Read);
> +

[Severity: High]
This is a pre-existing issue, but since we are updating memory barriers here,
does the CPU->GSP messaging path in driver_write_area() also need an explicit
barrier instead of relying on a fragile control dependency?

drivers/gpu/nova-core/gsp/cmdq.rs:driver_write_area() {
    ...
        let (tail_end, wrap_end) = if rx == 0 {
            ...
        } else if rx <= tx {
            ...
        } else {
            ...
        };

        // ORDERING: control dependency provides necessary LOAD->STORE ordering.
        // dma_mb(Full) may be used here if we don't want to rely on control 
dependency.
    ...
}

Relying on control dependencies for DMA memory ordering is unsafe in Rust,
as the compiler does not guarantee preserving them and may optimize away
branches (e.g., using conditional selects like csel on ARM64).

Control dependencies also do not emit architectural barriers (like dmb osh
on ARM64), which are necessary to order memory accesses across the system
interconnect for device memory.

If the CPU's store to the command buffer is observed by the GSP before the
CPU's load of rx, could the CPU overwrite data the GSP has not yet consumed?

Should an explicit dma_mb(Full) be added here to guarantee LOAD->STORE
ordering?

>          // SAFETY:
>          // - `data` was created from a valid pointer, and `rx` and `tx` are 
> in the
>          //   `0..MSGQ_NUM_PAGES` range per the invariants of `gsp_write_ptr` 
> and `cpu_read_ptr`,

[ ... ]

> @@ -436,6 +441,9 @@ fn cpu_read_ptr(&self) -> u32 {
>  
>      // Informs the GSP that it can send `elem_count` new pages into the 
> message queue.
>      fn advance_cpu_read_ptr(&mut self, elem_count: u32) {
> +        // ORDERING: Ensure read pointer is properly ordered with a 
> LOAD->STORE ordering.
> +        dma_mb(Full);
> +
>          super::fw::gsp_mem::advance_cpu_read_ptr(&self.0, elem_count)
>      }

[Severity: High]
This is a pre-existing issue, but looking at how advance_cpu_read_ptr()
is used, do the early returns in receive_msg() skip advancing the GSP read
pointer?

drivers/gpu/nova-core/gsp/cmdq.rs:receive_msg() {
    ...
        let function = message.header.function().map_err(|_| EINVAL)?;

        // Extract the message. Store the result as we want to advance the read 
pointer even in
        // case of failure.
        let result = if function == M::FUNCTION {
            let (cmd, contents_1) = 
M::Message::from_bytes_prefix(message.contents.0).ok_or(EIO)?;
    ...
        // Advance the read pointer past this message.
        self.gsp_mem.advance_cpu_read_ptr(...);
    ...
}

The comments note that the result is stored to ensure the read pointer is
advanced even on failure. However, the use of the ? operator on
message.header.function() and M::Message::from_bytes_prefix() causes an
immediate return.

If an early return happens here, will subsequent polls read the exact same
malformed message?

Could this cause repeated failures and a permanently stalled command queue?

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=4

Reply via email to