Re: [PATCH 1/2] gpu: nova-core: fix GSP RPC send sequence numbers to match Open RM

Danilo Krummrich Tue, 17 Feb 2026 03:52:47 -0800

On Tue Feb 17, 2026 at 9:15 AM CET, Alexandre Courbot wrote:
>> diff --git a/drivers/gpu/nova-core/gsp/boot.rs 
>> b/drivers/gpu/nova-core/gsp/boot.rs
>> index 02eec2961b5f..f769e234dae6 100644
>> --- a/drivers/gpu/nova-core/gsp/boot.rs
>> +++ b/drivers/gpu/nova-core/gsp/boot.rs
>> @@ -403,7 +403,11 @@ pub(crate) fn boot(
>>  
>>          dev_dbg!(dev, "RISC-V active? {}\n", 
>> gsp_falcon.is_riscv_active(bar));
>>  
>> -        // Now that GSP is active, send system info and registry
>> +        // Now that GSP is active, send system info and registry.
>> +        //
>> +        // These are async (fire-and-forget) RPCs: no response comes back 
>> from GSP.
>> +        // GSP does not include them in its sequence number counting today, 
>> but a
>> +        // future GSP firmware update will fold them into the normal 
>> sequence space.
>
> From the point of view of the caller this is a superfluous
> implementation detail. This comment is actually confusing as it mentions
> a sequence number that the caller does not need to provide, so I'd just
> remove it altogether.


The comment is not a doc-comment, hence it can more be seen as a comment for the
audience of nova-core developers in general, rather than a description of a
specific API. So, I think it is OK to talk about an implementation detail here.

To me personally it seems useful and I'd like to keep it.

>
>>          self.cmdq
>>              .send_command(bar, commands::SetSystemInfo::new(pdev, 
>> chipset))?;
>>          self.cmdq.send_command(bar, commands::SetRegistry::new())?;
>> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs 
>> b/drivers/gpu/nova-core/gsp/cmdq.rs
>> index 16895f5281b7..7d6d7d81287c 100644
>> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
>> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
>> @@ -58,6 +58,13 @@ pub(crate) trait CommandToGsp {
>>      /// Function identifying this command to the GSP.
>>      const FUNCTION: MsgFunction;
>>  
>> +    /// Whether this command is async (fire-and-forget), meaning no 
>> response is expected from GSP.
>> +    ///
>> +    /// Async commands get inner `rpc.sequence` set to 0. Sync commands get 
>> inner `rpc.sequence`
>> +    /// set to the transport counter, matching Open RM. The outer `seqNum` 
>> always increments
>> +    /// regardless.
>
> This here is too much detail as well IMHO, even more if you follow the
> last suggestion in this review. :)

As this one is a doc-comment, agreed.

>
>> +    const IS_ASYNC: bool = false;
>> +
>>      /// Type generated by [`CommandToGsp::init`], to be written into the 
>> command queue buffer.
>>      type Command: FromBytes + AsBytes;
>>  
>> @@ -439,7 +446,8 @@ struct GspMessage<'a> {
>>  pub(crate) struct Cmdq {
>>      /// Device this command queue belongs to.
>>      dev: ARef<device::Device>,
>> -    /// Current command sequence number.
>> +    /// Transport-level sequence number, incremented for every send. Used 
>> for the outer
>> +    /// GSP_MSG_QUEUE_ELEMENT.seqNum. Also used as the inner rpc.sequence 
>> for sync commands.
>
> Note that these types (especially `GSP_MSG_QUEUE_ELEMENT`) are never
> visible in this module, so mentioning them here is assuming context that
> the reader can't easily get.
>
> Instead, how about just "Used for [`GspMsgElement`]'s sequence counter"?

Same here.

>>      seq: u32,
>>      /// Memory area shared with the GSP for communicating commands and 
>> messages.
>>      gsp_mem: DmaGspMem,
>> @@ -514,8 +522,13 @@ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, 
>> command: M) -> Result
>>          // Extract area for the command itself.
>>          let (cmd, payload_1) = 
>> M::Command::from_bytes_mut_prefix(dst.contents.0).ok_or(EIO)?;
>>  
>> +        // The outer seqNum always increments (transport-level, unique per 
>> message).
>> +        // The inner rpc.sequence is 0 for async (fire-and-forget) 
>> commands, or the
>> +        // sync counter for command/response pairs, matching Open RM 
>> behavior.
>> +        let rpc_seq = if M::IS_ASYNC { 0 } else { self.seq };
>> +
>>          // Fill the header and command in-place.
>> -        let msg_element = GspMsgElement::init(self.seq, command_size, 
>> M::FUNCTION);
>> +        let msg_element = GspMsgElement::init(self.seq, rpc_seq, 
>> command_size, M::FUNCTION);
>>          // SAFETY: `msg_header` and `cmd` are valid references, and not 
>> touched if the initializer
>>          // fails.
>>          unsafe {
>> diff --git a/drivers/gpu/nova-core/gsp/commands.rs 
>> b/drivers/gpu/nova-core/gsp/commands.rs
>> index e6a9a1fc6296..c8a73bd30051 100644
>> --- a/drivers/gpu/nova-core/gsp/commands.rs
>> +++ b/drivers/gpu/nova-core/gsp/commands.rs
>> @@ -50,6 +50,7 @@ pub(crate) fn new(pdev: &'a pci::Device<device::Bound>, 
>> chipset: Chipset) -> Sel
>>  
>>  impl<'a> CommandToGsp for SetSystemInfo<'a> {
>>      const FUNCTION: MsgFunction = MsgFunction::GspSetSystemInfo;
>> +    const IS_ASYNC: bool = true;
>>      type Command = GspSetSystemInfo;
>>      type InitError = Error;
>>  
>> @@ -101,6 +102,7 @@ pub(crate) fn new() -> Self {
>>  
>>  impl CommandToGsp for SetRegistry {
>>      const FUNCTION: MsgFunction = MsgFunction::SetRegistry;
>> +    const IS_ASYNC: bool = true;
>>      type Command = PackedRegistryTable;
>>      type InitError = Infallible;
>>  
>> diff --git a/drivers/gpu/nova-core/gsp/fw.rs 
>> b/drivers/gpu/nova-core/gsp/fw.rs
>> index 927bcee6a5a5..e417ed58419f 100644
>> --- a/drivers/gpu/nova-core/gsp/fw.rs
>> +++ b/drivers/gpu/nova-core/gsp/fw.rs
>> @@ -260,6 +260,26 @@ pub(crate) enum MsgFunction {
>>      UcodeLibOsPrint = bindings::NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT,
>>  }
>>  
>> +impl MsgFunction {
>> +    /// Returns true if this is a GSP-initiated async event 
>> (NV_VGPU_MSG_EVENT_*), as opposed to
>> +    /// a command response (NV_VGPU_MSG_FUNCTION_*).
>> +    #[expect(dead_code)]
>> +    pub(crate) fn is_event(&self) -> bool {
>> +        matches!(
>> +            self,
>> +            Self::GspInitDone
>> +                | Self::GspRunCpuSequencer
>> +                | Self::PostEvent
>> +                | Self::RcTriggered
>> +                | Self::MmuFaultQueued
>> +                | Self::OsErrorLog
>> +                | Self::GspPostNoCat
>> +                | Self::GspLockdownNotice
>> +                | Self::UcodeLibOsPrint //
>> +        )
>> +    }
>
> Mmm using a method for this is quite fragile, as we are always at risk
> of forgetting to handle a newly-added function. I wanted to eventually
> split `MsgFunction` between its command and events constituants, maybe
> that would be a good opportunity to do so.

I also think we should leverage the type system for such things.

> Also, this is marked with `dead_code`? Looks like it belongs to the next
> patch in this case.
>
>> +}
>> +
>>  impl fmt::Display for MsgFunction {
>>      fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
>>          match self {
>> @@ -816,7 +836,7 @@ fn new() -> Self {
>>  }
>>  
>>  impl bindings::rpc_message_header_v {
>> -    fn init(cmd_size: usize, function: MsgFunction) -> impl Init<Self, 
>> Error> {
>> +    fn init(cmd_size: usize, function: MsgFunction, sequence: u32) -> impl 
>> Init<Self, Error> {
>>          type RpcMessageHeader = bindings::rpc_message_header_v;
>>  
>>          try_init!(RpcMessageHeader {
>> @@ -829,6 +849,7 @@ fn init(cmd_size: usize, function: MsgFunction) -> impl 
>> Init<Self, Error> {
>>                  .and_then(|v| v.try_into().map_err(|_| EINVAL))?,
>>              rpc_result: 0xffffffff,
>>              rpc_result_private: 0xffffffff,
>> +            sequence,
>>              ..Zeroable::init_zeroed()
>>          })
>>      }
>> @@ -847,26 +868,31 @@ impl GspMsgElement {
>>      ///
>>      /// # Arguments
>>      ///
>> -    /// * `sequence` - Sequence number of the message.
>> +    /// * `transport_seq` - Transport-level sequence number for the outer 
>> message header
>> +    ///   (`GSP_MSG_QUEUE_ELEMENT.seqNum`). Must be unique per message.
>> +    /// * `rpc_seq` - RPC-level sequence number for the inner RPC header
>> +    ///   (`rpc_message_header_v.sequence`). Set to 0 for async 
>> (fire-and-forget) commands,
>> +    ///   or to the sync counter for command/response pairs.
>>      /// * `cmd_size` - Size of the command (not including the message 
>> element), in bytes.
>>      /// * `function` - Function of the message.
>>      #[allow(non_snake_case)]
>>      pub(crate) fn init(
>> -        sequence: u32,
>> +        transport_seq: u32,
>> +        rpc_seq: u32,
>
> The caller site of `init` mentions that `rpc_seq` will always either be
> equal to `transport_seq`, or be 0, depending on whether this is an async
> command or not, yet this constructor allows any value to be used. I'd
> like to enforce this invariant by making it impossible to build invalid
> combinations of sequences.
>
> As a matter of fact we already have this information in
> `CommandToGsp::IS_ASYNC`, so the simpler way would be to pass *that* as
> an extra argument to `init`, and let it handle the sequencing
> internally. It also has the benefit of not making this message-layer
> detail leak into the command queue code, making the discussion about
> this detail even more irrelevant outside of the `fw` module.
>
> An even better solution would be to to pass the `CommandToGsp` as a
> generic argument - that way `init` could extract both the `FUNCTION` and
> `IS_ASYNC`. Maybe use a public generic proxy method that calls into a
> non-generic private one to limit monomorphization.

I think we should go with the generic argument; but I don't think we need a
proxy method to fight monomorphization, i.e. I don't think we need to do work
for the optimizer in this case.

Re: [PATCH 1/2] gpu: nova-core: fix GSP RPC send sequence numbers to match Open RM

Reply via email to