On 2025-12-15 at 12:22 +1100, Joel Fernandes <[email protected]> wrote...
> Hi Alistair,
> 
> > On Dec 15, 2025, at 8:43 AM, Alistair Popple <[email protected]> wrote:
> > 
> > On 2025-12-12 at 19:10 +1100, Dirk Behme <[email protected]> wrote...
> >>> On 12.12.25 08:59, Joel Fernandes wrote:
> >>> Hi Alex,
> >>> 
> >>>> On Nov 22, 2025, at 12:00 AM, Alexandre Courbot <[email protected]> 
> >>>> wrote:
> >>>> 
> >>>> The size of messages' payload is miscalculated, leading to extra data
> >>>> passed to the message handler. While this is not a problem with our
> >>>> current set of commands, others with a variable-length payload may
> >>>> misbehave. Fix this.
> >>>> 
> >>>> Signed-off-by: Alexandre Courbot <[email protected]>
> >>>> ---
> >>>> drivers/gpu/nova-core/gsp/cmdq.rs | 11 +++++++----
> >>>> drivers/gpu/nova-core/gsp/fw.rs   |  2 +-
> >>>> 2 files changed, 8 insertions(+), 5 deletions(-)
> >>>> 
> >>>> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs 
> >>>> b/drivers/gpu/nova-core/gsp/cmdq.rs
> >>>> index 6f946d14868a..dab73377c526 100644
> >>>> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> >>>> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> >>>> @@ -588,21 +588,24 @@ fn wait_for_msg(&self, timeout: Delta) -> 
> >>>> Result<GspMessage<'_>> {
> >>>>            header.length(),
> >>>>        );
> >>>> 
> >>>> +        // The length of the message that follows the header.
> >>>> +        let msg_length = header.length() - size_of::<GspMsgElement>();
> >>> 
> >>> Is this immune to under flow without one of the checked subtraction 
> >>> wrappers? Either way, we should not tolerate the underflow I think. Which 
> >>> means it can panic when the rust overflow checks are enabled. Since the 
> >>> header length comes from firmware, this cannot be guaranteed to not 
> >>> underflow in the event of a malformed message.
> > 
> > I think we're guaranteed not to underflow here - check out the 
> > implementation for header.length():
> > 
> >    /// Returns the total length of the message.
> >    pub(crate) fn length(&self) -> usize {
> >        // `rpc.length` includes the length of the GspRpcHeader but not the 
> > message header.
> >        size_of::<Self>() - size_of::<bindings::rpc_message_header_v>()
> >            + num::u32_as_usize(self.inner.rpc.length)
> >    }
> > 
> > So the above calculation expands to:
> > 
> > msg_length = size_of::<Self>() - size_of::<bindings::rpc_message_header_v>()
> >            + num::u32_as_usize(self.inner.rpc.length) - 
> > size_of::<GspMsgElement>();
> > 
> > Where self.inner.rpc.length is guaranteed to be >= 
> > size_of::<rpc_message_header_v>() by the construction of the type.
> 
> But this length field is coming from the firmware, correct? The guarantee is 
> provided by firmware, not by Rust code calculating the length.

Oh you're right - I had this around the wrong way, I forgot our constructors 
were only used for creating messages not receiving them. Obviously killing time 
reading mailing lists in airports is not such a good idea :-)

> Maybe Rust validating that the length matches, or is greater than or equal 
> to, the message header would be one way to avoid doing the checked 
> subtraction. I would still be comfortable doing the checked subtraction in 
> case the firmware payload in the message buffer is corrupted and the length 
> field is corrupted.
> 
> I think Rust cannot trust fields coming from the firmware and needs to check 
> them to prevent undefined behavior. Or maybe the policy is to include safety 
> comments, like we do when expecting C code to behave in a certain way. I do 
> not know. But we should identify the policy for this and stick to it for 
> future such issues.
> 
> I think one way to verify that there is a Rust guarantee about the length 
> field is to do the unchecked subtraction, compile the code, and then see if 
> the generated code has any panics in it (With the overflow checking config 
> enabled.)
> 
> If it does, then dead code elimination could not determine at compile time 
> that the subtraction would not overflow. Right?

Agreed. Whilst we can't avoid needing to trust the firmware at some level 
crashing the kernel in response to a bad message would be bad and make 
debugging it a pain.

> > 
> >> Would this be a possible use case for the untrusted data proposal
> >> 
> >> https://lwn.net/Articles/1034603/
> >> 
> >> ?
> > 
> > Responding here because Joel appears to have sent a HTML only response ;-)
> 
> Sorry. :)
> 
> > 
> > I agree with Joel's points - this does sound useful but as a separate 
> > project.
> > I'd imagine we'd want to it one layer lower though - ie. in the 
> > construction of
> > the GspMsgElement.

So I guess we'd need our own implementation of a from_bytes trait or some such 
that would also validate the fields when deserialising the struct.

> Agreed, thanks.
> 
>  - Joel 
> 
> 
> 
> > 
> >> Cheers
> >> 
> >> Dirk

Reply via email to