Thank you for your contribution! Sashiko AI review found 5 potential issue(s) 
to consider:

Pre-existing issues:
- [Critical] The driver blindly trusts the `length` field of GSP messages and 
continuation records, leading to multiple integer underflows and massive 
out-of-bounds memory copies.
- [Critical] Missing bounds check on GSP continuation message length allows a 
heap buffer overflow.
- [High] Returning `NULL` on oversized GSP messages leads to infinite loops in 
the message queue worker and a false success bubbling up to callers, causing 
subsequent NULL pointer dereferences.
- [High] A continuation message wrapping around the message queue ring buffer 
causes memory corruption and an overrun due to incorrect pointer math.
- [High] Missing read memory barrier (rmb or dma_rmb) after reading the ring 
buffer write pointer.
--

commit b655637a9811533563a43c05f87d130fe957a519
Author: Hongling Zeng <[email protected]>
Subject: nouveau/gsp/rpc: Cleanup incorrect IS_ERR_OR_NULL in rpc.c

This commit cleans up IS_ERR_OR_NULL() usage for functions that supposedly
never return NULL. It changes the checks after r535_gsp_msgq_peek(),
r535_gsp_msgq_recv_one_elem(), and r535_gsp_msgq_recv() to use IS_ERR()
instead.

> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c
> index b4b2b5c00ab4e..5b89c8835a2e4 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c
> @@ -350,7 +350,7 @@ r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 gsp_rpc_len, 
> int *retries)
>               u32 size;
>  
>               rpc = r535_gsp_msgq_peek(gsp, sizeof(*rpc), info.retries);
> -             if (IS_ERR_OR_NULL(rpc)) {
> +             if (IS_ERR(rpc)) {

[Severity: High]
This is a pre-existing issue, but does the polling loop in
r535_gsp_msgq_wait() need a read memory barrier?

r535_gsp_msgq_peek() calls r535_gsp_msgq_wait(), which polls the GSP write
pointer *gsp->msgq.wptr from DMA coherent memory:

    while (rptr == wptr || ...) {
        ...
        wptr = *gsp->msgq.wptr;
    }
    
Without an rmb() or dma_rmb() before returning, could weakly ordered CPUs
speculatively fetch the payload data before the write pointer read is
ordered, resulting in stale data being read from the ring buffer?

>                       kvfree(buf);
>                       return rpc;
>               }
> @@ -359,7 +359,7 @@ r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 gsp_rpc_len, 
> int *retries)
>               info.continuation = true;
>  
>               rpc = r535_gsp_msgq_recv_one_elem(gsp, &info);
> -             if (IS_ERR_OR_NULL(rpc)) {
> +             if (IS_ERR(rpc)) {

[Severity: Critical]
This isn't a bug introduced by this patch, but does the firmware's
message length need to be validated to prevent integer underflows?

If rpc->length is smaller than sizeof(struct nvfw_gsp_rpc), does
this underflow the length calculations and cause an out-of-bounds memcpy()?

In r535_gsp_msgq_recv_one_elem(), the payload length calculation uses:

    if (info->continuation)
        memcpy(buf, mqe->data + sizeof(struct nvfw_gsp_rpc),
               len - sizeof(struct nvfw_gsp_rpc));
               
Additionally, in this loop in r535_gsp_msgq_recv(), the offset underflows:

    size = info.gsp_rpc_len - sizeof(*rpc);
    expected -= size;
    info.gsp_rpc_buf += size;
    
There is a similar underflow in r535_gsp_msg_dump():

    msg->length - sizeof(*msg)
    
Could a malicious or buggy firmware send a small length and cause these
calculations to wrap around, leading to massive memory corruption?

[Severity: Critical]
This is a pre-existing issue, but does this code check if the continuation
message actually fits within the remaining allocated buffer?

The buffer was allocated based on the initial expected message length.
If a continuation chunk payload (info.gsp_rpc_len - sizeof(*rpc)) is
larger than the remaining capacity (expected), does this overwrite the
bounds of the kvmalloc'd buffer?

[Severity: High]
This isn't a bug introduced by this patch, but is there a miscalculation
when a continuation message wraps around the ring buffer?

In r535_gsp_msgq_recv_one_elem(), the code copies a wrapped continuation
payload in two parts. The first part writes len - sizeof(struct nvfw_gsp_rpc)
bytes. However, for the second part it copies:

    if (expected) {
        mqe = (void *)((u8 *)gsp->shm.msgq.ptr + 0x1000 + 0 * 0x1000);
        memcpy(buf + len, mqe, expected);
    }

Should the destination pointer be buf + len - sizeof(struct nvfw_gsp_rpc)
instead of buf + len? Does using buf + len leave a gap of uninitialized
memory and subsequently overrun the allocated buffer by
sizeof(struct nvfw_gsp_rpc) bytes?

>                       kvfree(buf);
>                       return rpc;
>               }
> @@ -499,7 +499,7 @@ r535_gsp_msg_recv(struct nvkm_gsp *gsp, int fn, u32 
> gsp_rpc_len)
>  
>  retry:
>       rpc = r535_gsp_msgq_peek(gsp, sizeof(*rpc), &retries);
> -     if (IS_ERR_OR_NULL(rpc))
> +     if (IS_ERR(rpc))
>               return rpc;
>  
>       rpc = r535_gsp_msgq_recv(gsp, gsp_rpc_len, &retries);

[Severity: High]
This is a pre-existing issue, but is it safe for r535_gsp_msgq_recv() to
return NULL when a message is oversized?

r535_gsp_msgq_recv() has the following check:

    if (WARN_ON(rpc->length > max_rpc_size))
        return NULL;
        
If an oversized message is received, this returns NULL instead of advancing
the message queue read pointer and returning an error pointer like
ERR_PTR(-EIO).

Does this cause the message queue worker to get stuck in an infinite loop
repeatedly processing the same invalid message? Also, could this NULL bubble
up as a false success to callers that check for errors via IS_ERR(), causing
subsequent NULL pointer dereferences?

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=2

Reply via email to