Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
Pre-existing issues: - [High] Returning NULL on oversized RPC messages causes a permanent queue stall and a NULL pointer dereference in callers. - [High] Missing lower-bound validation on `rpc->length` causes massive integer underflows, leading to out-of-bounds memory accesses and kernel heap corruption. - [High] Incorrect destination offset when copying wrapped continuation payloads causes a 32-byte gap of uninitialized memory and a buffer overwrite. - [High] Time-Of-Check to Time-Of-Use (TOCTOU) race when reading `rpc->length` from shared memory allows kernel heap buffer overflow. -- commit 91659e1846d19fb0409c6de8d1d0817e283454ef Author: Hongling Zeng <[email protected]> nouveau/gsp/rpc: Cleanup incorrect IS_ERR_OR_NULL in rpc.c This commit cleans up incorrect IS_ERR_OR_NULL() usage for functions that never return NULL, replacing them with IS_ERR() instead. > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c > index 2ca6de4094e63..0a00c7d831c91 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c > @@ -350,7 +350,7 @@ r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 gsp_rpc_len, > int *retries) > u32 size; > > rpc = r535_gsp_msgq_peek(gsp, sizeof(*rpc), info.retries); > - if (IS_ERR_OR_NULL(rpc)) { > + if (IS_ERR(rpc)) { [Severity: High] This isn't a bug introduced by this patch, but since you are cleaning up error checking in r535_gsp_msgq_recv(), what happens when an incoming message's length exceeds max_rpc_size? Earlier in r535_gsp_msgq_recv(), the code does this: if (WARN_ON(rpc->length > max_rpc_size)) return NULL; Returning NULL here skips advancing the queue's read pointer (*gsp->msgq.rptr), which seems like it could permanently stall the message queue as subsequent reads will hit the exact same malformed message. Additionally, this NULL return propagates back to callers like r535_gsp_get_static_info(), which only check IS_ERR(rpc): rpc = nvkm_gsp_rpc_rd(gsp, NV_VGPU_MSG_FUNCTION_GET_GSP_STATIC_INFO, sizeof(*rpc)); if (IS_ERR(rpc)) return PTR_ERR(rpc); gsp->internal.client.object.handle = rpc->hInternalClient; Will this cause a NULL pointer dereference when accessing rpc->hInternalClient? [Severity: High] This is a pre-existing issue, but could the direct read of rpc->length from shared memory lead to a Time-Of-Check to Time-Of-Use (TOCTOU) race? In r535_gsp_msgq_recv(), rpc->length is read and validated to determine the allocation size: if (WARN_ON(rpc->length > max_rpc_size)) return NULL; buf = kvmalloc(max_t(u32, rpc->length, expected), GFP_KERNEL); However, it is read again from shared memory shortly after without READ_ONCE() or caching the previously validated value: info.gsp_rpc_len = rpc->length; If the value in shared memory is modified after the validation but before the second read, could this bypass the size validation and result in a massive out-of-bounds memcpy() into an undersized heap buffer? > kvfree(buf); > return rpc; > } > @@ -359,7 +359,7 @@ r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 gsp_rpc_len, > int *retries) > info.continuation = true; > > rpc = r535_gsp_msgq_recv_one_elem(gsp, &info); [Severity: High] This is a pre-existing issue, but does r535_gsp_msgq_recv_one_elem() handle wrapped continuation payloads correctly? When a continuation message wraps around the end of the ring buffer, the code skips the 32-byte RPC header for the first part: if (info->continuation) memcpy(buf, mqe->data + sizeof(struct nvfw_gsp_rpc), len - sizeof(struct nvfw_gsp_rpc)); However, for the wrapped part, it uses buf + len: expected -= len; if (expected) { mqe = (void *)((u8 *)gsp->shm.msgq.ptr + 0x1000 + 0 * 0x1000); memcpy(buf + len, mqe, expected); } Since the first memcpy skipped 32 bytes of header, wouldn't using buf + len leave a 32-byte gap of uninitialized memory in the destination buffer, and subsequently overwrite past the end of the allocated buffer by 32 bytes? > - if (IS_ERR_OR_NULL(rpc)) { > + if (IS_ERR(rpc)) { [Severity: High] This isn't a bug introduced by this patch, but does the lack of a minimum size check on rpc->length allow an integer underflow? If the GSP provides an rpc->length that is smaller than the RPC header size (sizeof(struct nvfw_gsp_rpc) which is 32 bytes), wouldn't the subtraction in r535_gsp_msgq_recv_one_elem(): len - sizeof(struct nvfw_gsp_rpc) underflow to a massive unsigned integer, passing an enormous size to memcpy() and causing an out-of-bounds read? > kvfree(buf); > return rpc; > } > @@ -499,7 +499,7 @@ r535_gsp_msg_recv(struct nvkm_gsp *gsp, int fn, u32 > gsp_rpc_len) > > retry: > rpc = r535_gsp_msgq_peek(gsp, sizeof(*rpc), &retries); > - if (IS_ERR_OR_NULL(rpc)) > + if (IS_ERR(rpc)) > return rpc; > > rpc = r535_gsp_msgq_recv(gsp, gsp_rpc_len, &retries); [Severity: High] This is a pre-existing issue, but is there a similar underflow risk here in r535_gsp_msg_recv() when calling notification callbacks? Later in r535_gsp_msg_recv(), the length is subtracted again: if (ntfy->func) ntfy->func(ntfy->priv, ntfy->fn, rpc->data, rpc->length - sizeof(*rpc)); Without a lower-bound validation on rpc->length, could this also underflow and pass a massive size to the notification callbacks? -- Sashiko AI review ยท https://sashiko.dev/#/patchset/[email protected]?part=2
