Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
Pre-existing issues: - [High] Corrupt or oversized GSP messages cause NULL pointer dereferences or false successes because they return NULL instead of an error pointer. - [High] Leaking the original large RPC payload message in `r535_gsp_rpc_push()` on error paths during chunked transmission. -- commit be006c1667937e9a2375afa2299598f63205492b Author: Hongling Zeng <[email protected]> Subject: nouveau/gsp/rm/alloc: Cleanup IS_ERR_OR_NULL usage This commit changes the error checking in r535_gsp_rpc_rm_free() to use IS_ERR() instead of IS_ERR_OR_NULL() for the return value of nvkm_gsp_rpc_get(). > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/alloc.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/alloc.c > index 46e3a29f2ad7f..c9f86c0e9b255 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/alloc.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/alloc.c [ ... ] > @@ -35,7 +35,7 @@ r535_gsp_rpc_rm_free(struct nvkm_gsp_object *object) > client->object.handle, object->handle); > > rpc = nvkm_gsp_rpc_get(gsp, NV_VGPU_MSG_FUNCTION_FREE, sizeof(*rpc)); > - if (WARN_ON(IS_ERR_OR_NULL(rpc))) > + if (WARN_ON(IS_ERR(rpc))) [Severity: High] This isn't a bug introduced by this patch, but while we are refining the use of IS_ERR() for RPC messages, could corrupt or oversized GSP messages cause a NULL pointer dereference or false success elsewhere? Looking at r535_gsp_msgq_recv() in drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c, if an RPC message length exceeds the maximum size, it issues a WARN_ON and returns NULL instead of an ERR_PTR: r535_gsp_msgq_recv() { ... if (WARN_ON(rpc->length > max_rpc_size)) return NULL; ... } This NULL propagates up through r535_gsp_msg_recv() and nvkm_gsp_rpc_push() back to wrappers. Callers expecting a payload use IS_ERR() to check for errors, but since IS_ERR(NULL) is false, do these callers incorrectly treat the failure as a success? For example, in r535_gsp_get_static_info(): r535_gsp_get_static_info() { ... rpc = nvkm_gsp_rpc_rd(gsp, NV_VGPU_MSG_FUNCTION_GET_GSP_STATIC_INFO, sizeof(*rpc)); if (IS_ERR(rpc)) return PTR_ERR(rpc); gsp->internal.client.object.client = &gsp->internal.client; gsp->internal.client.object.parent = NULL; gsp->internal.client.object.handle = rpc->hInternalClient; ... } Could this bypass lead to a direct NULL pointer dereference, or cause silent hardware operation failures in functions like nvkm_gsp_rpc_wr() and nvkm_gsp_rm_alloc_wr() that similarly only check IS_ERR()? [Severity: High] This is also a pre-existing issue, but while reviewing the RPC push logic, is there a memory leak in r535_gsp_rpc_push() for large payloads on error paths? When the payload size exceeds the maximum, r535_gsp_rpc_push() splits it into continuation chunks. The original message memory, allocated by nvkm_gsp_rpc_get(), is not passed to the command queue (which would normally take ownership). Instead, r535_gsp_rpc_push() attempts to free it manually only upon successful reply handling: r535_gsp_rpc_push() { ... /* Wait for reply. */ repv = r535_gsp_rpc_handle_reply(gsp, fn, policy, payload_size + sizeof(*rpc)); if (!IS_ERR(repv)) kvfree(msg); } else { repv = r535_gsp_rpc_send(gsp, payload, policy, gsp_rpc_len); } done: mutex_unlock(&gsp->cmdq.mutex); return repv; } If sending the chunks fails or if the reply contains an error (such as the GSP rejecting the command), the function jumps to the done label and returns without freeing the message. Since callers typically overwrite their pointer with the return value of nvkm_gsp_rpc_push() and assume ownership was passed, does this permanently leak the original payload memory if userspace issues large RPCs that get rejected? > return -EIO; > > rpc->params.hRoot = client->object.handle; -- Sashiko AI review ยท https://sashiko.dev/#/patchset/[email protected]?part=3
