Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
New issues: - [Medium] Multicast events are restricted to `init_net`, silently dropping events for listeners in other network namespaces despite `.netnsok = true`. Pre-existing issues: - [High] Lockless access to `struct drm_ras_node` pointers retrieved from xarray leads to Use-After-Free. -- commit aab90b1a7792dc491d0dcac6947d349046f95876 Author: Riana Tauro <[email protected]> drm/drm_ras: Add drm_ras netlink error event This commit defines a new netlink event 'error-event' and a multicast group 'error-report' for the DRM RAS subsystem. It provides a function for drivers to report device and node error information to subscribed userspace listeners. Cc: Jakub Kicinski <[email protected]> Cc: Zack McKevitt <[email protected]> Cc: Lijo Lazar <[email protected]> Cc: Hawking Zhang <[email protected]> Cc: David S. Miller <[email protected]> Cc: Paolo Abeni <[email protected]> Cc: Eric Dumazet <[email protected]> Signed-off-by: Riana Tauro <[email protected]> Reviewed-by: Raag Jadav <[email protected]> > diff --git a/drivers/gpu/drm/drm_ras.c b/drivers/gpu/drm/drm_ras.c > index d6eab29a1394c..77f912a4d101f 100644 > --- a/drivers/gpu/drm/drm_ras.c > +++ b/drivers/gpu/drm/drm_ras.c [ ... ] > @@ -222,6 +255,60 @@ static int doit_reply_value(struct genl_info *info, u32 > node_id, > return genlmsg_reply(msg, info); > } > > +/** > + * drm_ras_nl_error_event() - Report an error event > + * @node: Node structure > + * @error_id: ID of the error > + * @error_name: Name of the error > + * @value: Value associated with the error > + * @flags: GFP flags for memory allocation > + * > + * Report an error-event to userspace using the error-report multicast group. > + * > + * Return: 0 on success, or negative errno on failure. > + */ > +int drm_ras_nl_error_event(struct drm_ras_node *node, u32 error_id, const > char *error_name, > + u32 value, gfp_t flags) > +{ > + struct genl_info info; > + struct sk_buff *msg; > + struct nlattr *hdr; > + int ret; > + > + if (!error_name) > + return -EINVAL; > + > + if (!genl_has_listeners(&drm_ras_nl_family, &init_net, > DRM_RAS_NLGRP_ERROR_REPORT)) [Severity: Medium] Will this silently drop events for users subscribing from non-init network namespaces? The drm_ras_nl_family sets .netnsok = true, but this check is hardcoded to init_net. > + return 0; > + > + genl_info_init_ntf(&info, &drm_ras_nl_family, DRM_RAS_CMD_ERROR_EVENT); > + > + msg = genlmsg_new(NLMSG_GOODSIZE, flags); > + if (!msg) > + return -ENOMEM; > + > + hdr = genlmsg_iput(msg, &info); > + if (!hdr) { > + ret = -EMSGSIZE; > + goto free_msg; > + } > + > + ret = msg_put_error_event_attrs(msg, node, error_id, error_name, value); > + if (ret) > + goto cancel_msg; > + > + genlmsg_end(msg, hdr); > + genlmsg_multicast(&drm_ras_nl_family, msg, 0, > DRM_RAS_NLGRP_ERROR_REPORT, flags); [Severity: Medium] Similarly, genlmsg_multicast() implicitly uses init_net under the hood. Does this prevent containerized listeners in other network namespaces from receiving the multicast events? > + return 0; > + > +cancel_msg: > + genlmsg_cancel(msg, hdr); > +free_msg: > + nlmsg_free(msg); > + return ret; > +} > +EXPORT_SYMBOL(drm_ras_nl_error_event); [Severity: High] This isn't a bug introduced by this patch, but does locklessly accessing drm_ras_node pointers retrieved from drm_ras_xa lead to a use-after-free? In drivers/gpu/drm/drm_ras.c, get_node_error_counter() retrieves the node: drivers/gpu/drm/drm_ras.c:get_node_error_counter() { ... node = xa_load(&drm_ras_xa, node_id); if (!node) return -ENOENT; ... return node->query_error_counter(node, error_id, name, value); } A similar lockless pattern is present in drm_ras_nl_list_nodes_dumpit(): drivers/gpu/drm/drm_ras.c:drm_ras_nl_list_nodes_dumpit() { ... xa_for_each_start(&drm_ras_xa, id, node, ctx->restart) { ... ret = nla_put_string(skb, DRM_RAS_A_NODE_ATTRS_DEVICE_NAME, node->device_name); ... } If a concurrent device teardown or hot-unplug removes and frees the node while these functions are executing, will accessing node->query_error_counter or node->device_name dereference a freed pointer? -- Sashiko AI review ยท https://sashiko.dev/#/patchset/[email protected]?part=1
