On Wed, May 20, 2026 at 03:21:49PM +0530, Tauro, Riana wrote:
> On 5/15/2026 1:58 AM, Raag Jadav wrote:
> > cleanup_node_param() is not registered in case of counter allocation
> > failure, which results in stale memory of previous node that isn't
> > cleaned up on unwind. Fix this using drm managed allocation, which is
> > guaranteed to be cleaned up on unwind.
> >
> > Fixes: b40db12b542f ("drm/xe/xe_drm_ras: Add support for XE DRM RAS")
> > Signed-off-by: Raag Jadav <[email protected]>
> > ---
> > drivers/gpu/drm/xe/xe_drm_ras.c | 5 +----
> > 1 file changed, 1 insertion(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_drm_ras.c b/drivers/gpu/drm/xe/xe_drm_ras.c
> > index c21c8b428de6..89640ffb1c33 100644
> > --- a/drivers/gpu/drm/xe/xe_drm_ras.c
> > +++ b/drivers/gpu/drm/xe/xe_drm_ras.c
> > @@ -80,7 +80,7 @@ static struct xe_drm_ras_counter *allocate_and_copy_counters(struct xe_device *x
> > struct xe_drm_ras_counter *counter;
> > int i;
> > - counter = kcalloc(DRM_XE_RAS_ERR_COMP_MAX, sizeof(*counter), GFP_KERNEL);
> > + counter = drmm_kcalloc(&xe->drm, DRM_XE_RAS_ERR_COMP_MAX, sizeof(*counter), GFP_KERNEL);
> > if (!counter)
> > return ERR_PTR(-ENOMEM);
>
> The intention was to clean up nodes if there is a failure, to prevent memory
> from persisting throughout the drm device lifecycle. We actually discussed
> this offline afair.
>
> So there was a change from v5 to v6 in the initial patch [v6,2/5]
> drm/xe/xe_drm_ras: Add support for XE DRM RAS - Patchwork
> <https://patchwork.freedesktop.org/patch/704873/?series=155188&rev=6>

Yes, the idea was to prevent the driver from updating a stale counter
(which isn't exposed to the user). But rethinking it now, this can
probably be achieved by simply keeping info as NULL?
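
To illustrate the "keep info as NULL" idea, here is a minimal user-space
sketch. It is not the xe code: struct ras, struct counter, register_node(),
setup_node() and bump_counter() are hypothetical stand-ins, and the counter
array is assumed to be owned elsewhere (as a drm managed allocation would
be). The counters are published through info[] only after registration
succeeds, and the update path skips any severity whose info is still NULL.

```c
#include <assert.h>
#include <stddef.h>

#define SEVERITY_MAX 2	/* hypothetical number of severities */
#define COMP_MAX 4	/* hypothetical number of components */

struct counter {
	unsigned long value;
};

struct ras {
	/* NULL means "no registered node for this severity" */
	struct counter *info[SEVERITY_MAX];
};

/* Stand-in for node registration; fails when 'fail' is nonzero. */
static int register_node(int fail)
{
	return fail ? -1 : 0;
}

/*
 * Publish the counters only once registration has succeeded. On failure
 * info[severity] is left NULL; the memory itself is assumed to be
 * released by its owner later.
 */
static int setup_node(struct ras *ras, int severity,
		      struct counter *counters, int fail_register)
{
	int ret = register_node(fail_register);

	if (ret)
		return ret;

	ras->info[severity] = counters;
	return 0;
}

/* Update path: never touch counters that were not published. */
static int bump_counter(struct ras *ras, int severity, int comp)
{
	struct counter *info = ras->info[severity];

	assert(comp >= 0 && comp < COMP_MAX);

	if (!info)
		return -1;

	info[comp].value++;
	return 0;
}
```

With this shape, a failed registration needs no explicit free on unwind, and
the unexposed counters are never updated.
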
Raag
> > @@ -135,9 +135,6 @@ static void cleanup_node_param(struct xe_drm_ras *ras, const enum drm_xe_ras_err
> > {
> > struct drm_ras_node *node = &ras->node[severity];
> > - kfree(ras->info[severity]);
> > - ras->info[severity] = NULL;
> > -
> > kfree(node->device_name);
> > node->device_name = NULL;
> > }