Thank you for your contribution! Sashiko AI review found 1 potential issue to
consider:
- [Medium] ras_send_error_event() passes internal hardware-specific XE enums to 
xe_ras_get_counter() instead of the expected DRM enums, resulting in queries 
for the wrong error counters.
--

commit dc2a656220167a5ce967cead702288ba6f6f06e4
Author: Riana Tauro <[email protected]>

drm/xe/xe_ras: Add error-event support for CRI

This commit adds error-event support for correctable errors in CRI,
reporting errors that cross a threshold to userspace via interrupts.

> diff --git a/drivers/gpu/drm/xe/xe_ras.c b/drivers/gpu/drm/xe/xe_ras.c
> --- a/drivers/gpu/drm/xe/xe_ras.c
> +++ b/drivers/gpu/drm/xe/xe_ras.c

[ ... ]

> @@ -131,6 +161,27 @@ static inline const char *comp_to_str(u8 component)
>       return xe_ras_components[component];
>  }
>  
> +static void ras_send_error_event(struct xe_device *xe, u8 severity, u8 component)
> +{
> +     u8 drm_severity, drm_component;
> +     u32 value;
> +     int ret;
> +
> +     drm_severity = xe_to_drm_ras_severity(severity);
> +     if (drm_severity == DRM_XE_RAS_ERR_SEV_MAX)
> +             return;
> +
> +     drm_component = xe_to_drm_ras_component(component);
> +     if (drm_component == DRM_XE_RAS_ERR_COMP_MAX)
> +             return;
> +
> +     ret = xe_ras_get_counter(xe, severity, component, &value);

[Severity: Medium]
Does this pass the wrong enum type to xe_ras_get_counter?

In this function, severity and component are the raw XE hardware enums. We
convert them to DRM enums just above as drm_severity and drm_component.

Looking at xe_ras_get_counter(), it expects the DRM enums because it converts
them back to XE enums internally:

int xe_ras_get_counter(struct xe_device *xe, u8 severity, u8 component, u32 *value)
{
        struct xe_ras_error_class counter = {0};

        counter.common.severity = drm_to_xe_ras_severity(severity);
        counter.common.component = drm_to_xe_ras_component(component);
        ...
}

Should this be passing drm_severity and drm_component instead to ensure the
correct telemetry counters are read?
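
To illustrate why the enum domain matters, here is a minimal standalone
sketch. Every name and numeric value in it is hypothetical (the real XE and
DRM enum definitions are not shown in this patch excerpt); it only models a
lookup that, like xe_ras_get_counter(), takes a DRM value and converts it to
an XE value internally.

```c
/*
 * Toy model only: the enum names and values below are assumptions made
 * for illustration and are not the real XE or DRM definitions.
 */
enum toy_xe_sev {
	TOY_XE_SEV_CORRECTABLE = 1,
	TOY_XE_SEV_UNCORRECTABLE = 2,
	TOY_XE_SEV_MAX
};

enum toy_drm_sev {
	TOY_DRM_SEV_CORRECTABLE = 0,
	TOY_DRM_SEV_UNCORRECTABLE = 1,
	TOY_DRM_SEV_MAX
};

/* Hardware-side value -> uAPI-side value. */
static int toy_xe_to_drm_sev(int xe_sev)
{
	switch (xe_sev) {
	case TOY_XE_SEV_CORRECTABLE:
		return TOY_DRM_SEV_CORRECTABLE;
	case TOY_XE_SEV_UNCORRECTABLE:
		return TOY_DRM_SEV_UNCORRECTABLE;
	default:
		return TOY_DRM_SEV_MAX;
	}
}

/* uAPI-side value -> hardware-side value. */
static int toy_drm_to_xe_sev(int drm_sev)
{
	switch (drm_sev) {
	case TOY_DRM_SEV_CORRECTABLE:
		return TOY_XE_SEV_CORRECTABLE;
	case TOY_DRM_SEV_UNCORRECTABLE:
		return TOY_XE_SEV_UNCORRECTABLE;
	default:
		return TOY_XE_SEV_MAX;
	}
}

/*
 * Same shape as the callee quoted above: the argument is expected to be
 * a DRM value, and the return value is the XE class that gets queried.
 */
static int toy_counter_class(int drm_sev)
{
	return toy_drm_to_xe_sev(drm_sev);
}
```

With these made-up values, calling toy_counter_class() with the raw
TOY_XE_SEV_CORRECTABLE selects the uncorrectable class, while passing
toy_xe_to_drm_sev(TOY_XE_SEV_CORRECTABLE) selects the correctable one.
Whether the real driver misreads in the same way depends on the actual enum
values, which this sketch does not know.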

> +     if (ret)
> +             return;
> +
> +     xe_drm_ras_event(xe, drm_component, drm_severity, value, GFP_KERNEL);
> +}
> +

-- 
Sashiko AI review ·
https://sashiko.dev/#/patchset/[email protected]?part=3
