Thank you for your contribution! Sashiko AI review found 1 potential issue(s) 
to consider:
- [Medium] The `threshold` and `reserved` fields are incorrectly swapped in 
`struct xe_ras_set_threshold_response`.
--

commit b88a93d0104e4bd53859ba2c4b041330bc44ceef
Author: Raag Jadav <[email protected]>

drm/xe/ras: Add support for error threshold

System controller allows getting/setting per counter threshold for
correctable errors, which it uses to raise error events to the driver.
Get/set it using the respective mailbox command.

> diff --git a/drivers/gpu/drm/xe/xe_ras_types.h 
> b/drivers/gpu/drm/xe/xe_ras_types.h
> index 6688e11f57a86..747b651880cd5 100644
> --- a/drivers/gpu/drm/xe/xe_ras_types.h
> +++ b/drivers/gpu/drm/xe/xe_ras_types.h
[ ... ]
> +struct xe_ras_get_threshold_response {
> +     /** @counter: Counter ID */
> +     struct xe_ras_error_class counter;
> +     /** @threshold: Current threshold of the counter */
> +     u32 threshold;
> +     /** @reserved: Reserved for future use */
> +     u32 reserved[4];
> +} __packed;
[ ... ]
> +struct xe_ras_set_threshold_request {
> +     /** @counter: Counter to set threshold for */
> +     struct xe_ras_error_class counter;
> +     /** @threshold: Threshold to be set */
> +     u32 threshold;
> +     /** @reserved: Reserved for future use */
> +     u32 reserved;
> +} __packed;
> +
> +/**
> + * struct xe_ras_set_threshold_response - Response structure for set 
> threshold
> + */
> +struct xe_ras_set_threshold_response {
> +     /** @counter: Counter ID */
> +     struct xe_ras_error_class counter;
> +     /** @reserved: Reserved */
> +     u32 reserved;
> +     /** @threshold: Updated threshold */
> +     u32 threshold;
> +     /** @status: Operation status */
> +     u32 status;

[Severity: Medium]
Are the reserved and threshold fields swapped in this structure compared to
the firmware specification?

In xe_ras_set_threshold_request and xe_ras_get_threshold_response, the
threshold payload immediately follows the counter struct. Placing reserved
before threshold here shifts the threshold and status fields to different
offsets.

If this is a mistake, it could cause ras_status_to_errno() to read the
wrong offset and silently ignore firmware errors when xe_ras_set_threshold()
checks response.status, as well as log an incorrect threshold value.

> +     /** @reserved1: Reserved for future use */
> +     u32 reserved1[2];
> +} __packed;
> +
>  #endif

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=3

Reply via email to