Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Unconditional emission of spurious zero-delta RAS netlink events for unsupported or uncounted errors.
--

commit 64cb8a4b396dc586545ef894e855aeb177686e56
Author: Riana Tauro <[email protected]>

drm/xe/xe_drm_ras: Add error-event support for PVC

This patch reports drm_ras error events to userspace when an error occurs.
It also adds support for core-compute and SoC errors in PVC.

> diff --git a/drivers/gpu/drm/xe/xe_hw_error.c b/drivers/gpu/drm/xe/xe_hw_error.c
> index 4a4b363fc844b..a833cecc74ec5 100644
> --- a/drivers/gpu/drm/xe/xe_hw_error.c
> +++ b/drivers/gpu/drm/xe/xe_hw_error.c

[ ... ]

> @@ -495,6 +495,9 @@ static void hw_error_source_handler(struct xe_tile *tile, const enum hardware_er
>  			gt_hw_error_handler(tile, hw_err, error_id);
>  		if (err_bit == XE_SOC_ERROR)
>  			soc_hw_error_handler(tile, hw_err, error_id);
> +
> +		value = atomic_read(&info[error_id].counter);
> +		xe_drm_ras_event(xe, error_id, severity, value, GFP_ATOMIC);

[Severity: Medium]
Does this unconditionally emit spurious zero-delta RAS netlink events for
unsupported or uncounted errors?

When gt_hw_error_handler() or soc_hw_error_handler() is called, it might
not increment the error counter. For example, gt_hw_error_handler() returns
early for non-PVC platforms:

drivers/gpu/drm/xe/xe_hw_error.c:gt_hw_error_handler() {
    ...
    if (xe->info.platform != XE_PVC)
        return;
    ...
}

Similarly, soc_hw_error_handler() skips counter increments for correctable
errors:

drivers/gpu/drm/xe/xe_hw_error.c:soc_hw_error_handler() {
    ...
    if (hw_err == HARDWARE_ERROR_CORRECTABLE) {
        xe_mmio_write32(mmio, SOC_GLOBAL_ERR_STAT_REG(master, hw_err),
                        REG_GENMASK(31, 0));
        ...
        goto unmask_gsysevtctl;
    }
    ...
}

In these cases, xe_drm_ras_event() is still called with the unchanged
counter value. Could this cause userspace to receive confusing zero-delta
event updates?

>  	}
>
>  clear_reg:

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/[email protected]?part=2
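
One possible shape for avoiding the zero-delta event, shown as a standalone
userspace sketch rather than driver code: sample the counter before and after
the handler runs and emit only if it changed. Everything here is a
hypothetical stand-in (struct err_info, handle_error(), emit_event(),
handle_and_report()); emit_event() only mimics the role of
xe_drm_ras_event(), and C11 atomics replace the kernel's atomic_t. The sketch
is single-threaded, so it does not address concurrent increments.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's per-error counter entry. */
struct err_info {
	atomic_int counter;
};

/* Counts calls to the stub below so the behavior can be checked. */
static int events_emitted;

/* Hypothetical stub playing the role of xe_drm_ras_event(). */
static void emit_event(int error_id, int value)
{
	events_emitted++;
	printf("event: id=%d value=%d\n", error_id, value);
}

/*
 * Hypothetical handler that may return without counting the error,
 * mirroring the early-return paths described in the review.
 */
static void handle_error(struct err_info *info, bool counted)
{
	if (!counted)
		return;
	atomic_fetch_add(&info->counter, 1);
}

/* Emit an event only when the handler actually changed the counter. */
static bool handle_and_report(struct err_info *info, int error_id, bool counted)
{
	int before = atomic_load(&info->counter);
	int after;

	handle_error(info, counted);
	after = atomic_load(&info->counter);
	if (after == before)
		return false;	/* nothing was counted, so stay silent */

	emit_event(error_id, after);
	return true;
}
```

An alternative with the same effect would be for the handlers to return
whether they counted the error, which avoids reading the counter twice.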
