Define a new netlink event 'error-event' and a new multicast group
'error-notify' in drm_ras. Each event carries the device name, node and
error information needed to identify the error that triggered it.

Add drm_ras_nl_error_event() so that a driver can trigger an event.

Wire this support into xe drm_ras to notify userspace whenever a GT or
SoC error occurs on PVC.

  $ sudo ./tools/net/ynl/pyynl/cli.py --family drm_ras --output-json \
        --subscribe error-notify
  {
      "name": "error-event",
      "msg": {
          "device-name": "0000:03:00.0",
          "node-id": 1,
          "node-name": "uncorrectable-errors",
          "error-id": 1,
          "error-name": "core-compute",
          "error-value": 1
      }
  }
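
For illustration only, a userspace consumer could reduce each notification
to a one-line summary. The sketch below is not part of this series: the
helper name summarize_error_event() and the REQUIRED_FIELDS tuple are
hypothetical, and the attribute names are taken from the sample output
above. It assumes the caller has already split the CLI output into one
JSON document per event.

```python
import json

# Attribute names copied from the sample notification above. This tuple is
# an illustrative assumption, not something defined by the series.
REQUIRED_FIELDS = ("device-name", "node-id", "node-name",
                   "error-id", "error-name", "error-value")


def summarize_error_event(raw):
    """Return a one-line summary of an 'error-event' notification.

    `raw` is a string holding a single JSON document. Notifications with
    any other name yield None; an 'error-event' that lacks one of the
    expected attributes raises ValueError.
    """
    notification = json.loads(raw)
    if notification.get("name") != "error-event":
        return None

    msg = notification.get("msg", {})
    missing = [field for field in REQUIRED_FIELDS if field not in msg]
    if missing:
        raise ValueError(f"error-event is missing fields: {missing}")

    return (f"{msg['device-name']}: {msg['node-name']}/"
            f"{msg['error-name']} = {msg['error-value']}")


SAMPLE = """
{
    "name": "error-event",
    "msg": {
        "device-name": "0000:03:00.0",
        "node-id": 1,
        "node-name": "uncorrectable-errors",
        "error-id": 1,
        "error-name": "core-compute",
        "error-value": 1
    }
}
"""

if __name__ == "__main__":
    print(summarize_error_event(SAMPLE))
```

Returning None for unrelated notifications keeps the helper usable on a
socket that is subscribed to more than one multicast group.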

Riana Tauro (2):
  drm/drm_ras: Add drm_ras netlink error event
  drm/xe/xe_drm_ras: Add error-event support in XE drm_ras

 Documentation/gpu/drm-ras.rst            | 21 ++++++
 Documentation/netlink/specs/drm_ras.yaml | 50 ++++++++++++++
 drivers/gpu/drm/drm_ras.c                | 86 ++++++++++++++++++++++++
 drivers/gpu/drm/drm_ras_nl.c             |  6 ++
 drivers/gpu/drm/drm_ras_nl.h             |  4 ++
 drivers/gpu/drm/xe/xe_drm_ras.c          | 29 ++++++++
 drivers/gpu/drm/xe/xe_drm_ras.h          |  6 ++
 drivers/gpu/drm/xe/xe_hw_error.c         |  7 +-
 include/drm/drm_ras.h                    |  5 ++
 include/uapi/drm/drm_ras.h               | 15 +++++
 10 files changed, 228 insertions(+), 1 deletion(-)

--
2.47.1