Reviewed-by: YiPeng Chai <[email protected]>

Best Regards,
Thomas
________________________________
From: Xie, Chenglei <[email protected]>
Sent: Tuesday, May 12, 2026 3:29 AM
To: [email protected] <[email protected]>
Cc: Chan, Hing Pong <[email protected]>; Luo, Zhigang <[email protected]>; Deucher, Alexander <[email protected]>; Xie, Chenglei <[email protected]>; Chai, Thomas <[email protected]>
Subject: [PATCH] drm/amdgpu: bound SR-IOV RAS CPER dump parsing against used_size

The VF copies a PF-provided CPER telemetry blob and walks records using
cper_dump->count and each entry's record_length. count is u64 while the
loop used u32, so a large count could loop indefinitely. record_length
was not limited to the kmemdup'd region, so the first iteration could
read far past the allocation; record_length == 0 could spin forever on
the same entry. Together that allowed a malicious hypervisor to leak
heap past the blob into the CPER ring or hang the guest.

Require used_size to cover the fixed header before buf and stay within
the telemetry cap. Track remaining bytes in buf, cap iterations with u64
and CPER_MAX_ALLOWED_COUNT, and reject record_length outside
[sizeof(cper_hdr), remaining] before writing to the ring.

Signed-off-by: Chenglei Xie <[email protected]>
Change-Id: Ic21f4523eebc6c4b4f8c6b62b84104b18cf86a48
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 6974b1c5b56c2..c8bec62bdffb2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -1798,13 +1798,15 @@ amdgpu_virt_write_cpers_to_ring(struct amdgpu_device *adev,
 	struct amd_sriov_ras_cper_dump *cper_dump = NULL;
 	struct cper_hdr *entry = NULL;
 	struct amdgpu_ring *ring = &adev->cper.ring_buf;
-	uint32_t checksum, used_size, i;
+	uint32_t checksum, used_size;
+	u64 remaining, cnt, i;
 	int ret = 0;
 
 	checksum = host_telemetry->header.checksum;
 	used_size = host_telemetry->header.used_size;
 
-	if (used_size > (AMD_SRIOV_MSG_RAS_TELEMETRY_SIZE_KB_V1 << 10))
+	if (used_size < offsetof(struct amd_sriov_ras_cper_dump, buf) ||
+	    used_size > (AMD_SRIOV_MSG_RAS_TELEMETRY_SIZE_KB_V1 << 10))
 		return -EINVAL;
 
 	cper_dump = kmemdup(&host_telemetry->body.cper_dump, used_size, GFP_KERNEL);
@@ -1829,11 +1831,19 @@ amdgpu_virt_write_cpers_to_ring(struct amdgpu_device *adev,
 	}
 
 	entry = (struct cper_hdr *)&cper_dump->buf[0];
+	remaining = (u64)used_size - offsetof(struct amd_sriov_ras_cper_dump, buf);
+	cnt = min_t(u64, cper_dump->count, CPER_MAX_ALLOWED_COUNT);
+
+	for (i = 0; i < cnt; i++) {
+		if (entry->record_length < sizeof(struct cper_hdr) ||
+		    entry->record_length > remaining) {
+			ret = -EINVAL;
+			goto out;
+		}
 
-	for (i = 0; i < cper_dump->count; i++) {
 		amdgpu_cper_ring_write(ring, entry, entry->record_length);
-		entry = (struct cper_hdr *)((char *)entry +
-					    entry->record_length);
+		remaining -= entry->record_length;
+		entry = (struct cper_hdr *)((char *)entry + entry->record_length);
 	}
 
 	if (cper_dump->overflow_count)
-- 
2.34.1
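
For readers following the thread, the bounded walk described in the commit message can be sketched in plain userspace C roughly as below. This is only an illustration under stated assumptions, not the driver code: struct demo_hdr, DEMO_MAX_COUNT and demo_walk() are hypothetical stand-ins for struct cper_hdr, CPER_MAX_ALLOWED_COUNT and the loop in amdgpu_virt_write_cpers_to_ring(), and the header layout and cap value are made up. The sketch also checks that a full header still fits before reading it, which is a choice of this sketch rather than something taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct cper_hdr; the real layout differs. */
struct demo_hdr {
	uint32_t record_length;	/* total record size, header included */
	uint32_t pad;
};

/* Hypothetical stand-in for CPER_MAX_ALLOWED_COUNT; the value is made up. */
#define DEMO_MAX_COUNT 64u

/*
 * Walk up to `count` records stored back to back in buf[0..len).
 * Returns the number of records accepted, or -1 as soon as a record
 * is shorter than its header or longer than the bytes still remaining.
 */
int demo_walk(const unsigned char *buf, size_t len, uint64_t count)
{
	/* Cap the untrusted 64-bit count, as the patch does with min_t(). */
	uint64_t cnt = count < DEMO_MAX_COUNT ? count : DEMO_MAX_COUNT;
	uint64_t remaining = len;
	uint64_t i;
	size_t off = 0;

	for (i = 0; i < cnt; i++) {
		struct demo_hdr hdr;

		/* Sketch-only check: a whole header must fit before it is read. */
		if (remaining < sizeof(hdr))
			return -1;
		memcpy(&hdr, buf + off, sizeof(hdr));

		/* Reject lengths outside [sizeof(hdr), remaining]. */
		if (hdr.record_length < sizeof(hdr) ||
		    hdr.record_length > remaining)
			return -1;

		/* A real consumer would copy hdr.record_length bytes here. */
		remaining -= hdr.record_length;
		off += hdr.record_length;
	}
	return (int)i;
}
```

Because every accepted record is at least one header long, each iteration consumes bytes from `remaining`, so a zero record_length can no longer keep the loop on the same entry.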
