Thank you for your contribution! Sashiko AI review found 1 potential issue(s) 
to consider:
- [High] Double-unpin of shared shadow PPGTT memory due to conflict between 
explicit error handling and unconditional cleanup.
--

commit 4d9fa23878c7d84ea81831549d03a73dfab721d4
Author: Wentao Liang <[email protected]>

drm/i915/gvt: fix refcount leak in intel_vgpu_shadow_mm_pin()

This commit adds an explicit unpin call to the error path of
intel_vgpu_shadow_mm_pin() when intel_vgpu_pin_mm() fails. The goal is to
balance the pincount increment that unconditionally happens at the start
of intel_vgpu_pin_mm().

> diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c 
> b/drivers/gpu/drm/i915/gvt/scheduler.c
> index 15fdd514ca836..30d879b37c84d 100644
> --- a/drivers/gpu/drm/i915/gvt/scheduler.c
> +++ b/drivers/gpu/drm/i915/gvt/scheduler.c
> @@ -698,6 +698,7 @@ intel_vgpu_shadow_mm_pin(struct intel_vgpu_workload 
> *workload)
>  
>       ret = intel_vgpu_pin_mm(workload->shadow_mm);
>       if (ret) {
> +             intel_vgpu_unpin_mm(workload->shadow_mm);

[Severity: High]
Does this explicit unpin introduce a double-unpin of the shadow mm?

Looking at the workload dispatch loop in workload_thread(), if
intel_vgpu_shadow_mm_pin() returns an error, the thread jumps to the
completion path:

workload_thread()
    ...
    ret = prepare_workload(workload);
    if (ret)
        goto complete;
    ...
complete:
    complete_current_workload(gvt, engine->id);

The cleanup path then unconditionally unpins the shadow mm:

complete_current_workload()
    ...
    intel_vgpu_shadow_mm_unpin(workload);

Before this patch, a failure in intel_vgpu_pin_mm() incremented the
pincount, and the unconditional cleanup in complete_current_workload()
decremented it, maintaining balance. 

By adding this explicit unpin in the error path, could the pincount
decrement twice and underflow?

Since shadow page tables are shared among workloads for the same vGPU,
would an underflow allow the GVT shrinker to prematurely destroy active
shadow page tables while another workload is still executing, potentially
causing GPU page faults or VM crashes?

>               gvt_vgpu_err("fail to vgpu pin mm\n");
>               return ret;
>       }

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=1

Reply via email to