On Tue, Oct 7, 2025 at 10:55 PM Pillai, Aurabindo
<[email protected]> wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hi Mikhail,
>
> schedule_dc_vmin_vmax() has an allocation which is incorrectly using 
> GFP_KERNEL, which is likely the reason for the "sleeping function called from 
> invalid context". We have a fix queued for this week's update (switching it 
> to GFP_NOWAIT).
>

Hi,

Just a quick update regarding the second WARN I mentioned earlier,
triggered at drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:138 amdgpu_vm_set_pasid().

After some additional bisecting, I found that this warning first appears
in the merge commit:
342f141ba9f4c9e39de342d047a5245e8f4cda19
Merge: 0faeb8cf99c0 a490c8d77d50
Author: Dave Airlie <[email protected]>
Date:   Mon Sep 22 08:44:52 2025 +1000
    Merge tag 'amd-drm-next-6.18-2025-09-19' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

Both merge parents (0faeb8cf from drm-next and a490c8d7 from amd-drm-next)
are clean on my setup: no WARNs or other regressions.

It turns out that this WARN is triggered by an interaction between the
two sides of the merge.
The AMD branch introduced the new amdgpu_vm_assert_locked(vm) check inside
amdgpu_vm_set_pasid(), while the drm-next side still contained a code path
(for example, through amdgpu_driver_open_kms()) that calls
amdgpu_vm_set_pasid() without holding the expected reservation lock.

As a result, the merge commit combines the two changes and starts hitting
the dma_resv_assert_held() check in that function.
Each parent is fine on its own, so this is a merge-only side effect:
the stricter locking assertion from AMD’s branch meets an older call path
from drm-next that doesn’t yet satisfy it.
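
To illustrate the interaction, here is a minimal userspace sketch in C. It is
not kernel code and not the actual amdgpu implementation: every name in it
(model_vm, model_vm_assert_locked, model_open_unlocked and so on) is
hypothetical, and a plain bool stands in for the reservation lock. It only
shows how a newly added lock assertion and an older unlocked caller can each
be harmless alone but warn once they are combined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace model only; all names here are hypothetical. */
struct model_vm {
    bool resv_held;      /* stands in for the VM's reservation lock */
    unsigned int pasid;
};

/* Counts how many times the modeled assertion fired. */
static int warn_count;

/* Stand-in for a lock assertion: warn and continue, as WARN does. */
static void model_vm_assert_locked(const struct model_vm *vm)
{
    if (!vm->resv_held) {
        warn_count++;
        fprintf(stderr, "WARN: set_pasid called without the lock held\n");
    }
}

/* One side of the merge: the setter now checks the lock. */
static void model_vm_set_pasid(struct model_vm *vm, unsigned int pasid)
{
    assert(vm != NULL);
    model_vm_assert_locked(vm);
    vm->pasid = pasid;
}

/* Other side of the merge: an older caller that never takes the lock. */
static void model_open_unlocked(struct model_vm *vm, unsigned int pasid)
{
    model_vm_set_pasid(vm, pasid);
}

/* A caller that satisfies the assertion by holding the lock around the call. */
static void model_open_locked(struct model_vm *vm, unsigned int pasid)
{
    vm->resv_held = true;
    model_vm_set_pasid(vm, pasid);
    vm->resv_held = false;
}
```

In this model, model_open_locked() leaves warn_count untouched, while
model_open_unlocked() increments it once per call. The PASID is still set in
both cases, since the assertion only reports the problem.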

I verified that removing just the amdgpu_vm_assert_locked(vm) call
from amdgpu_vm_set_pasid() eliminates the WARN completely,
while keeping all other recent VM locking changes intact.

-- 
Best Regards,
Mike Gavrilov.
