On Tue, Oct 7, 2025 at 10:55 PM Pillai, Aurabindo <[email protected]> wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hi Mikhail,
>
> schedule_dc_vmin_vmax() has an allocation which is incorrectly using
> GFP_KERNEL, which is likely the reason for the "sleeping function called from
> invalid context". We have a fix queued for this week's update (switching it
> to GFP_NOWAIT).
>

Hi,

Just a quick update regarding the second WARN I mentioned earlier,
triggered at drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:138
amdgpu_vm_set_pasid().

After some additional bisecting, I found that this warning first
appears in the merge commit:

342f141ba9f4c9e39de342d047a5245e8f4cda19
Merge: 0faeb8cf99c0 a490c8d77d50
Author: Dave Airlie <[email protected]>
Date:   Mon Sep 22 08:44:52 2025 +1000

    Merge tag 'amd-drm-next-6.18-2025-09-19' of
    https://gitlab.freedesktop.org/agd5f/linux into drm-next

Both merge parents (0faeb8cf from drm-next and a490c8d7 from
amd-drm-next) are clean on my setup: no WARNs or other regressions.

It turns out that this WARN is triggered by an interaction between the
two sides of the merge. The AMD branch introduced the new
amdgpu_vm_assert_locked(vm) check inside amdgpu_vm_set_pasid(), while
the drm-next side still contained a code path (for example, through
amdgpu_driver_open_kms()) that calls amdgpu_vm_set_pasid() without
holding the expected reservation lock.

As a result, the merge commit combined these two changes and started
hitting the dma_resv_assert_held() check in that function.

Both parents on their own are fine, so this is a merge-only side
effect: the stricter locking assertion from AMD’s branch met an older
call path from drm-next that doesn’t yet satisfy it.

I verified that removing just the amdgpu_vm_assert_locked(vm) call from
amdgpu_vm_set_pasid() eliminates the WARN completely, while keeping all
other recent VM locking changes intact.

--
Best Regards,
Mike Gavrilov.
