Actually, the drm_dev_enter in psp_cmd_submit_buf
does not protect anything.
And it is not used to prevent concurrent access.
If DRM device is unplugged, it will always check the condition in WARN_ON.
We'd better not keep adding commands to the list.
Simply moving the drm_dev_enter/drm_dev_exit higher level
will not solve the issue.
Because psp_cmd_submit_buf is called in many places, such as
psp_hw_init-->psp_load_fw, psp_suspend-->psp_xgmi_terminate,
amdgpu_device_gpu_recover-->amdgpu_ras_suspend.
So drop drm_dev_enter/drm_dev_exit in psp_cmd_submit_buf.

When removing amdgpu, the calling order as follows:
amdgpu_pci_remove
        drm_dev_unplug
        amdgpu_driver_unload_kms
                amdgpu_device_fini_hw
                        amdgpu_device_ip_fini_early
                                psp_hw_fini
                                        psp_ras_terminate
                                                psp_ta_unloadye
                                                        psp_cmd_submit_buf

[ 4507.740388] Call Trace:
[ 4507.740389]  <TASK>
[ 4507.740391]  psp_ta_unload+0x44/0x70 [amdgpu]
[ 4507.740485]  psp_ras_terminate+0x4d/0x70 [amdgpu]
[ 4507.740575]  psp_hw_fini+0x28/0xa0 [amdgpu]
[ 4507.740662]  amdgpu_device_fini_hw+0x328/0x442 [amdgpu]
[ 4507.740791]  amdgpu_driver_unload_kms+0x51/0x60 [amdgpu]
[ 4507.740875]  amdgpu_pci_remove+0x5a/0x140 [amdgpu]
[ 4507.740962]  ? _raw_spin_unlock_irqrestore+0x27/0x43
[ 4507.740965]  ? __pm_runtime_resume+0x60/0x90
[ 4507.740968]  pci_device_remove+0x39/0xb0
[ 4507.740971]  device_remove+0x46/0x70
[ 4507.740972]  device_release_driver_internal+0xd1/0x160
[ 4507.740974]  driver_detach+0x4a/0x90
[ 4507.740975]  bus_remove_driver+0x6c/0xf0
[ 4507.740976]  driver_unregister+0x31/0x50
[ 4507.740977]  pci_unregister_driver+0x40/0x90
[ 4507.740978]  amdgpu_exit+0x15/0x120 [amdgpu]

Signed-off-by: lyndonli <lyndon...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 17 +----------------
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 4c617faaa7c9..02f948adae72 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -603,27 +603,14 @@ psp_cmd_submit_buf(struct psp_context *psp,
                   struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
 {
        int ret;
-       int index, idx;
+       int index;
        int timeout = 20000;
        bool ras_intr = false;
        bool skip_unsupport = false;
-       bool dev_entered;
 
        if (psp->adev->no_hw_access)
                return 0;
 
-       dev_entered = drm_dev_enter(adev_to_drm(psp->adev), &idx);
-       /*
-        * We allow sending PSP messages LOAD_ASD and UNLOAD_TA without 
acquiring
-        * a lock in drm_dev_enter during driver unload because we must call
-        * drm_dev_unplug as the beginning  of unload driver sequence . It is 
very
-        * crucial that userspace can't access device instances anymore.
-        */
-       if (!dev_entered)
-               WARN_ON(psp->cmd_buf_mem->cmd_id != GFX_CMD_ID_LOAD_ASD &&
-                       psp->cmd_buf_mem->cmd_id != GFX_CMD_ID_UNLOAD_TA &&
-                       psp->cmd_buf_mem->cmd_id != GFX_CMD_ID_INVOKE_CMD);
-
        memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
 
        memcpy(psp->cmd_buf_mem, cmd, sizeof(struct psp_gfx_cmd_resp));
@@ -687,8 +674,6 @@ psp_cmd_submit_buf(struct psp_context *psp,
        }
 
 exit:
-       if (dev_entered)
-               drm_dev_exit(idx);
        return ret;
 }
 
-- 
2.34.1

Reply via email to