Re: [Freedreno] [PATCH] drm/msm/a6xx: Fix GMU lockdep splat
Hi,

On Thu, Aug 3, 2023 at 10:34 AM Rob Clark wrote:
>
> From: Rob Clark
>
> For normal GPU devfreq, we need to acquire the GMU lock while already
> holding devfreq locks.  But in the teardown path, we were calling
> dev_pm_domain_detach() while already holding the GMU lock, resulting in
> this lockdep splat:
>
> [...]
[Freedreno] [PATCH] drm/msm/a6xx: Fix GMU lockdep splat
From: Rob Clark

For normal GPU devfreq, we need to acquire the GMU lock while already
holding devfreq locks.  But in the teardown path, we were calling
dev_pm_domain_detach() while already holding the GMU lock, resulting in
this lockdep splat:

   ======================================================
   WARNING: possible circular locking dependency detected
   6.4.3-debug+ #3 Not tainted
   ------------------------------------------------------
   ring0/391 is trying to acquire lock:
   ff80a025c078 (&devfreq->lock){+.+.}-{3:3}, at: qos_notifier_call+0x30/0x74

   but task is already holding lock:
   ff809b8c1ce8 (&(c->notifiers)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x34/0x78

   which lock already depends on the new lock.

   the existing dependency chain (in reverse order) is:

   -> #4 (&(c->notifiers)->rwsem){++++}-{3:3}:
          down_write+0x58/0x74
          __blocking_notifier_chain_register+0x64/0x84
          blocking_notifier_chain_register+0x1c/0x28
          freq_qos_add_notifier+0x5c/0x7c
          dev_pm_qos_add_notifier+0xd4/0xf0
          devfreq_add_device+0x42c/0x560
          devm_devfreq_add_device+0x6c/0xb8
          msm_devfreq_init+0xa8/0x16c [msm]
          msm_gpu_init+0x368/0x54c [msm]
          adreno_gpu_init+0x248/0x2b0 [msm]
          a6xx_gpu_init+0x2d0/0x384 [msm]
          adreno_bind+0x264/0x2bc [msm]
          component_bind_all+0x124/0x1f4
          msm_drm_bind+0x2d0/0x5f4 [msm]
          try_to_bring_up_aggregate_device+0x88/0x1a4
          __component_add+0xd4/0x128
          component_add+0x1c/0x28
          dp_display_probe+0x37c/0x3c0 [msm]
          platform_probe+0x70/0xc0
          really_probe+0x148/0x280
          __driver_probe_device+0xfc/0x114
          driver_probe_device+0x44/0x100
          __device_attach_driver+0x64/0xdc
          bus_for_each_drv+0xb0/0xd8
          __device_attach+0xe4/0x140
          device_initial_probe+0x1c/0x28
          bus_probe_device+0x44/0xb0
          deferred_probe_work_func+0xb0/0xc8
          process_one_work+0x288/0x3d8
          worker_thread+0x1f0/0x260
          kthread+0xf0/0x100
          ret_from_fork+0x10/0x20

   -> #3 (dev_pm_qos_mtx){+.+.}-{3:3}:
          __mutex_lock+0xc8/0x388
          mutex_lock_nested+0x2c/0x38
          dev_pm_qos_remove_notifier+0x3c/0xc8
          genpd_remove_device+0x40/0x11c
          genpd_dev_pm_detach+0x88/0x130
          dev_pm_domain_detach+0x2c/0x3c
          a6xx_gmu_remove+0x44/0xdc [msm]
          a6xx_destroy+0x7c/0xa4 [msm]
          adreno_unbind+0x50/0x64 [msm]
          component_unbind+0x44/0x64
          component_unbind_all+0xb4/0xbc
          msm_drm_uninit.isra.0+0x124/0x17c [msm]
          msm_drm_bind+0x340/0x5f4 [msm]
          try_to_bring_up_aggregate_device+0x88/0x1a4
          __component_add+0xd4/0x128
          component_add+0x1c/0x28
          dp_display_probe+0x37c/0x3c0 [msm]
          platform_probe+0x70/0xc0
          really_probe+0x148/0x280
          __driver_probe_device+0xfc/0x114
          driver_probe_device+0x44/0x100
          __device_attach_driver+0x64/0xdc
          bus_for_each_drv+0xb0/0xd8
          __device_attach+0xe4/0x140
          device_initial_probe+0x1c/0x28
          bus_probe_device+0x44/0xb0
          deferred_probe_work_func+0xb0/0xc8
          process_one_work+0x288/0x3d8
          worker_thread+0x1f0/0x260
          kthread+0xf0/0x100
          ret_from_fork+0x10/0x20

   -> #2 (&a6xx_gpu->gmu.lock){+.+.}-{3:3}:
          __mutex_lock+0xc8/0x388
          mutex_lock_nested+0x2c/0x38
          a6xx_gpu_set_freq+0x38/0x64 [msm]
          msm_devfreq_target+0x170/0x18c [msm]
          devfreq_set_target+0x90/0x1e4
          devfreq_update_target+0xb4/0xf0
          update_devfreq+0x1c/0x28
          devfreq_monitor+0x3c/0x10c
          process_one_work+0x288/0x3d8
          worker_thread+0x1f0/0x260
          kthread+0xf0/0x100
          ret_from_fork+0x10/0x20

   -> #1 (&df->lock){+.+.}-{3:3}:
          __mutex_lock+0xc8/0x388
          mutex_lock_nested+0x2c/0x38
          msm_devfreq_get_dev_status+0x4c/0x104 [msm]
          devfreq_simple_ondemand_func+0x5c/0x128
          devfreq_update_target+0x68/0xf0
          update_devfreq+0x1c/0x28
          devfreq_monitor+0x3c/0x10c
          process_one_work+0x288/0x3d8
          worker_thread+0x1f0/0x260
          kthread+0xf0/0x100
          ret_from_fork+0x10/0x20

   -> #0 (&devfreq->lock){+.+.}-{3:3}:
          __lock_acquire+0xdf8/0x109c
          lock_acquire+0x234/0x284
          __mutex_lock+0xc8/0x388
          mutex_lock_nested+0x2c/0x38
          qos_notifier_call+0x30/0x74
          qos_min_notifier_call+0x1c/0x28
          notifier_call_chain+0xf4/0x114
          blocking_notifier_call_chain+0x4c/0x78
          pm_qos_update_target+0x184/0x190
          freq_qos_apply+0x4c/0x64
          apply_constraint+0xf8/0xfc
          __dev_pm_qos_update_request+0x138/0x164
          dev_pm_qos_update_request+0x44/0x68
          msm_devfreq_boost+0x40/0x70 [msm]
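
The diff itself is cut off in this excerpt, but the #3 chain above
(a6xx_destroy() -> a6xx_gmu_remove() -> dev_pm_domain_detach(), which takes
dev_pm_qos_mtx via genpd) shows the shape of the problem: the power-domain
detach runs while the GMU lock is held, inverting the devfreq-path ordering
where gmu.lock is only taken under the devfreq locks.  A minimal sketch of
that kind of reordering, assuming a teardown path like the one in the trace
(the wrapper function and the 'pd' argument are placeholders for
illustration, not the actual patch):

    /* Sketch only -- not the actual patch. */
    static void a6xx_teardown_sketch(struct a6xx_gpu *a6xx_gpu, struct device *pd)
    {
            /* Do the teardown work that genuinely needs the GMU lock... */
            mutex_lock(&a6xx_gpu->gmu.lock);
            a6xx_gmu_remove(a6xx_gpu);  /* assumes the pd detach is pulled out of here */
            mutex_unlock(&a6xx_gpu->gmu.lock);

            /*
             * ...and detach the power domain only after gmu.lock is dropped,
             * so dev_pm_qos_mtx (and the notifier rwsem behind it) is never
             * taken with the GMU lock held.
             */
            dev_pm_domain_detach(pd, false);
    }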