On Fri, Jun 05, 2026 at 09:17:19AM +0800, Jie Gan wrote:
>
>
> On 6/4/2026 4:42 PM, Vishnu Santhosh wrote:
> > During driver detach, the device core holds the device mutex throughout
> > the driver's remove callback chain. When the rpmsg endpoint is
> > destroyed as part of that teardown, the GLINK endpoint destroy
> > implementation attempts to unregister the underlying rpmsg device.
> > That unregistration calls device_del(), which tries to re-acquire the
> > same device mutex already held higher up the stack, causing rmmod to
> > hang indefinitely.
> >
> > The deadlock manifests with the following call chain:
> >
> > [<0>] device_del+0x44/0x414 <- tries to acquire same mutex
> > [<0>] device_unregister+0x18/0x34
> > [<0>] rpmsg_unregister_device+0x28/0x4c
> > [<0>] qcom_glink_remove_rpmsg_device+0x70/0xc0
> > [<0>] qcom_glink_destroy_ept+0x58/0xbc
> > [<0>] rpmsg_dev_remove+0x50/0x60
> > [<0>] device_remove+0x4c/0x80
> > [<0>] device_release_driver_internal+0x1cc/0x228 <- acquires device mutex
> > [<0>] driver_detach+0x4c/0x98
> > [<0>] bus_remove_driver+0x6c/0xbc
> > [<0>] driver_unregister+0x30/0x60
> > [<0>] unregister_rpmsg_driver+0x10/0x1c
> > [<0>] fastrpc_exit+0x28/0x38 [fastrpc]
> > [<0>] __arm64_sys_delete_module+0x1b8/0x294
> > [<0>] invoke_syscall+0x48/0x10c
> > [<0>] el0_svc_common.constprop.0+0xc0/0xe0
> > [<0>] do_el0_svc+0x1c/0x28
> > [<0>] el0_svc+0x34/0x108
> > [<0>] el0t_64_sync_handler+0xa0/0xe4
> > [<0>] el0t_64_sync+0x198/0x19c
> >
> > The rpmsg device unregistration inside endpoint destroy is redundant.
> > In both contexts where endpoint destruction is triggered:
> >
> > - Driver detach path: the driver core already tears down the rpmsg
> > device.
> >
> > - Channel close path: the rpmsg device is already unregistered before
> > endpoint destruction is reached.
> >
> > Remove the redundant unregistration to fix the deadlock.
> >
>
> Fixes: a53e356df548 ("rpmsg: glink: fix rpmsg device leak")
>
Reviewed-by: Dmitry Baryshkov <[email protected]>
--
With best wishes
Dmitry