+srcu folks

Please don't post subsequent versions In-Reply-To previous versions, it tends to
muck up tooling.

On Mon, Mar 23, 2026, Sonam Sanju wrote:
> irqfd_resampler_shutdown() and kvm_irqfd_assign() both call
> synchronize_srcu_expedited() while holding kvm->irqfds.resampler_lock.
> This can deadlock when multiple irqfd workers run concurrently on the
> kvm-irqfd-cleanup workqueue during VM teardown or when VMs are rapidly
> created and destroyed:
>
>   CPU A (mutex holder)               CPU B/C/D (mutex waiters)
>   irqfd_shutdown()                   irqfd_shutdown() / kvm_irqfd_assign()
>    irqfd_resampler_shutdown()         irqfd_resampler_shutdown()
>     mutex_lock(resampler_lock)  <---- mutex_lock(resampler_lock) //BLOCKED
>     list_del_rcu(...)                  ...blocked...
>     synchronize_srcu_expedited()       // Waiters block workqueue,
>       // waits for SRCU grace          preventing SRCU grace
>       // period which requires         period from completing
>       // workqueue progress            --- DEADLOCK ---
>
> In irqfd_resampler_shutdown(), the synchronize_srcu_expedited() in
> the else branch is called directly within the mutex.  In the if-last
> branch, kvm_unregister_irq_ack_notifier() also calls
> synchronize_srcu_expedited() internally.  In kvm_irqfd_assign(),
> synchronize_srcu_expedited() is called after list_add_rcu() but
> before mutex_unlock().  All paths can block indefinitely because:
>
>  1. synchronize_srcu_expedited() waits for an SRCU grace period
>  2. SRCU grace period completion needs workqueue workers to run
>  3. The blocked mutex waiters occupy workqueue slots preventing progress

Unless I'm misunderstanding the bug, "fixing" this in KVM is papering over an
underlying flaw.  Essentially, this would be establishing a rule that
synchronize_srcu_expedited() can *never* be called while holding a mutex.
That's not viable.

>  4. The mutex holder never releases the lock -> deadlock

