mshv_irqfd_assign() adds the irqfd to the partition's hlist and then
registers the wait entry on the eventfd waitqueue via vfs_poll(). A
narrow window exists between these two operations in which the irqfd is
visible to deactivation paths but its wait entry has not yet been added
to the waitqueue.

This window is not reachable today because mshv_irqfd_assign() and
mshv_irqfd_deassign() are serialized by the partition mutex, and the
EPOLLHUP wakeup path can only fire after vfs_poll() has registered the
wait entry. However, if future refactoring removes or relaxes that
serialization, mshv_irqfd_shutdown() could call
eventfd_ctx_remove_wait_queue() before the wait entry is on the queue.
The entry's list_head is zeroed by kzalloc() and is not initialized by
init_waitqueue_func_entry(), so unlinking it would dereference a NULL
pointer.

Add synchronize_srcu_expedited() at the start of mshv_irqfd_shutdown()
as a defensive measure, ensuring the assignment path's SRCU read-side
section (which covers vfs_poll() registration) has completed. This
follows the pattern established by KVM in irqfd_shutdown().

Signed-off-by: Stanislav Kinsburskii <[email protected]>
---
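Note for reviewers: the failure mode described in the commit message can
be illustrated with a minimal userspace sketch. It is not kernel code.
struct fake_wait_entry, list_del_checked() and
demo_unregistered_removal() are hypothetical names invented for this
illustration, and the list helpers only mimic the shape of the kernel's
list_head operations. The sketch assumes that removal unlinks the entry
by writing through its next/prev pointers, which is why a zeroed entry
cannot be removed safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical stand-in for the wait entry embedded in an irqfd. */
struct fake_wait_entry {
	struct list_head entry;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Link 'new' right after 'head', as a waitqueue registration would. */
static void list_add_head(struct list_head *new, struct list_head *head)
{
	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

/*
 * Unlinking writes through e->next and e->prev. On a zeroed entry both
 * are NULL, so an unchecked removal would fault. This checked variant
 * reports the problem instead of crashing.
 */
static bool list_del_checked(struct list_head *e)
{
	if (e->next == NULL || e->prev == NULL)
		return false;
	e->next->prev = e->prev;
	e->prev->next = e->next;
	e->next = NULL;
	e->prev = NULL;
	return true;
}

/*
 * Returns true when removal before registration is rejected and
 * removal after registration succeeds.
 */
static bool demo_unregistered_removal(void)
{
	struct list_head waitqueue;
	/* calloc() zeroes the allocation, like kzalloc(). */
	struct fake_wait_entry *w = calloc(1, sizeof(*w));
	bool early, late;

	if (!w)
		return false;

	init_list_head(&waitqueue);
	early = list_del_checked(&w->entry);	/* shutdown ran first */
	list_add_head(&w->entry, &waitqueue);	/* registration */
	late = list_del_checked(&w->entry);	/* shutdown ran second */
	free(w);

	return !early && late;
}
```

In the patch itself no such check is added. The SRCU synchronization is
intended to guarantee ordering instead, so that shutdown always observes
a registered entry.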
 drivers/hv/mshv_eventfd.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/hv/mshv_eventfd.c b/drivers/hv/mshv_eventfd.c
index 5995a62aff8d8..3ab6338064237 100644
--- a/drivers/hv/mshv_eventfd.c
+++ b/drivers/hv/mshv_eventfd.c
@@ -248,8 +248,12 @@ static void mshv_irqfd_shutdown(struct work_struct *work)
 {
        struct mshv_irqfd *irqfd =
                        container_of(work, struct mshv_irqfd, irqfd_shutdown);
+       struct mshv_partition *pt = irqfd->irqfd_partn;
        u64 cnt;
 
+       /* Make sure irqfd has been initialized in assign path. */
+       synchronize_srcu_expedited(&pt->pt_irq_srcu);
+
        /*
         * Synchronize with the wait-queue and unhook ourselves to prevent
         * further events.


