The user receives POLLPRI correctly only the first time the poll
syscall is used. After that, the user never receives the event signal
again.

Reproduce case:
1. Get the monitor code in Documentation/accounting/psi.txt
2. Run it, and wait for the event to trigger.
3. Kill and restart the process.

If the user doesn't kill the monitor process, poll_work seems to work
fine. After killing and restarting the monitor, poll_work in the kernel
never runs again because poll_scheduled is left with the wrong value.
Therefore, we should reset the value, as group_init() does, after the
last trigger is destroyed.

[PATCH V2]
In patch v2, I put the atomic_set(&group->poll_scheduled, 0); into the
right place. Here I quote Johannes for the best explanation:
"The question is why we can end up with poll_scheduled = 1 but the work
not running (which would reset it to 0). And the answer is because the
scheduling side sees group->poll_kworker under RCU protection and then
schedules it, but here we cancel the work and destroy the worker. The
cancel needs to pair with resetting the poll_scheduled flag."

Signed-off-by: Jason Xing <kerneljasonx...@linux.alibaba.com>
Reviewed-by: Caspar Zhang <cas...@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph...@linux.alibaba.com>
Reviewed-by: Suren Baghdasaryan <sur...@google.com>
Acked-by: Johannes Weiner <han...@cmpxchg.org>
---
 kernel/sched/psi.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 7acc632..acdada0 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1131,7 +1131,14 @@ static void psi_trigger_destroy(struct kref *ref)
 	 * deadlock while waiting for psi_poll_work to acquire trigger_lock
 	 */
 	if (kworker_to_destroy) {
+		/*
+		 * After the RCU grace period has expired, the worker
+		 * can no longer be found through group->poll_kworker.
+		 * But it might have been already scheduled before
+		 * that - deschedule it cleanly before destroying it.
+		 */
 		kthread_cancel_delayed_work_sync(&group->poll_work);
+		atomic_set(&group->poll_scheduled, 0);
 		kthread_destroy_worker(kworker_to_destroy);
 	}
 	kfree(t);
-- 
1.8.3.1