I have gotten to the bottom of this.

1. The IRQ handler re-entrancy issue predates the timer patch. Adding a simple guard with a WARN_ON_ONCE around the device loop in sigio_handler() catches it in plain 4.3:

diff --git a/arch/um/kernel/irq.c b/arch/um/kernel/irq.c
index 23cb935..ac0bbce 100644
--- a/arch/um/kernel/irq.c
+++ b/arch/um/kernel/irq.c
@@ -30,12 +30,17 @@ static struct irq_fd **last_irq_ptr = &active_fds;
 
 extern void free_irqs(void);
 
+static int in_poll_handler = 0;
+
 void sigio_handler(int sig, struct siginfo *unused_si, struct uml_pt_regs *regs)
 {
 	struct irq_fd *irq_fd;
 	int n;
 
+	WARN_ON_ONCE(in_poll_handler == 1);
+
 	while (1) {
+		in_poll_handler = 1;
 		n = os_waiting_for_events(active_fds);
 		if (n <= 0) {
 			if (n == -EINTR)
@@ -51,6 +56,7 @@ void sigio_handler(int sig, struct siginfo *unused_si, struct uml_pt_regs *regs)
 			}
 		}
 	}
+	in_poll_handler = 0;
 
 	free_irqs();
 }

This is dangerously broken: under heavy IO you can exhaust the stack, you can get packets out of order, etc. Most IO is reasonably atomic, so corruption is unlikely but not impossible (especially if one or more drivers are optimized to use multi-read/multi-write).

2. I cannot see what is wrong with the current code in signal.c. As I read it, it should not allow re-entrancy. But it does.

3. I found 2-3 minor issues with signal handling and the timer patch, for which I will submit a hot-fix, including a proper fix for the hang-in-sleep issue.

4. While I can propose a brute-force patch for signal.c which sets guards against re-entrancy and works fine, I suggest we actually get to the bottom of this. Why does the code in unblock_signals() not guard correctly against that?

A.
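
P.S. To make the question in point 4 concrete, here is a minimal user-space model of the behaviour I would expect from a defer-and-drain guard. It is only a sketch under assumptions: every model_* name is hypothetical, no real signals are involved, and it is not the code in signal.c. An event that arrives while the handler body is running is only recorded as pending, and the drain loop runs the body again afterwards with delivery still disabled, so the nesting depth of the body should never exceed 1.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model state; these are not the kernel's actual variables. */
static int model_enabled = 1;   /* 1: handler may run now; 0: defer */
static int model_pending = 0;   /* a deferred event is waiting */
static int model_depth = 0;     /* current nesting of the handler body */
static int model_max_depth = 0; /* deepest nesting ever observed */
static int model_runs = 0;      /* how many times the body ran */
static int model_inject = 0;    /* events still to inject from inside the body */

static void model_deliver(void);

/* Stand-in for the device loop: must never be entered twice at once. */
static void model_handler_body(void)
{
	model_depth++;
	if (model_depth > model_max_depth)
		model_max_depth = model_depth;
	model_runs++;

	/* Simulate a new event arriving while the body is still running. */
	if (model_inject > 0) {
		model_inject--;
		model_deliver();
	}
	model_depth--;
}

/* Drain deferred events; the body always runs with delivery disabled. */
static void model_unblock(void)
{
	while (model_pending) {
		model_pending = 0;
		model_enabled = 0;
		model_handler_body();
	}
	model_enabled = 1;
}

/* Stand-in for signal entry: defer if disabled, otherwise run and drain. */
static void model_deliver(void)
{
	if (!model_enabled) {
		model_pending = 1;
		return;
	}
	model_enabled = 0;
	model_handler_body();
	model_unblock();
}

/* Deliver one event that triggers `nested` further events mid-handler. */
int model_run(int nested)
{
	model_enabled = 1;
	model_pending = 0;
	model_depth = model_max_depth = model_runs = 0;
	model_inject = nested;
	model_deliver();
	assert(model_depth == 0); /* every entry to the body was matched by an exit */
	printf("runs=%d max_depth=%d\n", model_runs, model_max_depth);
	return model_max_depth;
}
```

Calling model_run(3) from a main() of your own delivers one event plus three injected from inside the handler. In this model all four are handled sequentially. The in_poll_handler guard in the diff above fires in exactly the situation where a real implementation deviates from this, i.e. where the body is entered while a previous invocation is still on the stack.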