On Fri, Nov 20, 2015 at 1:05 PM, Anton Ivanov
<anton.iva...@kot-begemot.co.uk> wrote:
> I have gotten to the bottom of this.
>
> 1. The IRQ handler re-entrancy issue predates the timer patch. Adding a
> simple guard with a WARN_ON_ONCE around the device loop in
> sigio_handler() catches it in plain 4.3:
>
> diff --git a/arch/um/kernel/irq.c b/arch/um/kernel/irq.c
> index 23cb935..ac0bbce 100644
> --- a/arch/um/kernel/irq.c
> +++ b/arch/um/kernel/irq.c
> @@ -30,12 +30,17 @@ static struct irq_fd **last_irq_ptr = &active_fds;
>
>   extern void free_irqs(void);
>
> +static int in_poll_handler = 0;
> +
>   void sigio_handler(int sig, struct siginfo *unused_si, struct uml_pt_regs *regs)
>   {
>          struct irq_fd *irq_fd;
>          int n;
>
> +    WARN_ON_ONCE(in_poll_handler == 1);
> +
>          while (1) {
> +        in_poll_handler = 1;
>                  n = os_waiting_for_events(active_fds);
>                  if (n <= 0) {
>                          if (n == -EINTR)
> @@ -51,6 +56,7 @@ void sigio_handler(int sig, struct siginfo *unused_si, struct uml_pt_regs *regs)
>                          }
>                  }
>          }
> +    in_poll_handler = 0;
>
>          free_irqs();
>   }
>
> This is dangerously broken: under heavy IO you can exhaust the stack,
> get packets out of order, etc. Most IO is reasonably atomic, so
> corruption is unlikely but not impossible (especially if one or more
> drivers are optimized to use multi-read/multi-write).
>
> 2. I cannot find what is wrong with the current code in signal.c. As
> I read it, it should not produce re-entrancy. But it does.
>
> 3. I found 2-3 minor issues with signal handling and the timer patch,
> for which I will submit a hot-fix, including a proper fix for the
> hang-in-sleep issue.
>
> 4. While I can propose a brutal patch for signal.c that sets guards
> against re-entrancy and works fine, I suggest we actually get to the
> bottom of this. Why does the code in unblock_signals() not guard
> correctly against that?
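
For readers following along, the kind of guard discussed in points 1 and 4 can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the UML code: the names guarded_handler, install_guarded_handler and run_reentrancy_demo are hypothetical, SIGUSR1 stands in for SIGIO, and SA_NODEFER is used purely to make nested delivery possible so that the guard has something to catch. Unlike the WARN_ON_ONCE in the diff, this handler refuses to recurse rather than only warning.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical state for the sketch; none of these names exist in UML. */
static volatile sig_atomic_t in_handler;     /* 1 while the handler body runs */
static volatile sig_atomic_t reentry_count;  /* nested entries that were caught */
static volatile sig_atomic_t handled_count;  /* completed handler passes */

static void guarded_handler(int sig)
{
    if (in_handler) {
        /* Nested delivery: record it and return instead of recursing. */
        reentry_count++;
        return;
    }
    in_handler = 1;

    /* Simulate a second signal arriving mid-handler (first pass only). */
    if (handled_count == 0)
        raise(sig);

    handled_count++;
    in_handler = 0;
}

/*
 * Install the handler with SA_NODEFER so the signal is NOT blocked while
 * the handler runs; this is what allows the nested delivery in the demo.
 * Returns 0 on success, -1 on failure.
 */
static int install_guarded_handler(int sig)
{
    struct sigaction sa;

    sa.sa_handler = guarded_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_NODEFER;
    return sigaction(sig, &sa, NULL);
}

/* Runs the demo once; returns the number of nested entries caught. */
static int run_reentrancy_demo(void)
{
    if (install_guarded_handler(SIGUSR1) != 0)
        return -1;
    raise(SIGUSR1);
    assert(in_handler == 0); /* the outer pass must have finished */
    return (int)reentry_count;
}
```

Calling run_reentrancy_demo() from a single-threaded program delivers the signal once from the caller and once from inside the handler; the guard turns the nested delivery into a counted, early return. A flag like this only hides the symptom, which is why finding the actual cause in unblock_signals() still matters.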

Thanks for hunting this issue.
I fear I'll have to grab my speleologist's hat to figure out why UML
works this way.
Cc'ing Al, do you have an idea?

-- 
Thanks,
//richard
