Am 15.11.2010 21:53, Philippe Gerum wrote:
> On Mon, 2010-11-15 at 20:54 +0100, Jan Kiszka wrote:
>> Am 15.11.2010 20:31, Jan Kiszka wrote:
>>> Hi Philippe,
>>>
>>> debugging some variant of I-pipe over an x86-32 target, I think I found
>>> some fairly old flaw in the IRQ virtualization that causes rescheduling
>>> delays (up to deadlocks) for Linux:
>>>
>>> - we are in sysenter_tail (other exit paths should be affected as well)
>>> - we DISABLE_INTERRUPTS, but only virtually
>>> - we go past "testl $_TIF_ALLWORK_MASK, %ecx", nothing to be done
>>> - an IRQ for Linux arrives, it is pushed to the backlog
>>> - __ipipe_unstall_iret_root replays the IRQ as the regs we are about to
>>>   return to have IF set (obviously, we return from a syscall)
>>> - the Linux IRQ handler sets _TIF_NEED_RESCHED, but doesn't perform the
>>>   work on return as __ipipe_sync_stage set the stall flag for the Linux
>>>   domain before calling the handler
>>> - but now the preempted sysenter return also does no reschedule as it
>>>   already passed the check - bang!
>>>
>>> Another variant of this Linux rescheduling issue:
>>>
>>> - we are in a lengthy loop inside the kernel, but we are preemptible
>>>   most of the time
>>> - after disabling Linux IRQs briefly, we are calling
>>>   local_irq_enable() again
>>> - in the meantime, we received a Linux IRQ which is now pending in the
>>>   backlog
>>> - __ipipe_unstall_root triggers __ipipe_sync_stage
>>> - Linux handler is called, sets NEED_RESCHED but does not reschedule
>>>   (see above)
>>> - we do not test for resched again as we are not returning to user
>>>   space, and that for quite some time - bang!
>>
>> And this one actually affects x86-64 as well: Here, ret_from_intr checks
>> for NEED_RESCHED only if IF is set in the flags of the preempted
>> context. But if the Linux domain is alone, we enter __ipipe_do_root_xirq
>> with hard IRQs disabled, thus we push this information incorrectly on
>> the Linux handler stack, preventing the required rescheduling check.
>>
>> I guess the simplest fix for this is to drop the
>> "!__ipipe_pipeline_head_p(ipd)" check from __ipipe_sync_stage.
>>
> 
> We can't do that unfortunately, we would introduce significant latency
> when the interrupt is to be delivered to the high priority domain. This
> check makes sure to keep irqs off when delivering to Xenomai for
> instance, since we don't want the handler to be delayed for pushing
> non-rt interrupts to the log while handling a rt event.
> 
> Not that pretty, but this would be innocuous latency-wise:
> 
> diff --git a/arch/x86/include/asm/ipipe_64.h b/arch/x86/include/asm/ipipe_64.h
> index b9367f6..a3b4fed 100644
> --- a/arch/x86/include/asm/ipipe_64.h
> +++ b/arch/x86/include/asm/ipipe_64.h
> @@ -71,6 +71,7 @@ static inline void __do_root_xirq(ipipe_irq_handler_t 
> handler,
>                            "pushq $0\n\t"
>                            "pushq %%rax\n\t"
>                            "pushfq\n\t"
> +                          "orq   %[x86if],(%%rsp)\n\t"
>                            "pushq %[kernel_cs]\n\t"
>                            "pushq $__xirq_end\n\t"
>                            "pushq %[vector]\n\t"
> @@ -91,7 +92,8 @@ static inline void __do_root_xirq(ipipe_irq_handler_t 
> handler,
>                            : /* no output */
>                            : [kernel_cs] "i" (__KERNEL_CS),
>                              [vector] "rm" (regs->orig_ax),
> -                            [handler] "r" (handler), "D" (regs)
> +                            [handler] "r" (handler), "D" (regs),
> +                            [x86if] "i" (X86_EFLAGS_IF)
>                            : "rax");
>  }
>  
> @@ -109,6 +111,7 @@ static inline void __do_root_virq(ipipe_irq_handler_t 
> handler,
>                            "pushq $0\n\t"
>                            "pushq %%rax\n\t"
>                            "pushfq\n\t"
> +                          "orq   %[x86if],(%%rsp)\n\t"
>                            "pushq %[kernel_cs]\n\t"
>                            "pushq $__virq_end\n\t"
>                            "pushq $-1\n\t"
> @@ -125,7 +128,8 @@ static inline void __do_root_virq(ipipe_irq_handler_t 
> handler,
>                            "call  *%[handler]\n\t"
>                            : /* no output */
>                            : [kernel_cs] "i" (__KERNEL_CS),
> -                            [handler] "r" (handler), "D" (irq), "S" (cookie)
> +                            [handler] "r" (handler), "D" (irq), "S" (cookie),
> +                            [x86if] "i" (X86_EFLAGS_IF)
>                            : "rax");
>       irq_exit();
>       __asm__ __volatile__("cli\n\t"

Of course, this works as well. The alternative would be
"!__ipipe_pipeline_head_p(ipd) || ipd == ipipe_root_domain" - just a bit
shorter.

Jan

Attachment: signature.asc
Description: OpenPGP digital signature

_______________________________________________
Adeos-main mailing list
[email protected]
https://mail.gna.org/listinfo/adeos-main

Reply via email to