Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-15 15:10, Jan Kiszka wrote:
> But... right now it looks like we found our primary regression: "nucleus/shadow: shorten the uninterruptible path to secondary mode". It opens a short window during relax where the migrated task may be active under both schedulers. We are currently evaluating a revert (looks good so far), and I need to work out my theory in more detail.

Looks like this commit just made a long-standing flaw in Xenomai's interrupt handling more visible: We reschedule over the interrupt stack in the Xenomai interrupt handler tails, at least on x86-64. Not sure if other archs have interrupt stacks, the point is Xenomai's design wrongly assumes there are no such things.

We were lucky so far that the values saved on this shared stack were apparently compatible, meaning we were overwriting them with identical or harmless values. But that's no longer true when interrupts are hitting us in the xnpod_suspend_thread path of a relaxing shadow.

Likely the only possible fix is establishing a reschedule hook for Xenomai in the interrupt exit path after the original stack is restored - just like Linux works. Requires changes to both ipipe and Xenomai unfortunately.

Jan

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-16 10:52, Philippe Gerum wrote:
> On Sat, 2011-07-16 at 10:13 +0200, Jan Kiszka wrote:
>> On 2011-07-15 15:10, Jan Kiszka wrote:
>>> But... right now it looks like we found our primary regression: "nucleus/shadow: shorten the uninterruptible path to secondary mode". It opens a short window during relax where the migrated task may be active under both schedulers. We are currently evaluating a revert (looks good so far), and I need to work out my theory in more detail.
>>
>> Looks like this commit just made a long-standing flaw in Xenomai's interrupt handling more visible: We reschedule over the interrupt stack in the Xenomai interrupt handler tails, at least on x86-64. Not sure if other archs have interrupt stacks, the point is Xenomai's design wrongly assumes there are no such things.
>
> Fortunately, no, this is not a design issue, no such assumption was ever made, but the Xenomai core expects this to be handled on a per-arch basis with the interrupt pipeline.

And that's already the problem: If Linux uses interrupt stacks, relying on ipipe to disable this during Xenomai interrupt handler execution is at best a workaround. A fragile one unless you increase the per-thread stack size by the size of the interrupt stack. Lacking support for a generic rescheduling hook became a problem by the time Linux introduced interrupt threads.

> As you pointed out, there is no way to handle this via some generic Xenomai-only support. ppc64 now has separate interrupt stacks, which is why I disabled IRQSTACKS which became the builtin default at some point. Blackfin goes through a Xenomai-defined irq tail handler as well, because it may not reschedule over nested interrupt stacks.

How does this arch prevent xnpod_schedule in the generic interrupt handler tail from doing its normal work?

> Fact is that such pending problem with x86_64 was overlooked since day #1 by /me.
>
>> We were lucky so far that the values saved on this shared stack were apparently compatible, meaning we were overwriting them with identical or harmless values. But that's no longer true when interrupts are hitting us in the xnpod_suspend_thread path of a relaxing shadow.
>
> Makes sense. It would be better to find a solution that does not make the relax path uninterruptible again for a significant amount of time. On low end platforms we support (i.e. non-x86* mainly), this causes obvious latency spots.

I agree. Conceptually, the interruptible relaxation should be safe now after recent fixes.

>> Likely the only possible fix is establishing a reschedule hook for Xenomai in the interrupt exit path after the original stack is restored - just like Linux works. Requires changes to both ipipe and Xenomai unfortunately.
>
> __ipipe_run_irqtail() is in the I-pipe core for such purpose. If instantiated properly for x86_64, and paired with xnarch_escalate() for that arch as well, it could be an option for running the rescheduling procedure when safe.

Nope, that doesn't work. The stack is switched later in the return path in entry_64.S. We need a hook there, ideally a conditional one, controlled by some per-cpu variable that is set by Xenomai on return from its interrupt handlers to signal the rescheduling need.

Jan
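The conditional hook proposed above can be pictured with a small user-space model. This is only a sketch under stated assumptions, not I-pipe or Xenomai code: `cpu_state`, `rt_resched_pending`, `rt_irq_handler_tail()`, `rt_reschedule()` and `simulate_irq()` are hypothetical names, and the stack switch done in entry_64.S is reduced to an enum assignment. The handler tail only records that a reschedule is needed, and the reschedule itself runs once the original (task) stack is back in place.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4                      /* hypothetical CPU count for the model */

enum stack_kind { TASK_STACK, IRQ_STACK };

struct cpu_state {
    enum stack_kind stack;             /* stack this CPU currently runs on */
    bool rt_resched_pending;           /* hypothetical per-CPU "resched needed" flag */
    int resched_on_task_stack;         /* reschedules performed on the task stack */
    int resched_on_irq_stack;          /* unsafe reschedules; should stay 0 */
};

static struct cpu_state cpus[NR_CPUS]; /* zero-initialized: TASK_STACK, no flag */

/* Stand-in for the real-time scheduler entry point. */
static void rt_reschedule(struct cpu_state *c)
{
    if (c->stack == TASK_STACK)
        c->resched_on_task_stack++;
    else
        c->resched_on_irq_stack++;
}

/* Handler tail: only record the need, do not reschedule on the IRQ stack. */
static void rt_irq_handler_tail(struct cpu_state *c, bool need_resched)
{
    if (need_resched)
        c->rt_resched_pending = true;
}

/* Models interrupt entry, handler execution and the exit path. */
static void simulate_irq(int cpu, bool need_resched)
{
    struct cpu_state *c = &cpus[cpu];
    enum stack_kind saved = c->stack;

    c->stack = IRQ_STACK;              /* entry: switch to the interrupt stack */
    rt_irq_handler_tail(c, need_resched);
    c->stack = saved;                  /* exit: original stack restored */

    /* Conditional hook: runs only when back on the task stack. */
    if (c->stack == TASK_STACK && c->rt_resched_pending) {
        c->rt_resched_pending = false;
        rt_reschedule(c);
    }
}
```

In this model an interrupt that nests over another one (so `saved` is `IRQ_STACK`) leaves the flag set, and the reschedule is deferred to the outermost exit.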
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On Sat, 2011-07-16 at 11:15 +0200, Jan Kiszka wrote:
> On 2011-07-16 10:52, Philippe Gerum wrote:
>> On Sat, 2011-07-16 at 10:13 +0200, Jan Kiszka wrote:
>>> On 2011-07-15 15:10, Jan Kiszka wrote:
>>>> But... right now it looks like we found our primary regression: "nucleus/shadow: shorten the uninterruptible path to secondary mode". It opens a short window during relax where the migrated task may be active under both schedulers. We are currently evaluating a revert (looks good so far), and I need to work out my theory in more detail.
>>>
>>> Looks like this commit just made a long-standing flaw in Xenomai's interrupt handling more visible: We reschedule over the interrupt stack in the Xenomai interrupt handler tails, at least on x86-64. Not sure if other archs have interrupt stacks, the point is Xenomai's design wrongly assumes there are no such things.
>>
>> Fortunately, no, this is not a design issue, no such assumption was ever made, but the Xenomai core expects this to be handled on a per-arch basis with the interrupt pipeline.
>
> And that's already the problem: If Linux uses interrupt stacks, relying on ipipe to disable this during Xenomai interrupt handler execution is at best a workaround. A fragile one unless you increase the per-thread stack size by the size of the interrupt stack. Lacking support for a generic rescheduling hook became a problem by the time Linux introduced interrupt threads.

Don't assume too much. What was done for ppc64 was not meant as a general policy. Again, this is a per-arch decision.

>> As you pointed out, there is no way to handle this via some generic Xenomai-only support. ppc64 now has separate interrupt stacks, which is why I disabled IRQSTACKS which became the builtin default at some point. Blackfin goes through a Xenomai-defined irq tail handler as well, because it may not reschedule over nested interrupt stacks.
>
> How does this arch prevent xnpod_schedule in the generic interrupt handler tail from doing its normal work?

It polls some hw status to know whether a rescheduling would be safe. See xnarch_escalate().

>> Fact is that such pending problem with x86_64 was overlooked since day #1 by /me.
>>
>>> We were lucky so far that the values saved on this shared stack were apparently compatible, meaning we were overwriting them with identical or harmless values. But that's no longer true when interrupts are hitting us in the xnpod_suspend_thread path of a relaxing shadow.
>>
>> Makes sense. It would be better to find a solution that does not make the relax path uninterruptible again for a significant amount of time. On low end platforms we support (i.e. non-x86* mainly), this causes obvious latency spots.
>
> I agree. Conceptually, the interruptible relaxation should be safe now after recent fixes.
>
>>> Likely the only possible fix is establishing a reschedule hook for Xenomai in the interrupt exit path after the original stack is restored - just like Linux works. Requires changes to both ipipe and Xenomai unfortunately.
>>
>> __ipipe_run_irqtail() is in the I-pipe core for such purpose. If instantiated properly for x86_64, and paired with xnarch_escalate() for that arch as well, it could be an option for running the rescheduling procedure when safe.
>
> Nope, that doesn't work. The stack is switched later in the return path in entry_64.S. We need a hook there, ideally a conditional one, controlled by some per-cpu variable that is set by Xenomai on return from its interrupt handlers to signal the rescheduling need.

Yes, makes sense. The way to make it conditional without dragging bits of Xenomai logic into the kernel innards is not obvious though.

It is probably time to officially introduce exo-kernel oriented bits into the Linux thread info. PTDs have too loose semantics to be practical if we want to avoid trashing the I-cache by calling probe hooks within the dual kernel, each time we want to check some basic condition (e.g. resched needed). A backlink to a foreign TCB there would help too.

Which leads us to killing the ad hoc kernel threads (and stacks) at some point, which are an absolute pain.

> Jan

--
Philippe.
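The idea of keeping "exo-kernel" bits and a TCB backlink directly in the thread info can be sketched as follows. Everything here is hypothetical and for illustration only: `thread_info_sketch`, `exo_flags`, `exo_tcb`, `EXO_NEED_RESCHED` and the helper functions are not existing Linux, I-pipe or Xenomai names. The point of the sketch is that a basic condition such as "resched needed" becomes a plain flag test instead of a call into a probe hook.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXO_NEED_RESCHED (1u << 0)     /* hypothetical flag bit */

/* Stand-in for the co-kernel's thread control block. */
struct foreign_tcb {
    int prio;
};

/* Hypothetical extension of the per-thread info block. */
struct thread_info_sketch {
    unsigned int exo_flags;            /* co-kernel state bits */
    struct foreign_tcb *exo_tcb;       /* backlink; NULL for plain Linux tasks */
};

/* Cheap inline test: no hook call, just a load and a mask. */
static inline bool exo_need_resched(const struct thread_info_sketch *ti)
{
    return (ti->exo_flags & EXO_NEED_RESCHED) != 0;
}

static inline void exo_set_need_resched(struct thread_info_sketch *ti)
{
    ti->exo_flags |= EXO_NEED_RESCHED;
}

/* Returns the previous state of the flag and clears it. */
static inline bool exo_test_and_clear_need_resched(struct thread_info_sketch *ti)
{
    bool was_set = exo_need_resched(ti);

    ti->exo_flags &= ~EXO_NEED_RESCHED;
    return was_set;
}

/* A task is shadowed by the co-kernel if it carries a TCB backlink. */
static inline bool exo_is_shadow(const struct thread_info_sketch *ti)
{
    return ti->exo_tcb != NULL;
}
```

A real implementation would need atomic bit operations and would have to settle who owns the backlink's lifetime; both are left out of this model.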
Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
On 2011-07-16 11:56, Philippe Gerum wrote:
> On Sat, 2011-07-16 at 11:15 +0200, Jan Kiszka wrote:
>> On 2011-07-16 10:52, Philippe Gerum wrote:
>>> On Sat, 2011-07-16 at 10:13 +0200, Jan Kiszka wrote:
>>>> On 2011-07-15 15:10, Jan Kiszka wrote:
>>>>> But... right now it looks like we found our primary regression: "nucleus/shadow: shorten the uninterruptible path to secondary mode". It opens a short window during relax where the migrated task may be active under both schedulers. We are currently evaluating a revert (looks good so far), and I need to work out my theory in more detail.
>>>>
>>>> Looks like this commit just made a long-standing flaw in Xenomai's interrupt handling more visible: We reschedule over the interrupt stack in the Xenomai interrupt handler tails, at least on x86-64. Not sure if other archs have interrupt stacks, the point is Xenomai's design wrongly assumes there are no such things.
>>>
>>> Fortunately, no, this is not a design issue, no such assumption was ever made, but the Xenomai core expects this to be handled on a per-arch basis with the interrupt pipeline.
>>
>> And that's already the problem: If Linux uses interrupt stacks, relying on ipipe to disable this during Xenomai interrupt handler execution is at best a workaround. A fragile one unless you increase the per-thread stack size by the size of the interrupt stack. Lacking support for a generic rescheduling hook became a problem by the time Linux introduced interrupt threads.
>
> Don't assume too much. What was done for ppc64 was not meant as a general policy. Again, this is a per-arch decision.

Actually, it was the right decision, not only for ppc64: Reusing Linux interrupt stacks for Xenomai does not work. If we interrupt Linux while it is already running over the interrupt stack, the stack becomes taboo on that CPU. From that point on, no RT IRQ must run over the Linux interrupt stack as it would smash it.

But then the question is why we should try to use the interrupt stacks for Xenomai at all. It's better to increase the task kernel stacks and disable interrupt stacks when ipipe is enabled. That's what I'm heading for with x86-64 now (THREAD_ORDER 2, no stack switching).

What we may do is introduce per-domain interrupt stacks. But that's at best Xenomai 3 / I-pipe 3 stuff.

Jan
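As a rough illustration of what "THREAD_ORDER 2" means for the stack size: on x86-64 the kernel stack size is derived as the page size shifted left by the order. The sketch below only redoes that arithmetic, assuming 4 KiB pages; `PAGE_SIZE_SKETCH` and `stack_bytes()` are made-up names, not kernel symbols.

```c
#include <assert.h>

#define PAGE_SIZE_SKETCH 4096UL        /* assumption: 4 KiB pages */

/* Stack size in bytes for a given allocation order (pages = 2^order). */
static unsigned long stack_bytes(unsigned int order)
{
    return PAGE_SIZE_SKETCH << order;
}
```

Under that assumption, order 2 gives a 16 KiB per-task kernel stack, while order 1 would give 8 KiB. The extra room is what has to absorb interrupt frames once separate interrupt stacks are disabled.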