Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-16 Thread Jan Kiszka
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 2011-07-15 15:10, Jan Kiszka wrote:
 But... right now it looks like we found our primary regression: 
 nucleus/shadow: shorten the uninterruptible path to secondary mode.
 It opens a short windows during relax where the migrated task may be
 active under both schedulers. We are currently evaluating a revert
 (looks good so far), and I need to work out my theory in more
 details.

Looks like this commit just made a long-standing flaw in Xenomai's
interrupt handling more visible: We reschedule over the interrupt stack
in the Xenomai interrupt handler tails, at least on x86-64. Not sure if
other archs have interrupt stacks, the point is Xenomai's design wrongly
assumes there are no such things. We were lucky so far that the values
saved on this shared stack were apparently compatible, means we were
overwriting them with identical or harmless values. But that's no longer
true when interrupts are hitting us in the xnpod_suspend_thread path of
a relaxing shadow.

Likely the only possible fix is establishing a reschedule hook for
Xenomai in the interrupt exit path after the original stack is restored
- - just like Linux works. Requires changes to both ipipe and Xenomai
unfortunately.

Jan

-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.16 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk4hSDsACgkQitSsb3rl5xSmOACfbZfcNKyO9YDvPE+R5H75d0ky
DX0An32BrZW+lpEnxnLLCHSQ5r8itnE9
=n6u8
-END PGP SIGNATURE-

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-16 Thread Jan Kiszka

On 2011-07-16 10:52, Philippe Gerum wrote:

On Sat, 2011-07-16 at 10:13 +0200, Jan Kiszka wrote:

On 2011-07-15 15:10, Jan Kiszka wrote:

But... right now it looks like we found our primary regression:
nucleus/shadow: shorten the uninterruptible path to secondary mode.
It opens a short windows during relax where the migrated task may be
active under both schedulers. We are currently evaluating a revert
(looks good so far), and I need to work out my theory in more
details.


Looks like this commit just made a long-standing flaw in Xenomai's
interrupt handling more visible: We reschedule over the interrupt stack
in the Xenomai interrupt handler tails, at least on x86-64. Not sure if
other archs have interrupt stacks, the point is Xenomai's design wrongly
assumes there are no such things.


Fortunately, no, this is not a design issue, no such assumption was ever
made, but the Xenomai core expects this to be handled on a per-arch
basis with the interrupt pipeline.


And that's already the problem: If Linux uses interrupt stacks, relying 
on ipipe to disable this during Xenomai interrupt handler execution is 
at best a workaround. A fragile one unless you increase the pre-thread 
stack size by the size of the interrupt stack. Lacking support for a 
generic rescheduling hook became a problem by the time Linux introduced 
interrupt threads.



As you pointed out, there is no way
to handle this via some generic Xenomai-only support.

ppc64 now has separate interrupt stacks, which is why I disabled
IRQSTACKS which became the builtin default at some point. Blackfin goes
through a Xenomai-defined irq tail handler as well, because it may not
reschedule over nested interrupt stacks.


How does this arch prevent that xnpod_schedule in the generic interrupt 
handler tail does its normal work?



Fact is that such pending
problem with x86_64 was overlooked since day #1 by /me.


  We were lucky so far that the values
saved on this shared stack were apparently compatible, means we were
overwriting them with identical or harmless values. But that's no longer
true when interrupts are hitting us in the xnpod_suspend_thread path of
a relaxing shadow.



Makes sense. It would be better to find a solution that does not make
the relax path uninterruptible again for a significant amount of time.
On low end platforms we support (i.e. non-x86* mainly), this causes
obvious latency spots.


I agree. Conceptually, the interruptible relaxation should be safe now 
after recent fixes.





Likely the only possible fix is establishing a reschedule hook for
Xenomai in the interrupt exit path after the original stack is restored
- - just like Linux works. Requires changes to both ipipe and Xenomai
unfortunately.


__ipipe_run_irqtail() is in the I-pipe core for such purpose. If
instantiated properly for x86_64, and paired with xnarch_escalate() for
that arch as well, it could be an option for running the rescheduling
procedure when safe.


Nope, that doesn't work. The stack is switched later in the return path 
in entry_64.S. We need a hook there, ideally a conditional one, 
controlled by some per-cpu variable that is set by Xenomai on return 
from its interrupt handlers to signal the rescheduling need.


Jan

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-16 Thread Philippe Gerum
On Sat, 2011-07-16 at 11:15 +0200, Jan Kiszka wrote:
 On 2011-07-16 10:52, Philippe Gerum wrote:
  On Sat, 2011-07-16 at 10:13 +0200, Jan Kiszka wrote:
  On 2011-07-15 15:10, Jan Kiszka wrote:
  But... right now it looks like we found our primary regression:
  nucleus/shadow: shorten the uninterruptible path to secondary mode.
  It opens a short windows during relax where the migrated task may be
  active under both schedulers. We are currently evaluating a revert
  (looks good so far), and I need to work out my theory in more
  details.
 
  Looks like this commit just made a long-standing flaw in Xenomai's
  interrupt handling more visible: We reschedule over the interrupt stack
  in the Xenomai interrupt handler tails, at least on x86-64. Not sure if
  other archs have interrupt stacks, the point is Xenomai's design wrongly
  assumes there are no such things.
 
  Fortunately, no, this is not a design issue, no such assumption was ever
  made, but the Xenomai core expects this to be handled on a per-arch
  basis with the interrupt pipeline.
 
 And that's already the problem: If Linux uses interrupt stacks, relying 
 on ipipe to disable this during Xenomai interrupt handler execution is 
 at best a workaround. A fragile one unless you increase the pre-thread 
 stack size by the size of the interrupt stack. Lacking support for a 
 generic rescheduling hook became a problem by the time Linux introduced 
 interrupt threads.

Don't assume too much. What was done for ppc64 was not meant as a
general policy. Again, this is a per-arch decision.

 
  As you pointed out, there is no way
  to handle this via some generic Xenomai-only support.
 
  ppc64 now has separate interrupt stacks, which is why I disabled
  IRQSTACKS which became the builtin default at some point. Blackfin goes
  through a Xenomai-defined irq tail handler as well, because it may not
  reschedule over nested interrupt stacks.
 
 How does this arch prevent that xnpod_schedule in the generic interrupt 
 handler tail does its normal work?

It polls some hw status to know whether a rescheduling would be safe.
See xnarch_escalate().

 
  Fact is that such pending
  problem with x86_64 was overlooked since day #1 by /me.
 
We were lucky so far that the values
  saved on this shared stack were apparently compatible, means we were
  overwriting them with identical or harmless values. But that's no longer
  true when interrupts are hitting us in the xnpod_suspend_thread path of
  a relaxing shadow.
 
 
  Makes sense. It would be better to find a solution that does not make
  the relax path uninterruptible again for a significant amount of time.
  On low end platforms we support (i.e. non-x86* mainly), this causes
  obvious latency spots.
 
 I agree. Conceptually, the interruptible relaxation should be safe now 
 after recent fixes.
 
 
  Likely the only possible fix is establishing a reschedule hook for
  Xenomai in the interrupt exit path after the original stack is restored
  - - just like Linux works. Requires changes to both ipipe and Xenomai
  unfortunately.
 
  __ipipe_run_irqtail() is in the I-pipe core for such purpose. If
  instantiated properly for x86_64, and paired with xnarch_escalate() for
  that arch as well, it could be an option for running the rescheduling
  procedure when safe.
 
 Nope, that doesn't work. The stack is switched later in the return path 
 in entry_64.S. We need a hook there, ideally a conditional one, 
 controlled by some per-cpu variable that is set by Xenomai on return 
 from its interrupt handlers to signal the rescheduling need.
 

Yes, makes sense. The way to make it conditional without dragging bits
of Xenomai logic into the kernel innards is not obvious though.

It is probably time to officially introduce exo-kernel oriented bits
into the Linux thread info. PTDs have too lose semantics to be practical
if we want to avoid trashing the I-cache by calling probe hooks within
the dual kernel, each time we want to check some basic condition (e.g.
resched needed). A backlink to a foreign TCB there would help too.

Which leads us to killing the ad hoc kernel threads (and stacks) at some
point, which are an absolute pain.

 Jan

-- 
Philippe.



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion

2011-07-16 Thread Jan Kiszka
On 2011-07-16 11:56, Philippe Gerum wrote:
 On Sat, 2011-07-16 at 11:15 +0200, Jan Kiszka wrote:
 On 2011-07-16 10:52, Philippe Gerum wrote:
 On Sat, 2011-07-16 at 10:13 +0200, Jan Kiszka wrote:
 On 2011-07-15 15:10, Jan Kiszka wrote:
 But... right now it looks like we found our primary regression:
 nucleus/shadow: shorten the uninterruptible path to secondary mode.
 It opens a short windows during relax where the migrated task may be
 active under both schedulers. We are currently evaluating a revert
 (looks good so far), and I need to work out my theory in more
 details.

 Looks like this commit just made a long-standing flaw in Xenomai's
 interrupt handling more visible: We reschedule over the interrupt stack
 in the Xenomai interrupt handler tails, at least on x86-64. Not sure if
 other archs have interrupt stacks, the point is Xenomai's design wrongly
 assumes there are no such things.

 Fortunately, no, this is not a design issue, no such assumption was ever
 made, but the Xenomai core expects this to be handled on a per-arch
 basis with the interrupt pipeline.

 And that's already the problem: If Linux uses interrupt stacks, relying 
 on ipipe to disable this during Xenomai interrupt handler execution is 
 at best a workaround. A fragile one unless you increase the pre-thread 
 stack size by the size of the interrupt stack. Lacking support for a 
 generic rescheduling hook became a problem by the time Linux introduced 
 interrupt threads.
 
 Don't assume too much. What was done for ppc64 was not meant as a
 general policy. Again, this is a per-arch decision.

Actually, it was the right decision, not only for ppc64: Reusing Linux
interrupt stacks for Xenomai does not work. If we interrupt Linux while
it is already running over the interrupt stack, the stack becomes taboo
on that CPU. From that point on, no RT IRQ must run over the Linux
interrupt stack as it would smash it.

But then the question is why we should try to use the interrupt stacks
for Xenomai at all. It's better to increase the task kernel stacks and
disable interrupt stacks when ipipe is enabled. That's what I'm heading
for with x86-64 now (THREAD_ORDER 2, no stack switching).

What we may do is introducing per-domain interrupt stacks. But that's at
best Xenomai 3 / I-pipe 3 stuff.

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core