Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Jeroen Van den Keybus wrote:
>> Revision 466 contains the mutex-info fix, but that is post -rc2. Why not
>> switch to SVN head?
>
> Philippe asked to apply the patch against Xenomai 2.1-rc2. Can I safely
> patch it against the SVN tree? After that, what will 'svn up' do to the
> patched tree?

The CONFIG_PREEMPT fix is already contained in the latest SVN revision, no need to patch anymore.

When unsure whether a patch will apply cleanly, try "patch --dry-run" first. The (virtually) rejected hunks can then be used to assess whether the patch fits - without messing up the code base immediately.

> Remember I'm quite new to Linux. Actually, I spent half an hour finding out
> how that patch stuff (especially the -p option) works.

:) (it's no problem to ask even this kind of "stupid" question on the list or us directly - no one will bite you!)

Jan

signature.asc
Description: OpenPGP digital signature
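[Editorially, Jan's "patch --dry-run" tip and Jeroen's "-p option" question can be tried out on a throwaway tree; the file and patch names below are invented purely for the demonstration:]

```shell
# Throwaway demo (file and patch names are made up for the example).
set -e
mkdir -p demo/src
printf 'old line\n' > demo/src/file.c
cat > demo/fix.patch <<'EOF'
--- a/src/file.c
+++ b/src/file.c
@@ -1 +1 @@
-old line
+new line
EOF
cd demo
# -p1 strips the first path component ("a/", "b/") from the patch headers;
# this is the -p option Jeroen mentions.
# --dry-run only reports whether the hunks would apply; no file is touched.
patch -p1 --dry-run < fix.patch
# Apply for real only once the dry run shows no rejected hunks.
patch -p1 < fix.patch
grep 'new line' src/file.c
```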
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
> Revision 466 contains the mutex-info fix, but that is post -rc2. Why not
> switching to SVN head?

Philippe asked to apply the patch against Xenomai 2.1-rc2. Can I safely patch it against the SVN tree? After that, what will 'svn up' do to the patched tree?

Remember I'm quite new to Linux. Actually, I spent half an hour finding out how that patch stuff (especially the -p option) works.

Jeroen.
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Jeroen Van den Keybus wrote:
>>> I've installed both patches and the problem seems to have disappeared.
>>> I'll try it on another machine tomorrow, too. Meanwhile: thanks very
>>> much for the assistance!
>
> While testing more thoroughly, my triggers for zero mutex values after
> acquiring the lock are going off again. I was using the SVN xenomai
> development tree, but I've now switched to the (fixed) 2.1-rc2 in order to
> apply the patches. Is Jan's bugfix included in that one?

Revision 466 contains the mutex-info fix, but that is post -rc2. Why not switch to SVN head?

Jan

signature.asc
Description: OpenPGP digital signature
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
> I've installed both patches and the problem seems to have disappeared.
> I'll try it on another machine tomorrow, too. Meanwhile: thanks very
> much for the assistance!

While testing more thoroughly, my triggers for zero mutex values after acquiring the lock are going off again. I was using the SVN xenomai development tree, but I've now switched to the (fixed) 2.1-rc2 in order to apply the patches. Is Jan's bugfix included in that one?

Jeroen.
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Jeroen Van den Keybus wrote:
>> And now, Ladies and Gentlemen, with the patches attached.
>
> I've installed both patches and the problem seems to have disappeared.
> I'll try it on another machine tomorrow, too. Meanwhile: thanks very
> much for the assistance!
>
> Jeroen.

Actually, the effort you made to provide a streamlined testcase that triggered the bug did most of the job, so you are the one to thank here. The rest was only a matter of dealing with my own bugs, which is a sisyphean activity I'm rather familiar with.

-- Philippe.
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
> And now, Ladies and Gentlemen, with the patches attached.

I've installed both patches and the problem seems to have disappeared. I'll try it on another machine tomorrow, too. Meanwhile: thanks very much for the assistance!

Jeroen.
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Philippe Gerum wrote:
> Jan Kiszka wrote:
>> At this chance: any comments on the panic-freeze extension for the
>> tracer? I need to rework the Xenomai patch, but the ipipe side should be
>> ready for merge.
>
> No issue with the ipipe side since it only touches the tracer support
> code. No issue either at first sight with the Xeno side, aside of the
> trace being frozen twice in do_schedule_event (once in this routine,
> twice in xnpod_fatal); but maybe it's wanted to freeze the situation
> before the stack is dumped; is it?

Yes, this is the reason for it. Actually, only the first freeze has any effect; later calls will be ignored.

Hmm, I thought I remembered some issue with the Xenomai-side patch when tracing was disabled, but I cannot reproduce that issue anymore (it was likely related to other hacks while tracking down the PREEMPT issue). So from my POV that patch is ready for merge as well.

Jan

signature.asc
Description: OpenPGP digital signature
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Jan Kiszka wrote:
> Philippe Gerum wrote:
>> [...]
>> Could anyone interested in this issue test the following couple of patches?
>>
>> atomic-switch-state.patch is to be applied against Adeos-1.1-03/x86 for
>> 2.6.15
>> atomic-wakeup-and-schedule.patch is to be applied against Xeno 2.1-rc2
>>
>> Both patches are needed to fix the issue.
>
> Looks good. I tried Jeroen's test-case and I was not able to reproduce
> the crash anymore. I think it's time for a new ipipe-release. ;)

Looks like, indeed.

> At this chance: any comments on the panic-freeze extension for the
> tracer? I need to rework the Xenomai patch, but the ipipe side should be
> ready for merge.

No issue with the ipipe side since it only touches the tracer support code. No issue either at first sight with the Xeno side, aside of the trace being frozen twice in do_schedule_event (once in this routine, twice in xnpod_fatal); but maybe it's wanted to freeze the situation before the stack is dumped; is it?
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Philippe Gerum wrote:
> Philippe Gerum wrote:
>> [...]
>> The other way is to make sure that no in-kernel preemption of the
>> hardening task could occur after step 1) and until step 2) is
>> performed, given that we cannot currently call schedule() with
>> interrupts or preemption off. I'm on it.
>
> Could anyone interested in this issue test the following couple of patches?
>
> atomic-switch-state.patch is to be applied against Adeos-1.1-03/x86 for
> 2.6.15
> atomic-wakeup-and-schedule.patch is to be applied against Xeno 2.1-rc2
>
> Both patches are needed to fix the issue.
>
> TIA,

Looks good. I tried Jeroen's test-case and I was not able to reproduce the crash anymore. I think it's time for a new ipipe-release. ;)

At this chance: any comments on the panic-freeze extension for the tracer? I need to rework the Xenomai patch, but the ipipe side should be ready for merge.

Jan
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Philippe Gerum wrote:
> [...]
> The other way is to make sure that no in-kernel preemption of the
> hardening task could occur after step 1) and until step 2) is
> performed, given that we cannot currently call schedule() with
> interrupts or preemption off. I'm on it.
>
> Could anyone interested in this issue test the following couple of patches?
>
> atomic-switch-state.patch is to be applied against Adeos-1.1-03/x86 for
> 2.6.15
> atomic-wakeup-and-schedule.patch is to be applied against Xeno 2.1-rc2
>
> Both patches are needed to fix the issue.
>
> TIA,

And now, Ladies and Gentlemen, with the patches attached.

-- Philippe.

--- 2.6.15-x86/kernel/sched.c	2006-01-07 15:18:31.0 +0100
+++ 2.6.15-ipipe/kernel/sched.c	2006-01-30 15:15:27.0 +0100
@@ -2963,7 +2963,7 @@
 	 * Otherwise, whine if we are scheduling when we should not be.
 	 */
 	if (likely(!current->exit_state)) {
-		if (unlikely(in_atomic())) {
+		if (unlikely(!(current->state & TASK_ATOMICSWITCH) && in_atomic())) {
 			printk(KERN_ERR "scheduling while atomic: "
 				"%s/0x%08x/%d\n",
 				current->comm, preempt_count(), current->pid);
@@ -2972,8 +2972,13 @@
 	}
 	profile_hit(SCHED_PROFILING, __builtin_return_ad
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Philippe Gerum wrote:
> [...]
> The other way is to make sure that no in-kernel preemption of the
> hardening task could occur after step 1) and until step 2) is
> performed, given that we cannot currently call schedule() with
> interrupts or preemption off. I'm on it.

Could anyone interested in this issue test the following couple of patches?

atomic-switch-state.patch is to be applied against Adeos-1.1-03/x86 for 2.6.15
atomic-wakeup-and-schedule.patch is to be applied against Xeno 2.1-rc2

Both patches are needed to fix the issue.

TIA,

-- Philippe.
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Jan Kiszka wrote: Gilles Chanteperdrix wrote: Jeroen Van den Keybus wrote: > Hello, > > > I'm currently not at a level to participate in your discussion. Although I'm > willing to supply you with stresstests, I would nevertheless like to learn > more from task migration as this debugging session proceeds. In order to do > so, please confirm the following statements or indicate where I went wrong. > I hope others may learn from this as well. > > xn_shadow_harden(): This is called whenever a Xenomai thread performs a > Linux (root domain) system call (notified by Adeos ?). xnshadow_harden() is called whenever a thread running in secondary mode (that is, running as a regular Linux thread, handled by Linux scheduler) is switching to primary mode (where it will run as a Xenomai thread, handled by Xenomai scheduler). Migrations occur for some system calls. More precisely, Xenomai skin system calls tables associates a few flags with each system call, and some of these flags cause migration of the caller when it issues the system call. Each Xenomai user-space thread has two contexts, a regular Linux thread context, and a Xenomai thread called "shadow" thread. Both contexts share the same stack and program counter, so that at any time, at least one of the two contexts is seen as suspended by the scheduler which handles it. Before xnshadow_harden is called, the Linux thread is running, and its shadow is seen in suspended state with XNRELAX bit by Xenomai scheduler. After xnshadow_harden, the Linux context is seen suspended with INTERRUPTIBLE state by Linux scheduler, and its shadow is seen as running by Xenomai scheduler. The migrating thread > (nRT) is marked INTERRUPTIBLE and run by the Linux kernel > wake_up_interruptible_sync() call. Is this thread actually run or does it > merely put the thread in some Linux to-do list (I assumed the first case) ? 
Here, I am not sure, but it seems that when calling wake_up_interruptible_sync the woken up task is put in the current CPU runqueue, and this task (i.e. the gatekeeper), will not run until the current thread (i.e. the thread running xnshadow_harden) marks itself as suspended and calls schedule(). Maybe, marking the running thread as Depends on CONFIG_PREEMPT. If set, we get a preempt_schedule already here - and a switch if the prio of the woken up task is higher. BTW, an easy way to enforce the current trouble is to remove the "_sync" from wake_up_interruptible. As I understand it this _sync is just an optimisation hint for Linux to avoid needless scheduler runs. You could not guarantee the following execution sequence doing so either, i.e. 1- current wakes up the gatekeeper 2- current goes sleeping to exit the Linux runqueue in schedule() 3- the gatekeeper resumes the shadow-side of the old current The point is all about making 100% sure that current is going to be unlinked from the Linux runqueue before the gatekeeper processes the resumption request, whatever event the kernel is processing asynchronously in the meantime. This is the reason why, as you already noticed, preempt_schedule_irq() nicely breaks our toy by stealing the CPU from the hardening thread whilst keeping it linked to the runqueue: upon return from such preemption, the gatekeeper might have run already, hence the newly hardened thread ends up being seen as runnable by both the Linux and Xeno schedulers. Rainy day indeed. We could rely on giving "current" the highest SCHED_FIFO priority in xnshadow_harden() before waking up the gk, until the gk eventually promotes it to the Xenomai scheduling mode and downgrades this priority back to normal, but we would pay additional latencies induced by each aborted rescheduling attempt that may occur during the atomic path we want to enforce. 
The other way is to make sure that no in-kernel preemption of the hardening task could occur after step 1) and until step 2) is performed, given that we cannot currently call schedule() with interrupts or preemption off. I'm on it. suspended is not needed, since the gatekeeper may have a high priority, and calling schedule() is enough. In any case, the woken-up thread does not seem to be run immediately, so this rather looks like the second case. Since in xnshadow_harden, the running thread marks itself as suspended before running wake_up_interruptible_sync, the gatekeeper will run when schedule() gets called, which, in turn, depends on the CONFIG_PREEMPT* configuration. In the non-preempt case, the current thread will be suspended and the gatekeeper will run when schedule() is explicitly called in xnshadow_harden(). In the preempt case, schedule gets called when the outermost spinlock is unlocked in wake_up_interruptible_sync(). > And how does it terminate: is only the system call migrated or is the thread > allowed to continue run (at a priority level equal to the Xenomai > priority level) until it hits something of the Xenomai API (or trivially: > explicitly go to RT using th
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Jan Kiszka wrote: Philippe Gerum wrote: Philippe Gerum wrote: Jan Kiszka wrote: Gilles Chanteperdrix wrote: Jeroen Van den Keybus wrote: Hello, I'm currently not at a level to participate in your discussion. Although I'm willing to supply you with stresstests, I would nevertheless like to learn more from task migration as this debugging session proceeds. In order to do so, please confirm the following statements or indicate where I went wrong. I hope others may learn from this as well. xn_shadow_harden(): This is called whenever a Xenomai thread performs a Linux (root domain) system call (notified by Adeos ?). xnshadow_harden() is called whenever a thread running in secondary mode (that is, running as a regular Linux thread, handled by Linux scheduler) is switching to primary mode (where it will run as a Xenomai thread, handled by Xenomai scheduler). Migrations occur for some system calls. More precisely, Xenomai skin system calls tables associates a few flags with each system call, and some of these flags cause migration of the caller when it issues the system call. Each Xenomai user-space thread has two contexts, a regular Linux thread context, and a Xenomai thread called "shadow" thread. Both contexts share the same stack and program counter, so that at any time, at least one of the two contexts is seen as suspended by the scheduler which handles it. Before xnshadow_harden is called, the Linux thread is running, and its shadow is seen in suspended state with XNRELAX bit by Xenomai scheduler. After xnshadow_harden, the Linux context is seen suspended with INTERRUPTIBLE state by Linux scheduler, and its shadow is seen as running by Xenomai scheduler. The migrating thread (nRT) is marked INTERRUPTIBLE and run by the Linux kernel wake_up_interruptible_sync() call. Is this thread actually run or does it merely put the thread in some Linux to-do list (I assumed the first case) ? 
Here, I am not sure, but it seems that when calling wake_up_interruptible_sync the woken up task is put in the current CPU runqueue, and this task (i.e. the gatekeeper), will not run until the current thread (i.e. the thread running xnshadow_harden) marks itself as suspended and calls schedule(). Maybe, marking the running thread as Depends on CONFIG_PREEMPT. If set, we get a preempt_schedule already here - and a switch if the prio of the woken up task is higher. BTW, an easy way to enforce the current trouble is to remove the "_sync" from wake_up_interruptible. As I understand it this _sync is just an optimisation hint for Linux to avoid needless scheduler runs. You could not guarantee the following execution sequence doing so either, i.e. 1- current wakes up the gatekeeper 2- current goes sleeping to exit the Linux runqueue in schedule() 3- the gatekeeper resumes the shadow-side of the old current The point is all about making 100% sure that current is going to be unlinked from the Linux runqueue before the gatekeeper processes the resumption request, whatever event the kernel is processing asynchronously in the meantime. This is the reason why, as you already noticed, preempt_schedule_irq() nicely breaks our toy by stealing the CPU from the hardening thread whilst keeping it linked to the runqueue: upon return from such preemption, the gatekeeper might have run already, hence the newly hardened thread ends up being seen as runnable by both the Linux and Xeno schedulers. Rainy day indeed. We could rely on giving "current" the highest SCHED_FIFO priority in xnshadow_harden() before waking up the gk, until the gk eventually promotes it to the Xenomai scheduling mode and downgrades this priority back to normal, but we would pay additional latencies induced by each aborted rescheduling attempt that may occur during the atomic path we want to enforce. 
The other way is to make sure that no in-kernel preemption of the hardening task could occur after step 1) and until step 2) is performed, given that we cannot currently call schedule() with interrupts or preemption off. I'm on it. Could anyone interested in this issue test the following couple of patches? atomic-switch-state.patch is to be applied against Adeos-1.1-03/x86 for 2.6.15 atomic-wakeup-and-schedule.patch is to be applied against Xeno 2.1-rc2 Both patches are needed to fix the issue. TIA, Looks good. I tried Jeroen's test-case and I was not able to reproduce the crash anymore. I think it's time for a new ipipe-release. ;) Looks like, indeed. At this chance: any comments on the panic-freeze extension for the tracer? I need to rework the Xenomai patch, but the ipipe side should be ready for merge. No issue with the ipipe side since it only touches the tracer support code. No issue either at first sight with the Xeno side, aside from the trace being frozen twice in do_schedule_event? (once in this routine, a second time in xnpod_fatal); but maybe it's wanted to freeze the situation before the stack is dumped
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
On 30/01/06, Jan Kiszka <[EMAIL PROTECTED]> wrote:
> Dmitry Adamushko wrote:
>>> ...
>>> I have not checked it yet, but my presupposition was that something as
>>> easy as:
>>>
>>>     preempt_disable();
>>>     wake_up_interruptible_sync();
>>>     schedule();
>>>     preempt_enable();
>>
>> It's a no-go: "scheduling while atomic". One of my first attempts to
>> solve it.
>
> My fault. I meant the way preempt_schedule() and preempt_schedule_irq()
> call schedule() while being non-preemptible. To this end, PREEMPT_ACTIVE
> is set up. The use of preempt_enable/disable() here is wrong.
>
>> The only way to enter schedule() without being preemptible is via
>> PREEMPT_ACTIVE. But the effect of that flag should be well-known now.
>> Kind of Gordian knot. :(
>
> Maybe I have missed something, so just for my curiosity: what does
> prevent the use of PREEMPT_ACTIVE here? We don't have a "scheduling
> while atomic" message here, as it seems to be a legal way to call
> schedule() with that flag being set up.
>
> When PREEMPT_ACTIVE is set, the task gets /preempted/ but not removed
> from the run queue - independent of its current status.
>
> Err... that's exactly the reason I have explained in my first mail for
> this thread :)

Blah.. I wish I was smoking something special before, so I could point to that as the reason for my forgetfulness.

Actually, we could use PREEMPT_ACTIVE indeed + something else (probably another flag) to distinguish between the case when PREEMPT_ACTIVE is set by Linux and the case when it's set by xnshadow_harden().

xnshadow_harden()
{
        struct task_struct *this_task = current;
        ...
        xnthread_t *thread = xnshadow_thread(this_task);

        if (!thread)
                return;
        ...
        gk->thread = thread;

+       add_preempt_count(PREEMPT_ACTIVE);  // should be checked in schedule()
+       xnthread_set_flags(thread, XNATOMIC_TRANSIT);

        set_current_state(TASK_INTERRUPTIBLE);
        wake_up_interruptible_sync(&gk->waitq);

+       schedule();
+       sub_preempt_count(PREEMPT_ACTIVE);
        ...
}

Then, something like the following code should be called from schedule():

void ipipe_transit_cleanup(struct task_struct *task, runqueue_t *rq)
{
        xnthread_t *thread = xnshadow_thread(task);

        if (!thread)
                return;

        if (xnthread_test_flags(thread, XNATOMIC_TRANSIT)) {
                xnthread_clear_flags(thread, XNATOMIC_TRANSIT);
                deactivate_task(task, rq);
        }
}

- schedule.c:

        ...
        switch_count = &prev->nivcsw;
        if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
                switch_count = &prev->nvcsw;
                if (unlikely((prev->state & TASK_INTERRUPTIBLE) &&
                             unlikely(signal_pending(prev))))
                        prev->state = TASK_RUNNING;
                else {
                        if (prev->state == TASK_UNINTERRUPTIBLE)
                                rq->nr_uninterruptible++;
                        deactivate_task(prev, rq);
                }
        }

+       // removes a task from the active queue if PREEMPT_ACTIVE +
+       // XNATOMIC_TRANSIT
+#ifdef CONFIG_IPIPE
+       ipipe_transit_cleanup(prev, rq);
+#endif /* CONFIG_IPIPE */
        ...

Not very graceful, maybe, but it could work - or am I missing something important?

-- 
Best regards,
Dmitry Adamushko
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Dmitry Adamushko wrote: >>> ... > >> I have not checked it yet but my presupposition that something as easy as >> : >>> preempt_disable() >>> >>> wake_up_interruptible_sync(); >>> schedule(); >>> >>> preempt_enable(); >> It's a no-go: "scheduling while atomic". One of my first attempts to >> solve it. > > > My fault. I meant the way preempt_schedule() and preempt_irq_schedule() call > schedule() while being non-preemptible. > To this end, ACTIVE_PREEMPT is set up. > The use of preempt_enable/disable() here is wrong. > > > The only way to enter schedule() without being preemptible is via >> ACTIVE_PREEMPT. But the effect of that flag should be well-known now. >> Kind of Gordian knot. :( > > > Maybe I have missed something so just for my curiosity : what does prevent > the use of PREEMPT_ACTIVE here? > We don't have a "preempted while atomic" message here as it seems to be a > legal way to call schedule() with that flag being set up. When PREEMPT_ACTIVE is set, task gets /preempted/ but not removed from the run queue - independent of its current status. > > >>> could work... err.. and don't blame me if no, it's some one else who has >>> written that nonsense :o) >>> >>> -- >>> Best regards, >>> Dmitry Adamushko >>> >> Jan >> >> >> >> > > > -- > Best regards, > Dmitry Adamushko > > > > > > ___ > Xenomai-core mailing list > Xenomai-core@gna.org > https://mail.gna.org/listinfo/xenomai-core signature.asc Description: OpenPGP digital signature
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
>>> ...
>>> I have not checked it yet, but my presupposition was that something as
>>> easy as:
>>>
>>>     preempt_disable();
>>>     wake_up_interruptible_sync();
>>>     schedule();
>>>     preempt_enable();
>>
>> It's a no-go: "scheduling while atomic". One of my first attempts to
>> solve it.

My fault. I meant the way preempt_schedule() and preempt_schedule_irq() call schedule() while being non-preemptible. To this end, PREEMPT_ACTIVE is set up. The use of preempt_enable/disable() here is wrong.

>> The only way to enter schedule() without being preemptible is via
>> PREEMPT_ACTIVE. But the effect of that flag should be well-known now.
>> Kind of Gordian knot. :(

Maybe I have missed something, so just for my curiosity: what does prevent the use of PREEMPT_ACTIVE here? We don't have a "scheduling while atomic" message here, as it seems to be a legal way to call schedule() with that flag being set up.

>>> could work... err.. and don't blame me if no, it's some one else who has
>>> written that nonsense :o)
>>
>> --
>> Best regards,
>> Dmitry Adamushko
>
> Jan

-- 
Best regards,
Dmitry Adamushko
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Philippe Gerum wrote: Jan Kiszka wrote: Hi, well, if I'm not totally wrong, we have a design problem in the RT-thread hardening path. I dug into the crash Jeroen reported and I'm quite sure that this is the reason. So that's the bad news. The good one is that we can at least work around it by switching off CONFIG_PREEMPT for Linux (this implicitly means that it's a 2.6-only issue). @Jeroen: Did you verify that your setup also works fine without CONFIG_PREEMPT? But let's start with two assumptions my further analysis is based on: [Xenomai] o Shadow threads have only one stack, i.e. one context. If the real-time part is active (this includes being blocked on some xnsynch object or delayed), the original Linux task must NEVER EVER be executed, even if it would immediately fall asleep again. That's because the stack is in use by the real-time part at that time. And this condition is checked in do_schedule_event() [1]. [Linux] o A Linux task which has called set_current_state() will remain in the run-queue until it calls schedule() on its own. This means that it can be preempted (if CONFIG_PREEMPT is set) between set_current_state() and schedule() and then even be resumed again. Only the explicit call of schedule() will trigger deactivate_task(), which will in turn remove current from the run-queue. Ok, if this is true, let's have a look at xnshadow_harden(): After grabbing the gatekeeper sem and putting itself in gk->thread, a task going for RT then marks itself TASK_INTERRUPTIBLE and wakes up the gatekeeper [2]. This does not include a Linux reschedule due to the _sync version of wake_up_interruptible. What can happen now? 1) No interruption until we have called schedule() [3]. All fine as we will not be removed from the run-queue before the gatekeeper starts kicking our RT part, thus no conflict in using the thread's stack. 2) Interruption by an RT IRQ. This would just delay the path described above, even if some RT threads get executed. 
Once they are finished, we continue in xnshadow_harden() - given that the RT part does not trigger the following case: 3) Interruption by some Linux IRQ. This may cause other threads to become runnable as well, but the gatekeeper has the highest prio and will therefore be the next. The problem is that the rescheduling on Linux IRQ exit will PREEMPT our task in xnshadow_harden(), it will NOT remove it from the Linux run-queue. And now we are in real troubles: The gatekeeper will kick off our RT part which will take over the thread's stack. As soon as the RT domain falls asleep and Linux takes over again, it will continue our non-RT part as well! Actually, this seems to be the reason for the panic in do_schedule_event(). Without CONFIG_XENO_OPT_DEBUG and this check, we will run both parts AT THE SAME TIME now, thus violating my first assumption. The system gets fatally corrupted. Yep, that's it. And we may not lock out the interrupts before calling schedule to prevent that. Well, I would be happy if someone can prove me wrong here. The problem is that I don't see a solution because Linux does not provide an atomic wake-up + schedule-out under CONFIG_PREEMPT. I'm currently considering a hack to remove the migrating Linux thread manually from the run-queue, but this could easily break the Linux scheduler. Maybe the best way would be to provide atomic wakeup-and-schedule support into the Adeos patch for Linux tasks; previous attempts to fix this by circumventing the potential for preemption from outside of the scheduler code have all failed, and this bug is uselessly lingering for that reason. Having slept on this, I'm going to add a simple extension to the Linux scheduler available from Adeos, in order to get an atomic/unpreemptable path from the statement when the current task's state is changed for suspension (e.g. TASK_INTERRUPTIBLE), to the point where schedule() normally enters its atomic section, which looks like the sanest way to solve this issue, i.e. 
without gory hackery all over the place. Patch will follow later for testing this approach. Jan PS: Out of curiosity I also checked RTAI's migration mechanism in this regard. It's similar except for the fact that it does the gatekeeper's work in the Linux scheduler's tail (i.e. after the next context switch). And RTAI seems to suffer from the very same race. So this is either a fundamental issue - or I'm fundamentally wrong. [1]http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/nucleus/shadow.c?v=SVN-trunk#L1573 [2]http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/nucleus/shadow.c?v=SVN-trunk#L461 [3]http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/nucleus/shadow.c?v=SVN-trunk#L481 -- Philippe.
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Philippe Gerum wrote: > Philippe Gerum wrote: >> Jan Kiszka wrote: >> >>> Gilles Chanteperdrix wrote: >>> Jeroen Van den Keybus wrote: > Hello, > > > I'm currently not at a level to participate in your discussion. Although I'm > willing to supply you with stresstests, I would nevertheless like to learn > more from task migration as this debugging session proceeds. In order to do > so, please confirm the following statements or indicate where I went wrong. > I hope others may learn from this as well. > > xn_shadow_harden(): This is called whenever a Xenomai thread performs a > Linux (root domain) system call (notified by Adeos ?). xnshadow_harden() is called whenever a thread running in secondary mode (that is, running as a regular Linux thread, handled by Linux scheduler) is switching to primary mode (where it will run as a Xenomai thread, handled by Xenomai scheduler). Migrations occur for some system calls. More precisely, Xenomai skin system calls tables associates a few flags with each system call, and some of these flags cause migration of the caller when it issues the system call. Each Xenomai user-space thread has two contexts, a regular Linux thread context, and a Xenomai thread called "shadow" thread. Both contexts share the same stack and program counter, so that at any time, at least one of the two contexts is seen as suspended by the scheduler which handles it. Before xnshadow_harden is called, the Linux thread is running, and its shadow is seen in suspended state with XNRELAX bit by Xenomai scheduler. After xnshadow_harden, the Linux context is seen suspended with INTERRUPTIBLE state by Linux scheduler, and its shadow is seen as running by Xenomai scheduler. The migrating thread > (nRT) is marked INTERRUPTIBLE and run by the Linux kernel > wake_up_interruptible_sync() call. Is this thread actually run or does it > merely put the thread in some Linux to-do list (I assumed the first case) ? 
Here, I am not sure, but it seems that when calling wake_up_interruptible_sync the woken up task is put in the current CPU runqueue, and this task (i.e. the gatekeeper), will not run until the current thread (i.e. the thread running xnshadow_harden) marks itself as suspended and calls schedule(). Maybe, marking the running thread as >>> >>> >>> >>> Depends on CONFIG_PREEMPT. If set, we get a preempt_schedule already >>> here - and a switch if the prio of the woken up task is higher. >>> >>> BTW, an easy way to enforce the current trouble is to remove the "_sync" >>> from wake_up_interruptible. As I understand it this _sync is just an >>> optimisation hint for Linux to avoid needless scheduler runs. >>> >> >> You could not guarantee the following execution sequence doing so >> either, i.e. >> >> 1- current wakes up the gatekeeper >> 2- current goes sleeping to exit the Linux runqueue in schedule() >> 3- the gatekeeper resumes the shadow-side of the old current >> >> The point is all about making 100% sure that current is going to be >> unlinked from the Linux runqueue before the gatekeeper processes the >> resumption request, whatever event the kernel is processing >> asynchronously in the meantime. This is the reason why, as you already >> noticed, preempt_schedule_irq() nicely breaks our toy by stealing the >> CPU from the hardening thread whilst keeping it linked to the >> runqueue: upon return from such preemption, the gatekeeper might have >> run already, hence the newly hardened thread ends up being seen as >> runnable by both the Linux and Xeno schedulers. Rainy day indeed. >> >> We could rely on giving "current" the highest SCHED_FIFO priority in >> xnshadow_harden() before waking up the gk, until the gk eventually >> promotes it to the Xenomai scheduling mode and downgrades this >> priority back to normal, but we would pay additional latencies induced >> by each aborted rescheduling attempt that may occur during the atomic >> path we want to enforce. 
>> >> The other way is to make sure that no in-kernel preemption of the >> hardening task could occur after step 1) and until step 2) is >> performed, given that we cannot currently call schedule() with >> interrupts or preemption off. I'm on it. >> > > Could anyone interested in this issue test the following couple of patches? > > atomic-switch-state.patch is to be applied against Adeos-1.1-03/x86 for > 2.6.15 > atomic-wakeup-and-schedule.patch is to be applied against Xeno 2.1-rc2 > > Both patches are needed to fix the issue. > > TIA, > Looks good. I tried Jeroen's test-case and I was not able to reproduce the crash anymore. I think it's time for a new ipipe-release. ;) At this chance: any comments on the panic-freeze extension for the tracer? I need to rework the Xenomai patch, but the ipipe side should be re
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Philippe Gerum wrote: Jan Kiszka wrote: Gilles Chanteperdrix wrote: Jeroen Van den Keybus wrote: > Hello, > > > I'm currently not at a level to participate in your discussion. Although I'm > willing to supply you with stresstests, I would nevertheless like to learn > more from task migration as this debugging session proceeds. In order to do > so, please confirm the following statements or indicate where I went wrong. > I hope others may learn from this as well. > > xn_shadow_harden(): This is called whenever a Xenomai thread performs a > Linux (root domain) system call (notified by Adeos ?). xnshadow_harden() is called whenever a thread running in secondary mode (that is, running as a regular Linux thread, handled by Linux scheduler) is switching to primary mode (where it will run as a Xenomai thread, handled by Xenomai scheduler). Migrations occur for some system calls. More precisely, Xenomai skin system calls tables associates a few flags with each system call, and some of these flags cause migration of the caller when it issues the system call. Each Xenomai user-space thread has two contexts, a regular Linux thread context, and a Xenomai thread called "shadow" thread. Both contexts share the same stack and program counter, so that at any time, at least one of the two contexts is seen as suspended by the scheduler which handles it. Before xnshadow_harden is called, the Linux thread is running, and its shadow is seen in suspended state with XNRELAX bit by Xenomai scheduler. After xnshadow_harden, the Linux context is seen suspended with INTERRUPTIBLE state by Linux scheduler, and its shadow is seen as running by Xenomai scheduler. The migrating thread > (nRT) is marked INTERRUPTIBLE and run by the Linux kernel > wake_up_interruptible_sync() call. Is this thread actually run or does it > merely put the thread in some Linux to-do list (I assumed the first case) ? 
Here, I am not sure, but it seems that when calling wake_up_interruptible_sync the woken up task is put in the current CPU runqueue, and this task (i.e. the gatekeeper), will not run until the current thread (i.e. the thread running xnshadow_harden) marks itself as suspended and calls schedule(). Maybe, marking the running thread as Depends on CONFIG_PREEMPT. If set, we get a preempt_schedule already here - and a switch if the prio of the woken up task is higher. BTW, an easy way to enforce the current trouble is to remove the "_sync" from wake_up_interruptible. As I understand it this _sync is just an optimisation hint for Linux to avoid needless scheduler runs. You could not guarantee the following execution sequence doing so either, i.e. 1- current wakes up the gatekeeper 2- current goes sleeping to exit the Linux runqueue in schedule() 3- the gatekeeper resumes the shadow-side of the old current The point is all about making 100% sure that current is going to be unlinked from the Linux runqueue before the gatekeeper processes the resumption request, whatever event the kernel is processing asynchronously in the meantime. This is the reason why, as you already noticed, preempt_schedule_irq() nicely breaks our toy by stealing the CPU from the hardening thread whilst keeping it linked to the runqueue: upon return from such preemption, the gatekeeper might have run already, hence the newly hardened thread ends up being seen as runnable by both the Linux and Xeno schedulers. Rainy day indeed. We could rely on giving "current" the highest SCHED_FIFO priority in xnshadow_harden() before waking up the gk, until the gk eventually promotes it to the Xenomai scheduling mode and downgrades this priority back to normal, but we would pay additional latencies induced by each aborted rescheduling attempt that may occur during the atomic path we want to enforce. 
The other way is to make sure that no in-kernel preemption of the hardening task could occur after step 1) and until step 2) is performed, given that we cannot currently call schedule() with interrupts or preemption off. I'm on it. Could anyone interested in this issue test the following couple of patches? atomic-switch-state.patch is to be applied against Adeos-1.1-03/x86 for 2.6.15 atomic-wakeup-and-schedule.patch is to be applied against Xeno 2.1-rc2 Both patches are needed to fix the issue. TIA, -- Philippe.
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Philippe Gerum wrote: Jan Kiszka wrote: Gilles Chanteperdrix wrote: Jeroen Van den Keybus wrote: > Hello, > > > I'm currently not at a level to participate in your discussion. Although I'm > willing to supply you with stresstests, I would nevertheless like to learn > more from task migration as this debugging session proceeds. In order to do > so, please confirm the following statements or indicate where I went wrong. > I hope others may learn from this as well. > > xn_shadow_harden(): This is called whenever a Xenomai thread performs a > Linux (root domain) system call (notified by Adeos ?). xnshadow_harden() is called whenever a thread running in secondary mode (that is, running as a regular Linux thread, handled by Linux scheduler) is switching to primary mode (where it will run as a Xenomai thread, handled by Xenomai scheduler). Migrations occur for some system calls. More precisely, Xenomai skin system calls tables associates a few flags with each system call, and some of these flags cause migration of the caller when it issues the system call. Each Xenomai user-space thread has two contexts, a regular Linux thread context, and a Xenomai thread called "shadow" thread. Both contexts share the same stack and program counter, so that at any time, at least one of the two contexts is seen as suspended by the scheduler which handles it. Before xnshadow_harden is called, the Linux thread is running, and its shadow is seen in suspended state with XNRELAX bit by Xenomai scheduler. After xnshadow_harden, the Linux context is seen suspended with INTERRUPTIBLE state by Linux scheduler, and its shadow is seen as running by Xenomai scheduler. The migrating thread > (nRT) is marked INTERRUPTIBLE and run by the Linux kernel > wake_up_interruptible_sync() call. Is this thread actually run or does it > merely put the thread in some Linux to-do list (I assumed the first case) ? 
Here, I am not sure, but it seems that when calling wake_up_interruptible_sync the woken up task is put in the current CPU runqueue, and this task (i.e. the gatekeeper), will not run until the current thread (i.e. the thread running xnshadow_harden) marks itself as suspended and calls schedule(). Maybe, marking the running thread as Depends on CONFIG_PREEMPT. If set, we get a preempt_schedule already here - and a switch if the prio of the woken up task is higher. BTW, an easy way to enforce the current trouble is to remove the "_sync" from wake_up_interruptible. As I understand it this _sync is just an optimisation hint for Linux to avoid needless scheduler runs. You could not guarantee the following execution sequence doing so either, i.e. 1- current wakes up the gatekeeper 2- current goes sleeping to exit the Linux runqueue in schedule() 3- the gatekeeper resumes the shadow-side of the old current The point is all about making 100% sure that current is going to be unlinked from the Linux runqueue before the gatekeeper processes the resumption request, whatever event the kernel is processing asynchronously in the meantime. This is the reason why, as you already noticed, preempt_schedule_irq() nicely breaks our toy by stealing the CPU from the hardening thread whilst keeping it linked to the runqueue: upon return from such preemption, the gatekeeper might have run already, hence the newly hardened thread ends up being seen as runnable by both the Linux and Xeno schedulers. Rainy day indeed. We could rely on giving "current" the highest SCHED_FIFO priority in xnshadow_harden() before waking up the gk, until the gk eventually promotes it to the Xenomai scheduling mode and downgrades this priority back to normal, but we would pay additional latencies induced by each aborted rescheduling attempt that may occur during the atomic path we want to enforce. 
The other way is to make sure that no in-kernel preemption of the hardening task could occur after step 1) and until step 2) is performed, given that we cannot currently call schedule() with interrupts or preemption off. I'm on it. > Could anyone interested in this issue test the following couple of patches? > atomic-switch-state.patch is to be applied against Adeos-1.1-03/x86 for 2.6.15 > atomic-wakeup-and-schedule.patch is to be applied against Xeno 2.1-rc2 > Both patches are needed to fix the issue. > TIA, And now, Ladies and Gentlemen, with the patches attached.

-- Philippe.

--- 2.6.15-x86/kernel/sched.c	2006-01-07 15:18:31.0 +0100
+++ 2.6.15-ipipe/kernel/sched.c	2006-01-30 15:15:27.0 +0100
@@ -2963,7 +2963,7 @@
 	 * Otherwise, whine if we are scheduling when we should not be.
 	 */
 	if (likely(!current->exit_state)) {
-		if (unlikely(in_atomic())) {
+		if (unlikely(!(current->state & TASK_ATOMICSWITCH) && in_atomic())) {
 			printk(KERN_ERR "scheduling while atomic: "
 				"%s/0x%08x/%d\n",
 				current->comm, preempt_count(), current->pid);
@@ -2972,8 +2972,13 @@
 	profile_hit(SCHED_PROFILING, __builtin_return_ad
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Jan Kiszka wrote: Gilles Chanteperdrix wrote: Jeroen Van den Keybus wrote: > Hello, > > > I'm currently not at a level to participate in your discussion. Although I'm > willing to supply you with stresstests, I would nevertheless like to learn > more from task migration as this debugging session proceeds. In order to do > so, please confirm the following statements or indicate where I went wrong. > I hope others may learn from this as well. > > xn_shadow_harden(): This is called whenever a Xenomai thread performs a > Linux (root domain) system call (notified by Adeos ?). xnshadow_harden() is called whenever a thread running in secondary mode (that is, running as a regular Linux thread, handled by Linux scheduler) is switching to primary mode (where it will run as a Xenomai thread, handled by Xenomai scheduler). Migrations occur for some system calls. More precisely, Xenomai skin system calls tables associates a few flags with each system call, and some of these flags cause migration of the caller when it issues the system call. Each Xenomai user-space thread has two contexts, a regular Linux thread context, and a Xenomai thread called "shadow" thread. Both contexts share the same stack and program counter, so that at any time, at least one of the two contexts is seen as suspended by the scheduler which handles it. Before xnshadow_harden is called, the Linux thread is running, and its shadow is seen in suspended state with XNRELAX bit by Xenomai scheduler. After xnshadow_harden, the Linux context is seen suspended with INTERRUPTIBLE state by Linux scheduler, and its shadow is seen as running by Xenomai scheduler. The migrating thread > (nRT) is marked INTERRUPTIBLE and run by the Linux kernel > wake_up_interruptible_sync() call. Is this thread actually run or does it > merely put the thread in some Linux to-do list (I assumed the first case) ? 
Here, I am not sure, but it seems that when calling wake_up_interruptible_sync the woken-up task is put in the current CPU runqueue, and this task (i.e. the gatekeeper) will not run until the current thread (i.e. the thread running xnshadow_harden) marks itself as suspended and calls schedule(). Maybe marking the running thread as suspended is not needed, since the gatekeeper may have a high priority, and calling schedule() is enough. In any case, the woken-up thread does not seem to be run immediately, so this rather looks like the second case.

Since in xnshadow_harden the running thread marks itself as suspended before running wake_up_interruptible_sync, the gatekeeper will run when schedule() gets called, which in turn depends on the CONFIG_PREEMPT* configuration. In the non-preempt case, the current thread will be suspended and the gatekeeper will run when schedule() is explicitly called in xnshadow_harden(). In the preempt case, schedule gets called when the outermost spinlock is unlocked in wake_up_interruptible_sync().

Depends on CONFIG_PREEMPT. If set, we get a preempt_schedule already here - and a switch if the prio of the woken up task is higher. BTW, an easy way to enforce the current trouble is to remove the "_sync" from wake_up_interruptible.

As I understand it, this _sync is just an optimisation hint for Linux to avoid needless scheduler runs. You could not guarantee the following execution sequence doing so either, i.e.

1- current wakes up the gatekeeper
2- current goes sleeping to exit the Linux runqueue in schedule()
3- the gatekeeper resumes the shadow-side of the old current

The point is all about making 100% sure that current is going to be unlinked from the Linux runqueue before the gatekeeper processes the resumption request, whatever event the kernel is processing asynchronously in the meantime. This is the reason why, as you already noticed, preempt_schedule_irq() nicely breaks our toy by stealing the CPU from the hardening thread whilst keeping it linked to the runqueue: upon return from such preemption, the gatekeeper might have run already, hence the newly hardened thread ends up being seen as runnable by both the Linux and Xeno schedulers. Rainy day indeed.

We could rely on giving "current" the highest SCHED_FIFO priority in xnshadow_harden() before waking up the gk, until the gk eventually promotes it to the Xenomai scheduling mode and downgrades this priority back to normal, but we would pay additional latencies induced by each aborted rescheduling attempt that may occur during the atomic path we want to enforce.

The other way is to make sure that no in-kernel preemption of the hardening task could occur after step 1) and until step 2) is performed, given that we cannot currently call schedule() with interrupts or preemption off. I'm on it.

> And how does it terminate: is only the system call migrated or is the
> thread allowed to continue run (at a priority level equal to the Xenomai
> priority level) until it hits something of the Xenomai API (or trivially:
> explicitly go to RT using th
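The three-step sequence above can be condensed into a toy user-space model of the invariant at stake: the Linux context and its Xenomai shadow share one stack, so the two must never be runnable at once. All names below are made up for illustration; this is not Xenomai or kernel code:

```c
#include <stdbool.h>

/* Toy model of the hardening handoff. */
struct shadow_model {
    bool linux_runnable;   /* task still linked to the Linux runqueue */
    bool shadow_runnable;  /* shadow resumed by the gatekeeper        */
};

/* Intended ordering: schedule() dequeues current (step 2) before the
 * gatekeeper resumes the shadow (step 3). Invariant holds. */
bool harden_in_order(struct shadow_model *m)
{
    m->linux_runnable = false;   /* deactivate_task() in schedule() */
    m->shadow_runnable = true;   /* gatekeeper resumes the shadow   */
    return !(m->linux_runnable && m->shadow_runnable);
}

/* Racy ordering under CONFIG_PREEMPT: preempt_schedule_irq() lets the
 * gatekeeper run while current is still on the runqueue, so the shadow
 * is resumed before the dequeue happens. Invariant broken. */
bool harden_preempted(struct shadow_model *m)
{
    m->shadow_runnable = true;   /* gatekeeper ran too early */
    return !(m->linux_runnable && m->shadow_runnable);
}
```

The model only captures the ordering argument: whichever path runs, the fix must guarantee the dequeue happens strictly before the resume.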
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
On 30/01/06, Jan Kiszka <[EMAIL PROTECTED]> wrote:
> Dmitry Adamushko wrote:
>> ...
>> I have not checked it yet but my presupposition that something as easy as:
>>
>> preempt_disable()
>>
>> wake_up_interruptible_sync();
>> schedule();
>>
>> preempt_enable();
>
> It's a no-go: "scheduling while atomic". One of my first attempts to
> solve it.

My fault. I meant the way preempt_schedule() and preempt_schedule_irq() call schedule() while being non-preemptible. To this end, PREEMPT_ACTIVE is set up. The use of preempt_enable/disable() here is wrong.

> The only way to enter schedule() without being preemptible is via
> PREEMPT_ACTIVE. But the effect of that flag should be well-known now.
> Kind of Gordian knot. :(

Maybe I have missed something so just for my curiosity: what does prevent the use of PREEMPT_ACTIVE here? We don't have a "preempted while atomic" message here as it seems to be a legal way to call schedule() with that flag being set up.

> When PREEMPT_ACTIVE is set, task gets /preempted/ but not removed from
> the run queue - independent of its current status.

Err... that's exactly the reason I have explained in my first mail for this thread :) Blah.. I wish I was smoking something special before so I would point that as the reason of my forgetfulness.

Actually, we could use PREEMPT_ACTIVE indeed + something else (probably another flag) to distinguish between the case when PREEMPT_ACTIVE is set by Linux and the case when it's set by xnshadow_harden().

xnshadow_harden()
{
	struct task_struct *this_task = current;
	...
	xnthread_t *thread = xnshadow_thread(this_task);

	if (!thread)
		return;
	...
	gk->thread = thread;
+	add_preempt_count(PREEMPT_ACTIVE); // should be checked in schedule()
+	xnthread_set_flags(thread, XNATOMIC_TRANSIT);
	set_current_state(TASK_INTERRUPTIBLE);
	wake_up_interruptible_sync(&gk->waitq);
+	schedule();
+	sub_preempt_count(PREEMPT_ACTIVE);
	...
}

Then, something like the following code should be called from schedule():

void ipipe_transit_cleanup(struct task_struct *task, runqueue_t *rq)
{
	xnthread_t *thread = xnshadow_thread(task);

	if (!thread)
		return;

	if (xnthread_test_flags(thread, XNATOMIC_TRANSIT)) {
		xnthread_clear_flags(thread, XNATOMIC_TRANSIT);
		deactivate_task(task, rq);
	}
}

- sched.c:

	...
	switch_count = &prev->nivcsw;
	if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
		switch_count = &prev->nvcsw;
		if (unlikely((prev->state & TASK_INTERRUPTIBLE) &&
				unlikely(signal_pending(prev))))
			prev->state = TASK_RUNNING;
		else {
			if (prev->state == TASK_UNINTERRUPTIBLE)
				rq->nr_uninterruptible++;
			deactivate_task(prev, rq);
		}
	}

+	// removes a task from the active queue if PREEMPT_ACTIVE +
+	// XNATOMIC_TRANSIT
+	#ifdef CONFIG_IPIPE
+	ipipe_transit_cleanup(prev, rq);
+	#endif /* CONFIG_IPIPE */
	...

Not very graceful maybe, but it could work - or am I missing something important?

--
Best regards,
Dmitry Adamushko

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
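The decision Dmitry's ipipe_transit_cleanup() encodes can be checked in isolation: dequeue the outgoing task only when PREEMPT_ACTIVE and the proposed transit flag are both present. Below is a toy user-space model of just that predicate; the two constants are made-up stand-ins for the real PREEMPT_ACTIVE bit and the suggested XNATOMIC_TRANSIT thread flag:

```c
#include <stdbool.h>

/* Illustrative stand-ins, not the real kernel/Xenomai values. */
#define MODEL_PREEMPT_ACTIVE   0x10000000u
#define MODEL_XNATOMIC_TRANSIT 0x00000001u

/* Plain Linux preemption (PREEMPT_ACTIVE alone) must keep the task on
 * the runqueue; only a hardening task that also set the TRANSIT flag
 * may be dequeued on its behalf from within schedule(). */
bool transit_should_deactivate(unsigned int preempt_count,
                               unsigned int xnflags)
{
    return (preempt_count & MODEL_PREEMPT_ACTIVE) &&
           (xnflags & MODEL_XNATOMIC_TRANSIT);
}
```

This is exactly the disambiguation the proposal needs: PREEMPT_ACTIVE set by ordinary preemption leaves the task queued, while the combined flags mark a deliberate atomic switch.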
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Dmitry Adamushko wrote:
>>> ...
>
>>> I have not checked it yet but my presupposition that something as easy as:
>>>
>>> preempt_disable()
>>>
>>> wake_up_interruptible_sync();
>>> schedule();
>>>
>>> preempt_enable();
>>
>> It's a no-go: "scheduling while atomic". One of my first attempts to
>> solve it.
>
> My fault. I meant the way preempt_schedule() and preempt_schedule_irq() call
> schedule() while being non-preemptible.
> To this end, PREEMPT_ACTIVE is set up.
> The use of preempt_enable/disable() here is wrong.
>
>> The only way to enter schedule() without being preemptible is via
>> PREEMPT_ACTIVE. But the effect of that flag should be well-known now.
>> Kind of Gordian knot. :(
>
> Maybe I have missed something so just for my curiosity: what does prevent
> the use of PREEMPT_ACTIVE here?
> We don't have a "preempted while atomic" message here as it seems to be a
> legal way to call schedule() with that flag being set up.

When PREEMPT_ACTIVE is set, task gets /preempted/ but not removed from the run queue - independent of its current status.

>>> could work... err.. and don't blame me if no, it's some one else who has
>>> written that nonsense :o)
>>>
>>> --
>>> Best regards,
>>> Dmitry Adamushko

signature.asc Description: OpenPGP digital signature
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Philippe Gerum wrote:
Jan Kiszka wrote:
Hi,

well, if I'm not totally wrong, we have a design problem in the RT-thread hardening path. I dug into the crash Jeroen reported and I'm quite sure that this is the reason. So that's the bad news. The good one is that we can at least work around it by switching off CONFIG_PREEMPT for Linux (this implicitly means that it's a 2.6-only issue).

@Jeroen: Did you verify that your setup also works fine without CONFIG_PREEMPT?

But let's start with two assumptions my further analysis is based on:

[Xenomai]
o Shadow threads have only one stack, i.e. one context. If the real-time part is active (this includes it is blocked on some xnsynch object or delayed), the original Linux task must NEVER EVER be executed, even if it will immediately fall asleep again. That's because the stack is in use by the real-time part at that time. And this condition is checked in do_schedule_event() [1].

[Linux]
o A Linux task which has called set_current_state() will remain in the run-queue as long as it calls schedule() on its own. This means that it can be preempted (if CONFIG_PREEMPT is set) between set_current_state() and schedule() and then even be resumed again. Only the explicit call of schedule() will trigger deactivate_task() which will in turn remove current from the run-queue.

Ok, if this is true, let's have a look at xnshadow_harden(): After grabbing the gatekeeper sem and putting itself in gk->thread, a task going for RT then marks itself TASK_INTERRUPTIBLE and wakes up the gatekeeper [2]. This does not include a Linux reschedule due to the _sync version of wake_up_interruptible. What can happen now?

1) No interruption until we finally call schedule() [3]. All fine as we will not be removed from the run-queue before the gatekeeper starts kicking our RT part, thus no conflict in using the thread's stack.

2) Interruption by a RT IRQ. This would just delay the path described above, even if some RT threads get executed. Once they are finished, we continue in xnshadow_harden() - given that the RT part does not trigger the following case:

3) Interruption by some Linux IRQ. This may cause other threads to become runnable as well, but the gatekeeper has the highest prio and will therefore be the next. The problem is that the rescheduling on Linux IRQ exit will PREEMPT our task in xnshadow_harden(), it will NOT remove it from the Linux run-queue. And now we are in real trouble: The gatekeeper will kick off our RT part which will take over the thread's stack. As soon as the RT domain falls asleep and Linux takes over again, it will continue our non-RT part as well! Actually, this seems to be the reason for the panic in do_schedule_event(). Without CONFIG_XENO_OPT_DEBUG and this check, we will run both parts AT THE SAME TIME now, thus violating my first assumption. The system gets fatally corrupted.

Yep, that's it. And we may not lock out the interrupts before calling schedule to prevent that.

Well, I would be happy if someone can prove me wrong here. The problem is that I don't see a solution because Linux does not provide an atomic wake-up + schedule-out under CONFIG_PREEMPT. I'm currently considering a hack to remove the migrating Linux thread manually from the run-queue, but this could easily break the Linux scheduler.

Maybe the best way would be to provide atomic wakeup-and-schedule support into the Adeos patch for Linux tasks; previous attempts to fix this by circumventing the potential for preemption from outside of the scheduler code have all failed, and this bug is uselessly lingering for that reason. Having slept on this, I'm going to add a simple extension to the Linux scheduler available from Adeos, in order to get an atomic/unpreemptable path from the statement where the current task's state is changed for suspension (e.g. TASK_INTERRUPTIBLE), to the point where schedule() normally enters its atomic section, which looks like the sanest way to solve this issue, i.e. without gory hackery all over the place. Patch will follow later for testing this approach.

Jan

PS: Out of curiosity I also checked RTAI's migration mechanism in this regard. It's similar except for the fact that it does the gatekeeper's work in the Linux scheduler's tail (i.e. after the next context switch). And RTAI, it seems, suffers from the very same race. So this is either a fundamental issue - or I'm fundamentally wrong.

[1] http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/nucleus/shadow.c?v=SVN-trunk#L1573
[2] http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/nucleus/shadow.c?v=SVN-trunk#L461
[3] http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/nucleus/shadow.c?v=SVN-trunk#L481

-- Philippe.
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Jan Kiszka wrote:
Hi,

well, if I'm not totally wrong, we have a design problem in the RT-thread hardening path. I dug into the crash Jeroen reported and I'm quite sure that this is the reason. So that's the bad news. The good one is that we can at least work around it by switching off CONFIG_PREEMPT for Linux (this implicitly means that it's a 2.6-only issue).

@Jeroen: Did you verify that your setup also works fine without CONFIG_PREEMPT?

But let's start with two assumptions my further analysis is based on:

[Xenomai]
o Shadow threads have only one stack, i.e. one context. If the real-time part is active (this includes it is blocked on some xnsynch object or delayed), the original Linux task must NEVER EVER be executed, even if it will immediately fall asleep again. That's because the stack is in use by the real-time part at that time. And this condition is checked in do_schedule_event() [1].

[Linux]
o A Linux task which has called set_current_state() will remain in the run-queue as long as it calls schedule() on its own. This means that it can be preempted (if CONFIG_PREEMPT is set) between set_current_state() and schedule() and then even be resumed again. Only the explicit call of schedule() will trigger deactivate_task() which will in turn remove current from the run-queue.

Ok, if this is true, let's have a look at xnshadow_harden(): After grabbing the gatekeeper sem and putting itself in gk->thread, a task going for RT then marks itself TASK_INTERRUPTIBLE and wakes up the gatekeeper [2]. This does not include a Linux reschedule due to the _sync version of wake_up_interruptible. What can happen now?

1) No interruption until we finally call schedule() [3]. All fine as we will not be removed from the run-queue before the gatekeeper starts kicking our RT part, thus no conflict in using the thread's stack.

2) Interruption by a RT IRQ. This would just delay the path described above, even if some RT threads get executed. Once they are finished, we continue in xnshadow_harden() - given that the RT part does not trigger the following case:

3) Interruption by some Linux IRQ. This may cause other threads to become runnable as well, but the gatekeeper has the highest prio and will therefore be the next. The problem is that the rescheduling on Linux IRQ exit will PREEMPT our task in xnshadow_harden(), it will NOT remove it from the Linux run-queue. And now we are in real trouble: The gatekeeper will kick off our RT part which will take over the thread's stack. As soon as the RT domain falls asleep and Linux takes over again, it will continue our non-RT part as well! Actually, this seems to be the reason for the panic in do_schedule_event(). Without CONFIG_XENO_OPT_DEBUG and this check, we will run both parts AT THE SAME TIME now, thus violating my first assumption. The system gets fatally corrupted.

Yep, that's it. And we may not lock out the interrupts before calling schedule to prevent that.

Well, I would be happy if someone can prove me wrong here. The problem is that I don't see a solution because Linux does not provide an atomic wake-up + schedule-out under CONFIG_PREEMPT. I'm currently considering a hack to remove the migrating Linux thread manually from the run-queue, but this could easily break the Linux scheduler.

Maybe the best way would be to provide atomic wakeup-and-schedule support into the Adeos patch for Linux tasks; previous attempts to fix this by circumventing the potential for preemption from outside of the scheduler code have all failed, and this bug is uselessly lingering for that reason.

Jan

PS: Out of curiosity I also checked RTAI's migration mechanism in this regard. It's similar except for the fact that it does the gatekeeper's work in the Linux scheduler's tail (i.e. after the next context switch). And RTAI, it seems, suffers from the very same race. So this is either a fundamental issue - or I'm fundamentally wrong.

[1] http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/nucleus/shadow.c?v=SVN-trunk#L1573
[2] http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/nucleus/shadow.c?v=SVN-trunk#L461
[3] http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/nucleus/shadow.c?v=SVN-trunk#L481

-- Philippe.
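Jan's second assumption - that set_current_state() only changes the task state while the dequeue happens later, inside schedule() - is the crux of the race window. A toy user-space model of that two-step behaviour (all names prefixed model_ are made up; this is not the Linux API):

```c
#include <stdbool.h>

/* Minimal model of the Linux-side assumption. */
enum model_state { MODEL_RUNNING, MODEL_INTERRUPTIBLE };

struct model_task {
    enum model_state state;
    bool on_runqueue;
};

/* set_current_state() analogue: state changes, task stays queued.
 * The window between this call and model_schedule() is where
 * CONFIG_PREEMPT can preempt - and even resume - the task. */
void model_set_current_state(struct model_task *t, enum model_state s)
{
    t->state = s;
}

/* schedule() analogue: only here does a non-running task get dequeued
 * (the deactivate_task() step). */
void model_schedule(struct model_task *t)
{
    if (t->state != MODEL_RUNNING)
        t->on_runqueue = false;
}
```

The model makes the window explicit: between the two calls the task is simultaneously "going to sleep" and still runnable, which is exactly what the gatekeeper must not observe.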
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Dmitry Adamushko wrote:
> On 23/01/06, Gilles Chanteperdrix <[EMAIL PROTECTED]> wrote:
>> Jeroen Van den Keybus wrote:
>>> Hello,
>
>> [ skip-skip-skip ]
>
>> Since in xnshadow_harden, the running thread marks itself as suspended
>> before running wake_up_interruptible_sync, the gatekeeper will run when
>> schedule() get called, which in turn, depend on the CONFIG_PREEMPT*
>> configuration. In the non-preempt case, the current thread will be
>> suspended and the gatekeeper will run when schedule() is explicitely
>> called in xnshadow_harden(). In the preempt case, schedule gets called
>> when the outermost spinlock is unlocked in wake_up_interruptible_sync().
>
> In fact, no.
>
> wake_up_interruptible_sync() doesn't set the need_resched "flag" up. That's
> why it's "sync" actually.
>
> Only if the need_resched was already set before calling
> wake_up_interruptible_sync(), then yes.
>
> The sequence is as follows:
>
> wake_up_interruptible_sync ---> wake_up_sync ---> wake_up_common(...,
> sync=1, ...) ---> ... ---> try_to_wake_up(..., sync=1)
>
> Look at the end of try_to_wake_up() to see when it calls resched_task().
> The comment there speaks for itself.
>
> So let's suppose need_resched == 0 (it's per-task of course).
> As a result of wake_up_interruptible_sync() the new task is added to the
> current active run-queue but need_resched remains to be unset in the hope
> that the waker will call schedule() on its own soon.
>
> I have CONFIG_PREEMPT set on my machine but I have never encountered a bug
> described by Jan.
>
> The catalyst of the problem, I guess, is that some IRQ interrupts a task
> between wake_up_interruptible_sync() and schedule() and its ISR, in turn,
> wakes up another task which prio is higher than the one of our waker (as a
> result, the need_resched flag is set). And now, rescheduling occurs on
> return from irq handling code (ret_from_intr -> ...-> preempt_schedule_irq()
> -> schedule()).

Yes, this is exactly what happened.

I unfortunately have not saved a related trace I took with the extended ipipe-tracer (the one I sent ends too early), but they showed a preemption right after the wake_up, first by one of the other real-time threads in Jeroen's scenario, and then, as a result of some xnshadow_relax() of that thread, a Linux preempt_schedule to the gatekeeper. We do not see this bug that often as it requires a specific load and it must hit a really small race window.

> Some events should coincide, yep. But I guess that problem does not occur
> every time?
>
> I have not checked it yet but my presupposition that something as easy as:
>
> preempt_disable()
>
> wake_up_interruptible_sync();
> schedule();
>
> preempt_enable();

It's a no-go: "scheduling while atomic". One of my first attempts to solve it.

The only way to enter schedule() without being preemptible is via PREEMPT_ACTIVE. But the effect of that flag should be well-known now. Kind of Gordian knot. :(

> could work... err.. and don't blame me if no, it's some one else who has
> written that nonsense :o)
>
> --
> Best regards,
> Dmitry Adamushko

Jan
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
On 23/01/06, Gilles Chanteperdrix <[EMAIL PROTECTED]> wrote:
> Jeroen Van den Keybus wrote:
>> Hello,
>
> [ skip-skip-skip ]
>
> Since in xnshadow_harden, the running thread marks itself as suspended
> before running wake_up_interruptible_sync, the gatekeeper will run when
> schedule() get called, which in turn, depend on the CONFIG_PREEMPT*
> configuration. In the non-preempt case, the current thread will be
> suspended and the gatekeeper will run when schedule() is explicitely
> called in xnshadow_harden(). In the preempt case, schedule gets called
> when the outermost spinlock is unlocked in wake_up_interruptible_sync().

In fact, no.

wake_up_interruptible_sync() doesn't set the need_resched "flag" up. That's why it's "sync" actually.

Only if the need_resched was already set before calling wake_up_interruptible_sync(), then yes.

The sequence is as follows:

wake_up_interruptible_sync ---> wake_up_sync ---> wake_up_common(..., sync=1, ...) ---> ... ---> try_to_wake_up(..., sync=1)

Look at the end of try_to_wake_up() to see when it calls resched_task(). The comment there speaks for itself.

So let's suppose need_resched == 0 (it's per-task of course). As a result of wake_up_interruptible_sync() the new task is added to the current active run-queue but need_resched remains to be unset, in the hope that the waker will call schedule() on its own soon.

I have CONFIG_PREEMPT set on my machine but I have never encountered the bug described by Jan.

The catalyst of the problem, I guess, is that some IRQ interrupts a task between wake_up_interruptible_sync() and schedule() and its ISR, in turn, wakes up another task whose prio is higher than the one of our waker (as a result, the need_resched flag is set). And now, rescheduling occurs on return from the irq handling code (ret_from_intr -> ... -> preempt_schedule_irq() -> schedule()).

Some events should coincide, yep. But I guess that problem does not occur every time?

I have not checked it yet but my presupposition that something as easy as:

preempt_disable()

wake_up_interruptible_sync();
schedule();

preempt_enable();

could work... err.. and don't blame me if no, it's some one else who has written that nonsense :o)

--
Best regards,
Dmitry Adamushko
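The sync-wakeup semantics Dmitry walks through above can be summarised in a toy model: a "sync" wakeup enqueues the woken task without flagging a reschedule, trusting the waker to call schedule() itself soon. The names below are illustrative stand-ins, not the real Linux API:

```c
#include <stdbool.h>

/* Minimal model of try_to_wake_up(..., sync) as described above. */
struct model_rq {
    bool woken_enqueued;  /* woken task added to the active runqueue */
    bool need_resched;    /* reschedule requested against the waker  */
};

void model_wake_up(struct model_rq *rq, bool sync)
{
    rq->woken_enqueued = true;    /* enqueue happens either way       */
    if (!sync)
        rq->need_resched = true;  /* resched_task() on a plain wakeup */
}
```

This also restates why the race needs an external trigger: the sync wakeup itself leaves need_resched clear, so only an intervening IRQ that wakes a higher-priority task sets the flag and forces the premature reschedule.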
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Dmitry Adamushko wrote: > On 23/01/06, Gilles Chanteperdrix <[EMAIL PROTECTED]> wrote: >> Jeroen Van den Keybus wrote: >>> Hello, > > > >> [ skip-skip-skip ] >> > > >> Since in xnshadow_harden, the running thread marks itself as suspended >> before running wake_up_interruptible_sync, the gatekeeper will run when >> schedule() get called, which in turn, depend on the CONFIG_PREEMPT* >> configuration. In the non-preempt case, the current thread will be >> suspended and the gatekeeper will run when schedule() is explicitely >> called in xnshadow_harden(). In the preempt case, schedule gets called >> when the outermost spinlock is unlocked in wake_up_interruptible_sync(). > > > In fact, no. > > wake_up_interruptible_sync() doesn't set the need_resched "flag" up. That's > why it's "sync" actually. > > Only if the need_resched was already set before calling > wake_up_interruptible_sync(), then yes. > > The secuence is as follows : > > wake_up_interruptible_sync ---> wake_up_sync ---> wake_up_common(..., > sync=1, ...) ---> ... ---> try_to_wake_up(..., sync=1) > > Look at the end of try_to_wake_up() to see when it calls resched_task(). > The comment there speaks for itself. > > So let's suppose need_resched == 0 (it's per-task of course). > As a result of wake_up_interruptible_sync() the new task is added to the > current active run-queue but need_resched remains to be unset in the hope > that the waker will call schedule() on its own soon. > > I have CONFIG_PREEMPT set on my machine but I have never encountered a bug > described by Jan. > > The catalyst of the problem, I guess, is that some IRQ interrupts a task > between wake_up_interruptible_sync() and schedule() and its ISR, in turn, > wakes up another task which prio is higher than the one of our waker (as a > result, the need_resched flag is set). And now, rescheduling occurs on > return from irq handling code (ret_from_intr -> ...-> preempt_irq_schedule() > -> schedule()). Yes, this is exactly what happened. 
I unfortunately have not saved a related trace I took with the extended ipipe-tracer (the one I sent ends too early), but it showed a preemption right after the wake_up, first by one of the other real-time threads in Jeroen's scenario, and then, as a result of some xnshadow_relax() of that thread, a Linux preempt_schedule to the gatekeeper. We do not see this bug that often as it requires a specific load and it must hit a really small race window. > > Some events should coincide, yep. But I guess that problem does not occur > every time? > > I have not checked it yet, but my presupposition is that something as easy as: > > preempt_disable() > > wake_up_interruptible_sync(); > schedule(); > > preempt_enable(); It's a no-go: "scheduling while atomic". One of my first attempts to solve it. The only way to enter schedule() without being preemptible is via PREEMPT_ACTIVE. But the effect of that flag should be well-known now. Kind of Gordian knot. :( > > > could work... err... and don't blame me if not, it's someone else who has > written that nonsense :o) > > -- > Best regards, > Dmitry Adamushko > Jan signature.asc Description: OpenPGP digital signature ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Gilles Chanteperdrix wrote: > Jeroen Van den Keybus wrote: > > Hello, > > > > > > I'm currently not at a level to participate in your discussion. Although > I'm > > willing to supply you with stresstests, I would nevertheless like to learn > > more from task migration as this debugging session proceeds. In order to do > > so, please confirm the following statements or indicate where I went wrong. > > I hope others may learn from this as well. > > > > xn_shadow_harden(): This is called whenever a Xenomai thread performs a > > Linux (root domain) system call (notified by Adeos ?). > > xnshadow_harden() is called whenever a thread running in secondary > mode (that is, running as a regular Linux thread, handled by the Linux > scheduler) is switching to primary mode (where it will run as a Xenomai > thread, handled by the Xenomai scheduler). Migrations occur for some system > calls. More precisely, the Xenomai skin system call tables associate a few > flags with each system call, and some of these flags cause migration of > the caller when it issues the system call. > > Each Xenomai user-space thread has two contexts, a regular Linux > thread context, and a Xenomai thread called the "shadow" thread. Both > contexts share the same stack and program counter, so that at any time, > at least one of the two contexts is seen as suspended by the scheduler > which handles it. > > Before xnshadow_harden is called, the Linux thread is running, and its > shadow is seen in suspended state with the XNRELAX bit by the Xenomai > scheduler. After xnshadow_harden, the Linux context is seen suspended > with INTERRUPTIBLE state by the Linux scheduler, and its shadow is seen as > running by the Xenomai scheduler. > > The migrating thread > > (nRT) is marked INTERRUPTIBLE and run by the Linux kernel > > wake_up_interruptible_sync() call. Is this thread actually run or does it > > merely put the thread in some Linux to-do list (I assumed the first case) ? 
> > Here, I am not sure, but it seems that when calling > wake_up_interruptible_sync the woken up task is put in the current CPU > runqueue, and this task (i.e. the gatekeeper), will not run until the > current thread (i.e. the thread running xnshadow_harden) marks itself as > suspended and calls schedule(). Maybe, marking the running thread as Depends on CONFIG_PREEMPT. If set, we get a preempt_schedule already here - and a switch if the prio of the woken-up task is higher. BTW, an easy way to trigger the current trouble is to remove the "_sync" from wake_up_interruptible. As I understand it, this _sync is just an optimisation hint for Linux to avoid needless scheduler runs. > suspended is not needed, since the gatekeeper may have a high priority, > and calling schedule() is enough. In any case, the woken-up thread does > not seem to be run immediately, so this rather looks like the second > case. > > Since in xnshadow_harden, the running thread marks itself as suspended > before running wake_up_interruptible_sync, the gatekeeper will run when > schedule() gets called, which in turn depends on the CONFIG_PREEMPT* > configuration. In the non-preempt case, the current thread will be > suspended and the gatekeeper will run when schedule() is explicitly > called in xnshadow_harden(). In the preempt case, schedule gets called > when the outermost spinlock is unlocked in wake_up_interruptible_sync(). > > > And how does it terminate: is only the system call migrated or is the > thread > > allowed to continue run (at a priority level equal to the Xenomai > > priority level) until it hits something of the Xenomai API (or trivially: > > explicitly go to RT using the API) ? > > I am not sure I follow you here. The usual case is that the thread will > remain in primary mode after the system call, but I think a system call > flag allows the other behaviour. So, if I understand the question > correctly, the answer is that it depends on the system call. 
> > > In that case, I expect the nRT thread to terminate with a schedule() > > call in the Xeno OS API code which deactivates the task so that it > > won't ever run in Linux context anymore. A top priority gatekeeper is > > in place as a software hook to catch Linux's attention right after > > that schedule(), which might otherwise schedule something else (and > > leave only interrupts for Xenomai to come back to life again). > > Here is the way I understand it. We have two threads, or rather two > "views" of the same thread, each with its own state. Switching from > secondary to primary mode, i.e. xnshadow_harden and the gatekeeper job, > means changing the two states at once. Since we can not do that, we need > an intermediate state. Since the intermediate state can not be the state > where the two threads are running (they share the same stack and > program counter), the intermediate state is a state where the two > threads are suspended, but another context needs to run: it is the > gatekeeper. > > > I have > > the impression that I cannot see this gatekeeper, nor the (n)RT > > threads using the ps command ?
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Jeroen Van den Keybus wrote: > Hello, > > > I'm currently not at a level to participate in your discussion. Although I'm > willing to supply you with stresstests, I would nevertheless like to learn > more from task migration as this debugging session proceeds. In order to do > so, please confirm the following statements or indicate where I went wrong. > I hope others may learn from this as well. > > xn_shadow_harden(): This is called whenever a Xenomai thread performs a > Linux (root domain) system call (notified by Adeos ?). xnshadow_harden() is called whenever a thread running in secondary mode (that is, running as a regular Linux thread, handled by the Linux scheduler) is switching to primary mode (where it will run as a Xenomai thread, handled by the Xenomai scheduler). Migrations occur for some system calls. More precisely, the Xenomai skin system call tables associate a few flags with each system call, and some of these flags cause migration of the caller when it issues the system call. Each Xenomai user-space thread has two contexts, a regular Linux thread context, and a Xenomai thread called the "shadow" thread. Both contexts share the same stack and program counter, so that at any time, at least one of the two contexts is seen as suspended by the scheduler which handles it. Before xnshadow_harden is called, the Linux thread is running, and its shadow is seen in suspended state with the XNRELAX bit by the Xenomai scheduler. After xnshadow_harden, the Linux context is seen suspended with INTERRUPTIBLE state by the Linux scheduler, and its shadow is seen as running by the Xenomai scheduler. The migrating thread > (nRT) is marked INTERRUPTIBLE and run by the Linux kernel > wake_up_interruptible_sync() call. Is this thread actually run or does it > merely put the thread in some Linux to-do list (I assumed the first case) ? Here, I am not sure, but it seems that when calling wake_up_interruptible_sync the woken up task is put in the current CPU runqueue, and this task (i.e. 
the gatekeeper), will not run until the current thread (i.e. the thread running xnshadow_harden) marks itself as suspended and calls schedule(). Maybe, marking the running thread as suspended is not needed, since the gatekeeper may have a high priority, and calling schedule() is enough. In any case, the woken-up thread does not seem to be run immediately, so this rather looks like the second case. Since in xnshadow_harden, the running thread marks itself as suspended before running wake_up_interruptible_sync, the gatekeeper will run when schedule() gets called, which in turn depends on the CONFIG_PREEMPT* configuration. In the non-preempt case, the current thread will be suspended and the gatekeeper will run when schedule() is explicitly called in xnshadow_harden(). In the preempt case, schedule gets called when the outermost spinlock is unlocked in wake_up_interruptible_sync(). > And how does it terminate: is only the system call migrated or is the thread > allowed to continue run (at a priority level equal to the Xenomai > priority level) until it hits something of the Xenomai API (or trivially: > explicitly go to RT using the API) ? I am not sure I follow you here. The usual case is that the thread will remain in primary mode after the system call, but I think a system call flag allows the other behaviour. So, if I understand the question correctly, the answer is that it depends on the system call. > In that case, I expect the nRT thread to terminate with a schedule() > call in the Xeno OS API code which deactivates the task so that it > won't ever run in Linux context anymore. A top priority gatekeeper is > in place as a software hook to catch Linux's attention right after > that schedule(), which might otherwise schedule something else (and > leave only interrupts for Xenomai to come back to life again). Here is the way I understand it. We have two threads, or rather two "views" of the same thread, each with its own state. 
Switching from secondary to primary mode, i.e. xnshadow_harden and the gatekeeper job, means changing the two states at once. Since we can not do that, we need an intermediate state. Since the intermediate state can not be the state where the two threads are running (they share the same stack and program counter), the intermediate state is a state where the two threads are suspended, but another context needs to run: it is the gatekeeper. > I have > the impression that I cannot see this gatekeeper, nor the (n)RT > threads using the ps command ? The gatekeeper and Xenomai user-space threads are regular Linux contexts, you can see them using the ps command. > > Is it correct to state that the current preemption issue is due to the > gatekeeper being invoked too soon ? Could someone knowing more about the > migration technology explain what exactly goes wrong ? Jan seems to have found such an issue here. I am not sure I understood what he wrote. But if the issue is due to CONFIG_PREEMPT, it explains
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Hello, I'm currently not at a level to participate in your discussion. Although I'm willing to supply you with stresstests, I would nevertheless like to learn more from task migration as this debugging session proceeds. In order to do so, please confirm the following statements or indicate where I went wrong. I hope others may learn from this as well. xn_shadow_harden(): This is called whenever a Xenomai thread performs a Linux (root domain) system call (notified by Adeos ?). The migrating thread (nRT) is marked INTERRUPTIBLE and run by the Linux kernel wake_up_interruptible_sync() call. Is this thread actually run or does it merely put the thread in some Linux to-do list (I assumed the first case) ? And how does it terminate: is only the system call migrated or is the thread allowed to continue run (at a priority level equal to the Xenomai priority level) until it hits something of the Xenomai API (or trivially: explicitly go to RT using the API) ? In that case, I expect the nRT thread to terminate with a schedule() call in the Xeno OS API code which deactivates the task so that it won't ever run in Linux context anymore. A top priority gatekeeper is in place as a software hook to catch Linux's attention right after that schedule(), which might otherwise schedule something else (and leave only interrupts for Xenomai to come back to life again). I have the impression that I cannot see this gatekeeper, nor the (n)RT threads using the ps command ? Is it correct to state that the current preemption issue is due to the gatekeeper being invoked too soon ? Could someone knowing more about the migration technology explain what exactly goes wrong ? Thanks, Jeroen.
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
> Hi, > > well, if I'm not totally wrong, we have a design problem in the > RT-thread hardening path. I dug into the crash Jeroen reported > and I'm > quite sure that this is the reason. > > So that's the bad news. The good one is that we can at least > work around > it by switching off CONFIG_PREEMPT for Linux (this implicitly means that > it's a 2.6-only issue). > > > But let's start with two assumptions my further analysis is > based on: > > [Xenomai] > o Shadow threads have only one stack, i.e. one context. If the > real-time part is active (this includes it is blocked on some > xnsynch object or delayed), the original Linux task must > NEVER EVER be > executed, even if it will immediately fall asleep again. That's > because the stack is in use by the real-time part at that time. > And this condition is checked in do_schedule_event() [1]. > > [Linux] > o A Linux task which has called > set_current_state() will > remain in the run-queue as long as it calls schedule() on its > own. Yes, you are right. Let's keep in mind the following piece of code. [*]

[code]
/* from sched.c::schedule() */
...
switch_count = &prev->nivcsw;
if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {   /* <-- MUST BE TRUE FOR A TASK TO BE REMOVED */
        switch_count = &prev->nvcsw;
        if (unlikely((prev->state & TASK_INTERRUPTIBLE) &&
                     unlikely(signal_pending(prev))))
                prev->state = TASK_RUNNING;
        else {
                if (prev->state == TASK_UNINTERRUPTIBLE)
                        rq->nr_uninterruptible++;
                deactivate_task(prev, rq);   /* <-- removing from the active queue */
        }
}
...
[/code]

On executing schedule(), the "current" (prev = current) task is not removed from the active queue in one of the following cases: [1] prev->state == 0, i.e. == TASK_RUNNING (since #define TASK_RUNNING 0); [2] add_preempt_count(PREEMPT_ACTIVE) has been called before calling schedule() from the task's context, i.e. from the context of the "current" task (prev = current in schedule()); [3] there is a pending signal for the "current" task. 
Keeping that in mind too, let's take a look at what happens in your "crash"-scenario. > ... > > 3) Interruption by some Linux IRQ. This may cause other > threads to become runnable as well, but the gatekeeper has > the highest prio and will therefore be the next. The problem is > that the rescheduling on Linux IRQ exit will PREEMPT our task > in xnshadow_harden(), it will NOT remove it from the Linux > run-queue. Right. But what actually happens is the following sequence of calls: ret_from_intr ---> resume_kernel ---> need_resched ---> sched.c::preempt_schedule_irq() ---> schedule() (**) As a result, schedule() is called indeed, but it does not execute the [*] code - the "current" task is not removed from the active queue. The reason is [2] (from the list above), and that's done in preempt_schedule_irq(). > And now we are in real troubles: The > gatekeeper will kick off our RT part which will take over the > thread's stack. As soon as the RT domain falls asleep and > Linux takes over again, it will continue our non-RT part as well! > Actually, this seems to be the reason for the panic in > do_schedule_event(). Without CONFIG_XENO_OPT_DEBUG > and this check, we will run both parts AT THE SAME > TIME now, thus violating my first assumption. The system gets > fatally corrupted. > > Well, I would be happy if someone can prove me wrong here. I'm afraid you are right. > The problem is that I don't see a solution because Linux does > not provide an atomic wake-up + schedule-out under > CONFIG_PREEMPT. I'm > currently considering a hack to remove the migrating Linux > thread manually from the run-queue, but this could easily break > the Linux scheduler. I have a "stupid" idea on top of my head, but I'd prefer to test it on my own first so as not to look like a complete idiot if it's totally wrong. Err... it's difficult to look more of an idiot than I already do? :o) > Jan -- Best regards, Dmitry Adamushko
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
> Hi, > > well, if I'm not totally wrong, we have a design problem in the > RT-thread hardening path. I dug into the crash Jeroen reported > and I'm > quite sure that this is the reason. > > So that's the bad news. The good one is that we can at least > work around > it by switching off CONFIG_PREEMPT for Linux (this implicitly means that > it's a 2.6-only issue). > > > But let's start with two assumptions my further analysis is > based on: > > [Xenomai] > o Shadow threads have only one stack, i.e. one context. If the > real-time part is active (this includes it is blocked on some > xnsynch object or delayed), the original Linux task must > NEVER EVER be > executed, even if it will immediately fall asleep again. That's > because the stack is in use by the real-time part at that time. > And this condition is checked in do_schedule_event() [1]. > > [Linux] > o A Linux task which has called > set_current_state() will > remain in the run-queue as long as it calls schedule() on its > own. Yes, you are right. Let's keep in mind the following piece of code. [*] [code] from sched.c::schedule() ... switch_count = &prev->nivcsw; if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) { <--- MUST BE TRUE FOR A TASK TO BE REMOVED switch_count = &prev->nvcsw; if (unlikely((prev->state & TASK_INTERRUPTIBLE) && unlikely(signal_pending(prev prev->state = TASK_RUNNING; else { if (prev->state == TASK_UNINTERRUPTIBLE) rq->nr_uninterruptible++; deactivate_task(prev, rq); <--- removing from the active queue } } ... [/code] On executing schedule(), a "current" (prev = current) task is not removed from the active queue in one of the following cases: [1] prev->state == 0, i.e. == TASK_RUNNING (since #define TASK_RUNNING 0); [2] add_preempt_count(PREEMPT_ACTIVE) has been called before calling schedule() from the task's context i.e. from the context of the "current" task (prev = current in schedule()); [3] there is a pending signal for the "current" task. 
Keeping that in mind too, let's take a look at what happens in your "crash"-scenario. > ... > > 3) Interruption by some Linux IRQ. This may cause other > threads to become runnable as well, but the gatekeeper has > the highest prio and will therefore be the next. The problem is > that the rescheduling on Linux IRQ exit will PREEMPT our task > in xnshadow_harden(), it will NOT remove it from the Linux > run-queue. Right. But what actually happens is the following sequence of calls: ret_from_intr ---> resume_kernel ---> need_resched ---> sched.c::preempt_schedule_irq() ---> schedule() (**) As a result, schedule() is called indeed but it does not execute the [*] code - the "current" task is not removed from the active queue. The reason is [2] (from the list above) and that's done in preempt_schedule_irq(). > And now we are in real troubles: The > gatekeeper will kick off our RT part which will take over the > thread's stack. As soon as the RT domain falls asleep and > Linux takes over again, it will continue our non-RT part as well! > Actually, this seems to be the reason for the panic in > do_schedule_event(). Without CONFIG_XENO_OPT_DEBUG > and this check, we will run both parts AT THE SAME > TIME now, thus violating my first assumption. The system gets > fatally corrupted. > > Well, I would be happy if someone can prove me wrong here. I'm afraid you are right. > The problem is that I don't see a solution because Linux does > not provide an atomic wake-up + schedule-out under > CONFIG_PREEMPT. I'm > currently considering a hack to remove the migrating Linux > thread manually from the run-queue, but this could easily break > the Linux scheduler. I have a "stupid" idea on top of my head but I'd prefer to test it on my own first so not to look as a complete idiot if it's totally wrong. Err... it's difficult to look more an idiot than I'm already? 
:o)

> Jan

--
Best regards,
Dmitry Adamushko

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
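The (**) call sequence above can also be sketched as a toy model: preempt_schedule_irq() brackets schedule() with PREEMPT_ACTIVE, so the removal branch is skipped even though the interrupted task already marked itself TASK_INTERRUPTIBLE. Again, these are made-up user-space names, not the kernel's:

```c
/* Toy model of why the (**) path never deactivates the preempted task.
 * Illustrative names only; this is not kernel code. */
#include <assert.h>
#include <stdbool.h>

#define TASK_INTERRUPTIBLE 1
#define PREEMPT_ACTIVE     0x10000000

static unsigned int preempt_count;   /* toy stand-in for the preempt counter */

struct toy_task { int state; bool on_runqueue; };

static void toy_schedule(struct toy_task *prev)
{
    /* the [*] removal code only runs without PREEMPT_ACTIVE */
    if (prev->state && !(preempt_count & PREEMPT_ACTIVE))
        prev->on_runqueue = false;   /* deactivate_task() */
}

/* What IRQ exit effectively does to the interrupted task. */
static void toy_preempt_schedule_irq(struct toy_task *interrupted)
{
    preempt_count += PREEMPT_ACTIVE; /* add_preempt_count(PREEMPT_ACTIVE) */
    toy_schedule(interrupted);       /* schedule(): removal skipped */
    preempt_count -= PREEMPT_ACTIVE; /* sub_preempt_count(PREEMPT_ACTIVE) */
}
```

So a TASK_INTERRUPTIBLE task preempted on IRQ exit stays runnable; only its own later call to schedule() takes it off the run-queue.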
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Hannes Mayer wrote:
> Jan Kiszka wrote:
> [...]
>> PS: Out of curiosity I also checked RTAI's migration mechanism in this
>> regard. It's similar except for the fact that it does the gatekeeper's
>> work in the Linux scheduler's tail (i.e. after the next context switch).
>> And RTAI seems to suffer from the very same race. So this is either a
>> fundamental issue - or I'm fundamentally wrong.
>
> Well, most of the stuff you guys talk about in this thread is still
> beyond my level, but out of curiosity I ported the SEM example to
> RTAI (see attached sem.c).
> I couldn't come up with something similar to rt_sem_inquire and
> rt_task_inquire in RTAI (in "void output(char c)")...
> Anyway, unless I have missed something else important while
> porting, the example runs flawlessly on RTAI 3.3test3 (kernel 2.6.15).

My claim about the RTAI race is based on a quick code analysis and somewhat
outdated information about its core design. I haven't tried any code to
crash it, and I guess it will take a slightly different test design to
trigger the issue there.

As soon as someone can follow my reasoning and confirm it (don't mind if
you did not understand it - I didn't either two days ago, this is quite
heavy stuff), I will inform Paolo about this potential problem.

Jan

signature.asc
Description: OpenPGP digital signature
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Jan Kiszka wrote:
[...]
PS: Out of curiosity I also checked RTAI's migration mechanism in this
regard. It's similar except for the fact that it does the gatekeeper's
work in the Linux scheduler's tail (i.e. after the next context switch).
And RTAI seems to suffer from the very same race. So this is either a
fundamental issue - or I'm fundamentally wrong.

Well, most of the stuff you guys talk about in this thread is still
beyond my level, but out of curiosity I ported the SEM example to
RTAI (see attached sem.c).
I couldn't come up with something similar to rt_sem_inquire and
rt_task_inquire in RTAI (in "void output(char c)")...
Anyway, unless I have missed something else important while
porting, the example runs flawlessly on RTAI 3.3test3 (kernel 2.6.15).

Best regards,
Hannes.

/* TEST_SEM.C ported to RTAI3.3 */

/* Note: the original header names were eaten by the list archive;
   the following are the includes the code appears to need. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <sched.h>
#include <fcntl.h>
#include <unistd.h>
#include <values.h>
#include <sys/mman.h>
#include <rtai_lxrt.h>
#include <rtai_sem.h>

int fd, err;
int t0end = 1;
int t1end = 1;
SEM *s, *m;
float tmax = 1.0e9;

#define CHECK(arg) check(arg, __LINE__)

int check(int r, int n)
{
    if (r != 0)
        fprintf(stderr, "L%d: %s.\n", n, strerror(-r));
    return r;
}

void output(char c)
{
    static int cnt = 0;
    int n;
    char buf[2];

    buf[0] = c;
    if (cnt == 80) {
        buf[1] = '\n';
        n = 2;
        cnt = 0;
    } else {
        n = 1;
        cnt++;
    }
/*
    CHECK(rt_sem_inquire(&m, &seminfo));
    if (seminfo.count != 0) {
        RT_TASK_INFO taskinfo;
        CHECK(rt_task_inquire(NULL, &taskinfo));
        fprintf(stderr, "ALERT: No lock! (count=%ld) Offending task: %s\n",
                seminfo.count, taskinfo.name);
    }
*/
    if (write(fd, buf, n) != n) {
        fprintf(stderr, "File write error.\n");
        CHECK( rt_sem_signal(s) );
    }
}

static void *task0(void *args)
{
    RT_TASK *handler;

    if (!(handler = rt_task_init_schmod(nam2num("T0HDLR"), 0, 0, 0,
                                        SCHED_FIFO, 0xF))) {
        printf("CANNOT INIT HANDLER TASK > T0HDLR <\n");
        exit(1);
    }
    rt_allow_nonroot_hrt();
    mlockall(MCL_CURRENT | MCL_FUTURE);
    rt_make_hard_real_time();
    t0end = 0;
    rt_task_use_fpu(handler, TASK_USE_FPU);
    while ( !t0end ) {
        rt_sleep((float)rand()*tmax/(float)RAND_MAX);
        rt_sem_wait(m);
        output('0');
        CHECK( rt_sem_signal(m) );
    }
    rt_make_soft_real_time();
    rt_task_delete(handler);
    return 0;
}

static void *task1(void *args)
{
    RT_TASK *handler;

    if (!(handler = rt_task_init_schmod(nam2num("T1HDLR"), 0, 0, 0,
                                        SCHED_FIFO, 0xF))) {
        printf("CANNOT INIT HANDLER TASK > T1HDLR <\n");
        exit(1);
    }
    rt_allow_nonroot_hrt();
    mlockall(MCL_CURRENT | MCL_FUTURE);
    rt_make_hard_real_time();
    t1end = 0;
    rt_task_use_fpu(handler, TASK_USE_FPU);
    while ( !t1end ) {
        rt_sleep((float)rand()*tmax/(float)RAND_MAX);
        rt_sem_wait(m);
        output('1');
        CHECK( rt_sem_signal(m) );
    }
    rt_make_soft_real_time();
    rt_task_delete(handler);
    return 0;
}

void sighandler(int arg)
{
    CHECK(rt_sem_signal(s));
}

int main(int argc, char *argv[])
{
    RT_TASK *maint; //, *squaretask;
    int t0, t1;

    if ((fd = open("dump.txt", O_CREAT | O_TRUNC | O_WRONLY)) < 0)
        fprintf(stderr, "File open error.\n");
    else {
        if (argc == 2) {
            tmax = atof(argv[1]);
            if (tmax == 0.0)
                tmax = 1.0e7;
        }
        rt_set_oneshot_mode();
        start_rt_timer(0);
        m = rt_sem_init(nam2num("MSEM"), 1);
        s = rt_sem_init(nam2num("SSEM"), 0);
        signal(SIGINT, sighandler);
        if (!(maint = rt_task_init(nam2num("MAIN"), 1, 0, 0))) {
            printf("CANNOT INIT MAIN TASK > MAIN <\n");
            exit(1);
        }
        t0 = rt_thread_create(task0, NULL, 1); // create thread
        while (t0end) { // wait until thread went to hard real time
            usleep(10);
        }
        t1 = rt_thread_create(task1, NULL, 1); // create thread
        while (t1end) { // wait until thread went to hard real time
            usleep(10);
        }
        printf("Running for %.2f seconds.\n", (float)MAXLONG/1.0e9);
        rt_sem_wait(s);
        signal(SIGINT, SIG_IGN);
        t0end = 1;
        t1end = 1;
        printf("TEST ENDS\n");
        CHECK( rt_thread_join(t0) );
        CHECK( rt_thread_join(t1) );
        CHECK(rt_sem_delete(s));
        CHECK(rt_sem_delete(m));
        CHECK( rt_task_delete(maint) );
        close(fd);
    }
    return 0;
}
[Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
Hi,

well, if I'm not totally wrong, we have a design problem in the RT-thread
hardening path. I dug into the crash Jeroen reported and I'm quite sure
that this is the reason.

So that's the bad news. The good one is that we can at least work around
it by switching off CONFIG_PREEMPT for Linux (this implicitly means that
it's a 2.6-only issue).

@Jeroen: Did you verify that your setup also works fine without
CONFIG_PREEMPT?

But let's start with two assumptions my further analysis is based on:

[Xenomai]
 o Shadow threads have only one stack, i.e. one context. If the real-time
   part is active (this includes being blocked on some xnsynch object or
   delayed), the original Linux task must NEVER EVER be executed, even if
   it would immediately fall asleep again. That's because the stack is in
   use by the real-time part at that time. And this condition is checked
   in do_schedule_event() [1].

[Linux]
 o A Linux task which has called set_current_state() will remain in the
   run-queue as long as it calls schedule() on its own. This means that it
   can be preempted (if CONFIG_PREEMPT is set) between set_current_state()
   and schedule() and then even be resumed again. Only the explicit call
   of schedule() will trigger deactivate_task(), which will in turn remove
   current from the run-queue.

Ok, if this is true, let's have a look at xnshadow_harden(): After
grabbing the gatekeeper sem and putting itself in gk->thread, a task
going for RT then marks itself TASK_INTERRUPTIBLE and wakes up the
gatekeeper [2]. This does not include a Linux reschedule due to the _sync
version of wake_up_interruptible. What can happen now?

1) No interruption until we have called schedule() [3]. All fine as we
will not be removed from the run-queue before the gatekeeper starts
kicking our RT part, thus no conflict in using the thread's stack.

2) Interruption by an RT IRQ. This would just delay the path described
above, even if some RT threads get executed. Once they are finished, we
continue in xnshadow_harden() - given that the RT part does not trigger
the following case:

3) Interruption by some Linux IRQ. This may cause other threads to become
runnable as well, but the gatekeeper has the highest prio and will
therefore be the next. The problem is that the rescheduling on Linux IRQ
exit will PREEMPT our task in xnshadow_harden(), it will NOT remove it
from the Linux run-queue. And now we are in real trouble: The gatekeeper
will kick off our RT part, which will take over the thread's stack. As
soon as the RT domain falls asleep and Linux takes over again, it will
continue our non-RT part as well! Actually, this seems to be the reason
for the panic in do_schedule_event(). Without CONFIG_XENO_OPT_DEBUG and
this check, we will run both parts AT THE SAME TIME now, thus violating
my first assumption. The system gets fatally corrupted.

Well, I would be happy if someone can prove me wrong here. The problem is
that I don't see a solution because Linux does not provide an atomic
wake-up + schedule-out under CONFIG_PREEMPT. I'm currently considering a
hack to remove the migrating Linux thread manually from the run-queue,
but this could easily break the Linux scheduler.

Jan

PS: Out of curiosity I also checked RTAI's migration mechanism in this
regard. It's similar except for the fact that it does the gatekeeper's
work in the Linux scheduler's tail (i.e. after the next context switch).
And RTAI seems to suffer from the very same race. So this is either a
fundamental issue - or I'm fundamentally wrong.

[1] http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/nucleus/shadow.c?v=SVN-trunk#L1573
[2] http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/nucleus/shadow.c?v=SVN-trunk#L461
[3] http://www.rts.uni-hannover.de/xenomai/lxr/source/ksrc/nucleus/shadow.c?v=SVN-trunk#L481

signature.asc
Description: OpenPGP digital signature
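The fatal interleaving described above can be condensed into a toy model: the migrating task marks itself TASK_INTERRUPTIBLE and wakes the gatekeeper, an IRQ-exit preemption (with PREEMPT_ACTIVE) leaves it on the Linux run-queue, and the gatekeeper then hands the shared stack to the RT part. Everything here is an illustrative reconstruction, not the actual Xenomai code:

```c
/* Toy reconstruction of the xnshadow_harden() race under CONFIG_PREEMPT.
 * All names and fields are made up for illustration. */
#include <assert.h>
#include <stdbool.h>

#define TASK_INTERRUPTIBLE 1
#define PREEMPT_ACTIVE     0x10000000

struct shadow {
    int  linux_state;
    bool on_linux_runqueue;
    bool stack_owned_by_rt;   /* set once the gatekeeper kicks the RT part */
};

/* The [*] removal code: skipped when PREEMPT_ACTIVE is set. */
static void toy_schedule(struct shadow *s, unsigned int preempt_count)
{
    if (s->linux_state && !(preempt_count & PREEMPT_ACTIVE))
        s->on_linux_runqueue = false;
}

/* Hardening sequence, interrupted before the voluntary schedule().
 * Returns true if both contexts may now run on the single stack. */
static bool harden_with_preemption(struct shadow *s)
{
    s->linux_state = TASK_INTERRUPTIBLE;   /* set_current_state() */
    /* wake_up_interruptible_sync(gatekeeper) would happen here */

    /* Linux IRQ fires before schedule(): IRQ exit preempts us via
     * preempt_schedule_irq(), i.e. with PREEMPT_ACTIVE set. */
    toy_schedule(s, PREEMPT_ACTIVE);

    /* gatekeeper (highest prio) runs next and starts the RT part */
    s->stack_owned_by_rt = true;

    /* violation of the one-stack assumption */
    return s->on_linux_runqueue && s->stack_owned_by_rt;
}
```

In the model the task is still runnable for Linux while the RT part owns its stack, which is exactly the condition do_schedule_event() panics on.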