Re: [Xenomai-core] [BUG] module usage counter of xenomai native corrupted (version 2.2.0 and 2.2.5)
Jan Kiszka wrote:
> Thomas Wiedemann wrote:
>> Hi,
>>
>> there seems to be a bug in rt_task_create(). When no more memory is
>> available, the module usage counter of xeno_native is decremented. I
>> guess it is not incremented before, however, so the counter reaches 0
>> and then wraps to a negative number. It is therefore not possible to
>> remove the module.
>>
>> I appended a small program to demonstrate this. It simply eats up all
>> memory from xenomai by registering as many mutexes as possible, and
>> then tries to execute rt_task_create(), which fails. When started
>> again, the bug occurs at rt_task_shadow(), as the mutexes have never
>> been deleted.
>>
>> Compile with
>>
>>   gcc -O2 -Wall `xeno-config --xeno-cflags` `xeno-config --xeno-ldflags` -lrtdm -lnative -o rttest rttest.c
>>
>> then simply run it, and watch the output of lsmod before and after.
>>
>> Tested with xenomai 2.2.{0,5} and linux 2.6.17.8, modules loaded:
>> xeno_native and xeno_nucleus.
>
> Confirmed. Requires a closer look to find the leak path.

Here is what happens: the task is created with the XNSHADOW bit, and
destroyed before it was xnshadow_mapped, but the deletion hook calls
xnshadow_unmap because the task has the XNSHADOW bit. And xnshadow_unmap
decrements the module count.

--
Gilles Chanteperdrix
_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
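The leak path described above can be modelled in a few lines of plain C. This is only a toy sketch under stated assumptions: every `toy_*` name and the `TOY_XNSHADOW` constant are made up for illustration, and none of this is the actual nucleus code. It shows how a deletion hook that looks only at the flag drops a reference that was never taken.

```c
#include <assert.h>

/* Hypothetical stand-ins, not Xenomai code: a toy model of the leak path. */
#define TOY_XNSHADOW 0x1u

struct toy_module { int refcnt; };
struct toy_thread { unsigned flags; };

/* Creation only tags the thread as a shadow; no reference is taken yet. */
static void toy_thread_init(struct toy_thread *t)
{
	t->flags = TOY_XNSHADOW;
}

/* Mapping is the step that takes the module reference. */
static void toy_shadow_map(struct toy_module *m, struct toy_thread *t)
{
	assert(t->flags & TOY_XNSHADOW);
	m->refcnt++;
}

/* Deletion hook: keys on the flag alone, as described in the thread. */
static void toy_delete_hook(struct toy_module *m, struct toy_thread *t)
{
	if (t->flags & TOY_XNSHADOW)
		m->refcnt--;	/* unbalanced if toy_shadow_map() never ran */
	t->flags = 0;
}

/* Returns the counter after a creation that fails before mapping. */
static int toy_failed_create(struct toy_module *m)
{
	struct toy_thread t;

	toy_thread_init(&t);
	/* ... allocation fails here, so toy_shadow_map() is skipped ... */
	toy_delete_hook(m, &t);
	return m->refcnt;
}
```

Starting from a counter of 0, one call to `toy_failed_create()` leaves it negative, whereas an init, map, delete sequence leaves it balanced.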
Re: [Xenomai-core] [BUG] module usage counter of xenomai native corrupted (version 2.2.0 and 2.2.5)
Gilles Chanteperdrix wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Thomas Wiedemann wrote:
>>>> [...]
>>>
>>> Confirmed. Requires a closer look to find the leak path.
>>
>> Here is what happens: the task is created with the XNSHADOW bit, and
>> destroyed before it was xnshadow_mapped, but the deletion hook calls
>> xnshadow_unmap because the task has the XNSHADOW bit. And
>> xnshadow_unmap decrements the module count.
>
> Here is an untested quick fix.
>
> Index: ksrc/nucleus/shadow.c
> ===================================================================
> --- ksrc/nucleus/shadow.c	(revision 1930)
> +++ ksrc/nucleus/shadow.c	(working copy)
> @@ -888,6 +888,9 @@
>  
>  	p = xnthread_archtcb(thread)->user_task;	/* May be != current */
>  
> +	if (!xnshadow_thrptd(p))
> +		return;
> +
>  	magic = xnthread_get_magic(thread);
>  
>  	for (muxid = 0; muxid < XENOMAI_MUX_NR; muxid++) {

Nope, shows unwanted side effects, probably because xnshadow_thrptd is
already NULL'ed in do_taskexit_event. Looks like it takes an extra flag,
no?
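One way to read the "extra flag" idea is sketched below in plain C. It is an assumption about what such a flag could look like, not a proposed patch: the `toy_*` types and the `mapped` field are hypothetical and do not exist in the nucleus. The point is only that the unmap path tests a dedicated flag, so it no longer depends on a back-pointer that the exit path may already have cleared.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; none of these names exist in Xenomai. */
struct toy_module { int refcnt; };

struct toy_thread {
	void *thrptd;	/* back-pointer, cleared early on task exit */
	bool mapped;	/* extra flag: set only once mapping succeeded */
};

/* Mapping takes the module reference and records that it did so. */
static void toy_map(struct toy_module *m, struct toy_thread *t, void *task)
{
	t->thrptd = task;
	t->mapped = true;
	m->refcnt++;
}

/* Models the exit path clearing the back-pointer before unmap runs. */
static void toy_task_exit(struct toy_thread *t)
{
	t->thrptd = NULL;
}

/* Unmap tests the flag, so it is immune to thrptd already being NULL. */
static void toy_unmap(struct toy_module *m, struct toy_thread *t)
{
	if (!t->mapped)
		return;		/* never mapped: nothing to release */
	t->mapped = false;
	m->refcnt--;
}
```

With this shape, unmapping a thread that was never mapped is a no-op, and unmapping after `toy_task_exit()` still releases the reference exactly once.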
Re: [Xenomai-core] Buildbot: Failure building Xenomai kernel for TQM860
Wolfgang Grandegger wrote:
> Hi Niklaus,
>
> I just compiled my Linux 2.4 kernel for TQM860L with the latest
> revision of Xenomai and I cannot reproduce your problem. In your linker
> path there are no Xenomai objects. How come? Do you use --arch=ppc with
> prepare_kernel (--arch=powerpc is not valid any more for the ppc tree)?

The problem pops up with CONFIG_XENO_OPT_SCALABLE_SCHED. The definition
of xnlogerr is not visible to queue.h, namely to the inline function
getmlq.

Here is a possible solution at XENO_ASSERT level.

Jan

Index: include/nucleus/assert.h
===================================================================
--- include/nucleus/assert.h	(revision 1930)
+++ include/nucleus/assert.h	(working copy)
@@ -27,7 +27,7 @@
 #define XENO_ASSERT(subsystem,cond,action)  do { \
     if (unlikely(CONFIG_XENO_OPT_DEBUG_##subsystem > 0 && !(cond))) { \
         xnarch_trace_panic_freeze(); \
-        xnlogerr("assertion failed at %s:%d (%s)\n", __FILE__, __LINE__, (#cond)); \
+        xnarch_logerr("assertion failed at %s:%d (%s)\n", __FILE__, __LINE__, (#cond)); \
         xnarch_trace_panic_dump(); \
         action; \
     } \
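The idea behind the change above, as I read it, is that an assert macro used from inline functions in low-level headers should name the low-level logger directly instead of an alias that is only defined in another header. A minimal sketch in plain C, with hypothetical names (`toy_arch_logerr`, `TOY_ASSERT`, `toy_getq`) standing in for the real ones:

```c
#include <stdio.h>

/* Counts failed assertions so callers can observe them. */
static int toy_failures;

/* Hypothetical low-level logger, playing the role of xnarch_logerr(). */
static void toy_arch_logerr(const char *file, int line, const char *cond)
{
	toy_failures++;
	fprintf(stderr, "assertion failed at %s:%d (%s)\n", file, line, cond);
}

/* The assert macro names the low-level logger directly, so it does not
   rely on a convenience alias defined in some other header. */
#define TOY_ASSERT(enabled, cond, action) do {			\
	if ((enabled) > 0 && !(cond)) {				\
		toy_arch_logerr(__FILE__, __LINE__, #cond);	\
		action;						\
	}							\
} while (0)

/* Inline helper in the style of a queue accessor that asserts its input. */
static inline int toy_getq(int prio, int nqueues)
{
	TOY_ASSERT(1, prio >= 0 && prio < nqueues, return -1);
	return prio;
}
```

A header that defines only `toy_getq()` now needs nothing beyond the logger declaration, whatever order the other headers are included in.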
Re: [Xenomai-core] Buildbot: Failure building Xenomai kernel for TQM860
On Friday, 8 December 2006 13:40, Wolfgang Grandegger wrote:
> Hi Niklaus,
>
> I just compiled my Linux 2.4 kernel for TQM860L with the latest
> revision of Xenomai and I cannot reproduce your problem. In your linker
> path there are no Xenomai objects. How come? Do you use --arch=ppc with
> prepare_kernel (--arch=powerpc is not valid any more for the ppc tree)?

As seen in http://ngiger.dyndns.org:8011/tqm_f/builds/30/step-prep_bb/0

  Exec in /var/buildbot/slave/tqm_f/xenomai:
  scripts/prepare-kernel.sh --arch=ppc --adeos=ksrc/arch/powerpc/patches/adeos-ipipe-2.4.25-*.patch --linux=/var/buildbot/slave/tqm_f/linux

Then in http://ngiger.dyndns.org:8011/tqm_f/builds/30/step-cfg_kernel/0

  /var/buildbot/scripts/configure.sh /var/buildbot/configs/TQM860L_defconfig CROSS_COMPILE=powerpc-860-linux-gnu- ARCH=ppc

and finally the build with

  make --jobs=4 uImage modules

The http logs also show all environment variables (e.g. CROSS_COMPILE,
PATH). You can navigate between the different builds and build steps on
http://ngiger.dyndns.org:8011/. I tried to make all steps transparent to
observers, so that they can verify every build step themselves.

If you are unable to reach this site please tell me so. I updated my
apache2 server and since then the proxy/reverse proxy from
http://ngiger.dyndns.org/buildbot to http://ngiger.dyndns.org:8011/ does
not work anymore (but that is another problem I am investigating).

I did not change my Buildbot master setup since December 4. The
prepare_slave.rb was updated on December 3. I updated the
TQM860L_defconfig today to enable more CAN drivers etc., but that didn't
change the error.

Build 28 (revision 1920) completed without any problem on December 6,
and the build on December 7 (revision 1930). All these builds are full
builds, meaning that all the build directories get removed before
starting prepare_kernel.sh.

Did you also do a clean rebuild?

Is there a missing include somewhere? I see that
include/xenomai/nucleus/pod.h has

  #define xnlogerr(fmt,args...) xnarch_logerr(fmt , ##args)

But grep finds xnlogerr in the following object files:

  kernel/xenomai/nucleus/xeno_nucleus.o
  kernel/xenomai/nucleus/built-in.o

> Wolfgang.
>
> Niklaus Giger wrote:
>> Hi
>>
>> I get an error building the TQM860 image (2.4 based kernel). Last
>> successful build was with revision 1920 (didn't build between).
>>
>> Log is available under
>> http://ngiger.dyndns.org:8011/tqm_f/builds/30/step-mk_kernel/1

Best regards

--
Niklaus Giger
Re: [Xenomai-core] [BUG] module usage counter of xenomai native corrupted (version 2.2.0 and 2.2.5)
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> [...]
>>> Here is an untested quick fix.
>>> [...]
>>
>> Nope, shows unwanted side effects, probably because xnshadow_thrptd is
>> already NULL'ed in do_taskexit_event. Looks like it takes an extra
>> flag, no?
>
> Setting xnshadow_thrptd to NULL in do_taskexit_event does not seem to
> be that useful. Here comes version 2.
>
> Index: ksrc/nucleus/shadow.c
> ===================================================================
> --- ksrc/nucleus/shadow.c	(revision 1930)
> +++ ksrc/nucleus/shadow.c	(working copy)
> @@ -888,6 +888,9 @@
>  
>  	p = xnthread_archtcb(thread)->user_task;	/* May be != current */
>  
> +	if (!xnshadow_thrptd(p))
> +		return;
> +
>  	magic = xnthread_get_magic(thread);
>  
>  	for (muxid = 0; muxid < XENOMAI_MUX_NR; muxid++) {
> @@ -1639,8 +1642,6 @@
>  	xnshadow_relax(0);
>  
>  	xnlock_get_irqsave(&nklock, s);
> -	/* Prevent wakeup call from xnshadow_unmap(). */
> -	xnshadow_thrptd(p) = NULL;
>  	xnthread_archtcb(thread)->user_task = NULL;
>  	/* xnpod_delete_thread() -> hook -> xnshadow_unmap(). */
>  	xnpod_delete_thread(thread);

Can't comment on the correctness of the second hunk, but it
unfortunately doesn't change the situation that the test case no longer
terminates with the first hunk applied. May look like a trivial issue -
but it isn't. :-
[Xenomai-core] Re: [Adeos-main] Differences between IPIPE v1.5 and v1.6
On Thu, 2006-12-07 at 10:59 +0100, Wolfgang Grandegger wrote:
> Hi Philippe,
>
> what are the major differences between the ADEOS-IPIPE patch versions
> v1.5 and v1.6, apart from support for the new genirq layer? I noticed
> that the arch-specific files ipipe-core.c and ipipe-root.c have been
> merged into ipipe.c.

Relying on the genirq layer, and the related changes it implies on the
flow handling in the I-pipe code, has motivated the version change. The
rest is x86 specific, and does not include radical changes (e.g. a few
more spinlocks ironed in the legacy driver code).

> Wolfgang.

--
Philippe.
Re: [Xenomai-core] Re: [patch] memory barriers in intr.c :: xnintr_lock/unlock()
On Thu, 2006-12-07 at 00:03 +0100, Jan Kiszka wrote:
> Dmitry Adamushko wrote:
>> Hello,
>>
>> following the recent discussion with Jan, here is a patch that aims at
>> allowing xnintr_lock/unlock to actually do what they were supposed to
>> do in the first place.
>>
>> [...]
>>
>> --- xenomai/ksrc/nucleus/intr-old.c	2006-11-12 00:17:56.0 +0100
>> +++ xenomai/ksrc/nucleus/intr.c	2006-11-12 00:22:15.0 +0100
>> @@ -135,12 +135,14 @@ static inline void xnintr_shirq_lock(xni
>>  {
>>  #ifdef CONFIG_SMP
>>  	xnarch_atomic_inc(&shirq->active);
>> +	xnarch_memory_barrier();
>>  #endif
>>  }
>>  
>>  static inline void xnintr_shirq_unlock(xnintr_shirq_t *shirq)
>>  {
>>  #ifdef CONFIG_SMP
>> +	xnarch_memory_barrier();
>>  	xnarch_atomic_dec(&shirq->active);
>>  #endif
>>  }
>
> As Dmitry and I are still a bit undecided about how to best evolve such
> RCU locks but still face this SMP bug in the current code, we are now
> suggesting to merge the patch above as-is for 2.3 - before things get
> lost for the release.

Ack. It's in my patch queue. Given the changes, the risk for regression
is zero and the situation could only improve with those, so this is
going to be applied last.

> Jan

--
Philippe.
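For readers more used to standard C, the barrier placement in the patch above can be approximated with C11 atomics. This is a sketch under assumptions, not the nucleus implementation: `toy_shirq` and the `toy_*` functions are hypothetical, and mapping `xnarch_memory_barrier()` to a sequentially consistent fence is my own choice of analogue.

```c
#include <stdatomic.h>

/* Hypothetical analogue of the shirq "active" counter, in C11 atomics. */
struct toy_shirq {
	atomic_int active;
};

static inline void toy_shirq_lock(struct toy_shirq *s)
{
	/* Relaxed increment, then a full fence: the counter update is
	   ordered before any later access to the protected data. */
	atomic_fetch_add_explicit(&s->active, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
}

static inline void toy_shirq_unlock(struct toy_shirq *s)
{
	/* Full fence, then relaxed decrement: accesses to the protected
	   data are ordered before the counter drops. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_fetch_sub_explicit(&s->active, 1, memory_order_relaxed);
}

/* Reads the number of sections currently inside lock/unlock. */
static inline int toy_shirq_readers(struct toy_shirq *s)
{
	return atomic_load_explicit(&s->active, memory_order_acquire);
}
```

In a scheme of this kind, a party that wants to modify the protected data would presumably wait until `toy_shirq_readers()` returns 0; the fences are what make that observation meaningful on SMP.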
Re: [Xenomai-core] [BUG] module usage counter of xenomai native corrupted (version 2.2.0 and 2.2.5)
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Index: ksrc/nucleus/shadow.c
>> [...]
>
> Can't comment on the correctness of the second hunk, but it
> unfortunately doesn't change the situation that the test case no longer
> terminates with the first hunk applied. May look like a trivial issue -
> but it isn't. :-

Indeed. And xnshadow_thrptd(current) == NULL is used by xnpod_schedule,
so the patch is probably completely incorrect.

--
Gilles Chanteperdrix