Re: [Xenomai-core] kernel threads crash
Okay

/Jesper

On 2011-04-19 11:30, Philippe Gerum wrote:
> On Tue, 2011-04-19 at 11:29 +0200, Philippe Gerum wrote:
>> On Tue, 2011-04-19 at 10:42 +0200, Gilles Chanteperdrix wrote:
>>> Philippe Gerum wrote:
>>>> On Tue, 2011-04-19 at 09:58 +0200, Jesper Christensen wrote:
>>>>> Great thanks, but i can't help wondering if the problems i'm seeing are
>>>>> related to some of my userspace programs using fp.
>>>>
>>>> I don't think so. The switchtest program exercises the FPU hardware in
>>>> a certain way to make sure it is available in real-time mode from kernel
>>>> space (which is an utterly crappy legacy, but we will have to deal with
>>>> it until Xenomai 3.x). As far as I can see from your .config, you can't
>>>> have such support, so switchtest was basically trying to test an
>>>> inexistent feature.
>>>
>>> In fact, switchtest checks whether Xenomai FPU switch routines work when
>>> the Linux kernel itself uses FPU in kernel-space. Currently, the only place
>>> where this happens is in the RAID code: x86 uses mmx/sse, and some
>>> PowerPCs use altivec. Some powerpc also fix unaligned accesses to floating
>>> point data in kernel-space, I do not know if this may interfere, which
>>> is why the powerpc code is compiled even without RAID.
>>
>> AFAICS, fp_regs_set() on ppc is issuing a load float instruction in
>> kernel space which could be unaligned, and therefore trap. Looking at
>> the .config for the target system, hw FPU support is disabled in the
>> alignment code, so basically, this would beget a nop.
>
> A nop in fixing the issue, I mean.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] kernel threads crash
On Tue, 2011-04-19 at 11:29 +0200, Philippe Gerum wrote:
> On Tue, 2011-04-19 at 10:42 +0200, Gilles Chanteperdrix wrote:
> > Philippe Gerum wrote:
> > > On Tue, 2011-04-19 at 09:58 +0200, Jesper Christensen wrote:
> > >> Great thanks, but i can't help wondering if the problems i'm seeing are
> > >> related to some of my userspace programs using fp.
> > >
> > > I don't think so. The switchtest program exercises the FPU hardware in
> > > a certain way to make sure it is available in real-time mode from kernel
> > > space (which is an utterly crappy legacy, but we will have to deal with
> > > it until Xenomai 3.x). As far as I can see from your .config, you can't
> > > have such support, so switchtest was basically trying to test an
> > > inexistent feature.
> >
> > In fact, switchtest checks whether Xenomai FPU switch routines work when
> > the Linux kernel itself uses FPU in kernel-space. Currently, the only place
> > where this happens is in the RAID code: x86 uses mmx/sse, and some
> > PowerPCs use altivec. Some powerpc also fix unaligned accesses to floating
> > point data in kernel-space, I do not know if this may interfere, which
> > is why the powerpc code is compiled even without RAID.
>
> AFAICS, fp_regs_set() on ppc is issuing a load float instruction in
> kernel space which could be unaligned, and therefore trap. Looking at
> the .config for the target system, hw FPU support is disabled in the
> alignment code, so basically, this would beget a nop.

A nop in fixing the issue, I mean.

-- 
Philippe.
Re: [Xenomai-core] kernel threads crash
On Tue, 2011-04-19 at 10:42 +0200, Gilles Chanteperdrix wrote:
> Philippe Gerum wrote:
> > On Tue, 2011-04-19 at 09:58 +0200, Jesper Christensen wrote:
> >> Great thanks, but i can't help wondering if the problems i'm seeing are
> >> related to some of my userspace programs using fp.
> >
> > I don't think so. The switchtest program exercises the FPU hardware in
> > a certain way to make sure it is available in real-time mode from kernel
> > space (which is an utterly crappy legacy, but we will have to deal with
> > it until Xenomai 3.x). As far as I can see from your .config, you can't
> > have such support, so switchtest was basically trying to test an
> > inexistent feature.
>
> In fact, switchtest checks whether Xenomai FPU switch routines work when
> the Linux kernel itself uses FPU in kernel-space. Currently, the only place
> where this happens is in the RAID code: x86 uses mmx/sse, and some
> PowerPCs use altivec. Some powerpc also fix unaligned accesses to floating
> point data in kernel-space, I do not know if this may interfere, which
> is why the powerpc code is compiled even without RAID.

AFAICS, fp_regs_set() on ppc is issuing a load float instruction in
kernel space which could be unaligned, and therefore trap. Looking at
the .config for the target system, hw FPU support is disabled in the
alignment code, so basically, this would beget a nop.

-- 
Philippe.
Re: [Xenomai-core] kernel threads crash
Philippe Gerum wrote:
> On Tue, 2011-04-19 at 09:58 +0200, Jesper Christensen wrote:
>> Great thanks, but i can't help wondering if the problems i'm seeing are
>> related to some of my userspace programs using fp.
>
> I don't think so. The switchtest program exercises the FPU hardware in
> a certain way to make sure it is available in real-time mode from kernel
> space (which is an utterly crappy legacy, but we will have to deal with
> it until Xenomai 3.x). As far as I can see from your .config, you can't
> have such support, so switchtest was basically trying to test an
> inexistent feature.

In fact, switchtest checks whether Xenomai FPU switch routines work when
the Linux kernel itself uses FPU in kernel-space. Currently, the only place
where this happens is in the RAID code: x86 uses mmx/sse, and some
PowerPCs use altivec. Some powerpc also fix unaligned accesses to floating
point data in kernel-space, I do not know if this may interfere, which
is why the powerpc code is compiled even without RAID.

-- 
Gilles.
Re: [Xenomai-core] kernel threads crash
On Tue, 2011-04-19 at 09:58 +0200, Jesper Christensen wrote:
> Great thanks, but i can't help wondering if the problems i'm seeing are
> related to some of my userspace programs using fp.

I don't think so. The switchtest program exercises the FPU hardware in
a certain way to make sure it is available in real-time mode from kernel
space (which is an utterly crappy legacy, but we will have to deal with
it until Xenomai 3.x). As far as I can see from your .config, you can't
have such support, so switchtest was basically trying to test an
inexistent feature.

> /Jesper
>
> On 2011-04-19 09:39, Philippe Gerum wrote:
> > On Tue, 2011-04-19 at 09:26 +0200, Jesper Christensen wrote:
> >> If i run switchtest i get the following output:
> >
> > If still talking about the cpci6200, this patch should apply:
> > http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=3d6fa118ef282c60dfeb0e690a579e8357bb7d13
> >
> >> [root@slot6 /bin]# switchtest
> >> == Testing FPU check routines...
> >> r0: 1 != 2
> >> r1: 1 != 2
> >> r2: 1 != 2
> >> r3: 1 != 2
> >> r4: 1 != 2
> >> r5: 1 != 2
> >> r6: 1 != 2
> >> r7: 1 != 2
> >> r8: 1 != 2
> >> r9: 1 != 2
> >> r10: 1 != 2
> >> r11: 1 != 2
> >> r12: 1 != 2
> >> r13: 1 != 2
> >> r14: 1 != 2
> >> r15: 1 != 2
> >> r16: 1 != 2
> >> r17: 1 != 2
> >> r18: 1 != 2
> >> r19: 1 != 2
> >> r20: 1 != 2
> >> r21: 1 != 2
> >> r22: 1 != 2
> >> r23: 1 != 2
> >> r24: 1 != 2
> >> r25: 1 != 2
> >> r26: 1 != 2
> >> r27: 1 != 2
> >> r28: 1 != 2
> >> r29: 1 != 2
> >> r30: 1 != 2
> >> r31: 1 != 2
> >> == FPU check routines: OK.
> >> == Threads: sleeper_ufps0-0 rtk0-1 rtk0-2 rtk_fp0-3 rtk_fp0-4
> >> rtk_fp_ufpp0-5 rtk_fp_ufpp0-6 rtup0-7 rtup0-8 rtup_ufpp0-9 rtup_ufpp0-10
> >> rtus0-11 rtus0-12 rtus_ufps0-13 rtus_ufps0-14 rtuo0-15 rtuo0-16
> >> rtuo_ufpp0-17 rtuo_ufpp0-18 rtuo_ufps0-19 rtuo_ufps0-20
> >> rtuo_ufpp_ufps0-21 rtuo_ufpp_ufps0-22
> >>
> >> And then it halts. dmesg shows:
> >>
> >> Xenomai: suspending kernel thread ae819678 ('rtk5/0') at nip=0x80319aa0,
> >> lr=0x80319a70, r1=0xafa90510 after exception #1792
> >>
> >> switchtest -n runs normally, should i use some sort of soft float flag
> >> in my compilations?
> >>
> >> /Jesper

-- 
Philippe.
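In the quoted output, the 32 `rN: 1 != 2` lines are followed by "FPU check routines: OK", which suggests the mismatch is provoked on purpose to confirm that the check routine can detect a wrong register value. A minimal user-space sketch of that idea follows. It assumes the self-test loads 1 into every register and then checks against 2; the array standing in for the FPU register file and all `sim_*` names are hypothetical and are not switchtest's actual code.

```c
#include <stdio.h>

#define SIM_FPU_REG_COUNT 32  /* r0..r31, matching the quoted output */

/* Assumption: a plain array stands in for the hardware FPU registers. */
static unsigned long sim_fpu_regs[SIM_FPU_REG_COUNT];

/* Load the same value into every simulated register. */
static void sim_fp_regs_set(unsigned long val)
{
    for (int i = 0; i < SIM_FPU_REG_COUNT; i++)
        sim_fpu_regs[i] = val;
}

/*
 * Compare every simulated register with val and report mismatches.
 * Returns val when all registers match, otherwise the last
 * mismatching value that was found.
 */
static unsigned long sim_fp_regs_check(unsigned long val)
{
    unsigned long result = val;

    for (int i = 0; i < SIM_FPU_REG_COUNT; i++) {
        if (sim_fpu_regs[i] != val) {
            printf("r%d: %lu != %lu\n", i, sim_fpu_regs[i], val);
            result = sim_fpu_regs[i];
        }
    }
    return result;
}

/*
 * Self-test: set 1, check against 2. A working check routine must
 * report the mismatch and hand back the value really stored (1).
 */
static int sim_check_routines_work(void)
{
    sim_fp_regs_set(1);
    return sim_fp_regs_check(2) == 1;
}
```

Under these assumptions, calling `sim_check_routines_work()` prints one mismatch line per register and returns nonzero, so a wall of `1 != 2` lines before "OK" would be the expected result of the self-test rather than a failure in itself.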
Re: [Xenomai-core] kernel threads crash
Great thanks, but i can't help wondering if the problems i'm seeing are
related to some of my userspace programs using fp.

/Jesper

On 2011-04-19 09:39, Philippe Gerum wrote:
> On Tue, 2011-04-19 at 09:26 +0200, Jesper Christensen wrote:
>> If i run switchtest i get the following output:
>
> If still talking about the cpci6200, this patch should apply:
> http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=3d6fa118ef282c60dfeb0e690a579e8357bb7d13
>
>> [root@slot6 /bin]# switchtest
>> == Testing FPU check routines...
>> r0: 1 != 2
>> r1: 1 != 2
>> r2: 1 != 2
>> r3: 1 != 2
>> r4: 1 != 2
>> r5: 1 != 2
>> r6: 1 != 2
>> r7: 1 != 2
>> r8: 1 != 2
>> r9: 1 != 2
>> r10: 1 != 2
>> r11: 1 != 2
>> r12: 1 != 2
>> r13: 1 != 2
>> r14: 1 != 2
>> r15: 1 != 2
>> r16: 1 != 2
>> r17: 1 != 2
>> r18: 1 != 2
>> r19: 1 != 2
>> r20: 1 != 2
>> r21: 1 != 2
>> r22: 1 != 2
>> r23: 1 != 2
>> r24: 1 != 2
>> r25: 1 != 2
>> r26: 1 != 2
>> r27: 1 != 2
>> r28: 1 != 2
>> r29: 1 != 2
>> r30: 1 != 2
>> r31: 1 != 2
>> == FPU check routines: OK.
>> == Threads: sleeper_ufps0-0 rtk0-1 rtk0-2 rtk_fp0-3 rtk_fp0-4
>> rtk_fp_ufpp0-5 rtk_fp_ufpp0-6 rtup0-7 rtup0-8 rtup_ufpp0-9 rtup_ufpp0-10
>> rtus0-11 rtus0-12 rtus_ufps0-13 rtus_ufps0-14 rtuo0-15 rtuo0-16
>> rtuo_ufpp0-17 rtuo_ufpp0-18 rtuo_ufps0-19 rtuo_ufps0-20
>> rtuo_ufpp_ufps0-21 rtuo_ufpp_ufps0-22
>>
>> And then it halts. dmesg shows:
>>
>> Xenomai: suspending kernel thread ae819678 ('rtk5/0') at nip=0x80319aa0,
>> lr=0x80319a70, r1=0xafa90510 after exception #1792
>>
>> switchtest -n runs normally, should i use some sort of soft float flag
>> in my compilations?
>>
>> /Jesper
Re: [Xenomai-core] kernel threads crash
On Tue, 2011-04-19 at 09:26 +0200, Jesper Christensen wrote:
> If i run switchtest i get the following output:

If still talking about the cpci6200, this patch should apply:
http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=3d6fa118ef282c60dfeb0e690a579e8357bb7d13

> [root@slot6 /bin]# switchtest
> == Testing FPU check routines...
> r0: 1 != 2
> r1: 1 != 2
> r2: 1 != 2
> r3: 1 != 2
> r4: 1 != 2
> r5: 1 != 2
> r6: 1 != 2
> r7: 1 != 2
> r8: 1 != 2
> r9: 1 != 2
> r10: 1 != 2
> r11: 1 != 2
> r12: 1 != 2
> r13: 1 != 2
> r14: 1 != 2
> r15: 1 != 2
> r16: 1 != 2
> r17: 1 != 2
> r18: 1 != 2
> r19: 1 != 2
> r20: 1 != 2
> r21: 1 != 2
> r22: 1 != 2
> r23: 1 != 2
> r24: 1 != 2
> r25: 1 != 2
> r26: 1 != 2
> r27: 1 != 2
> r28: 1 != 2
> r29: 1 != 2
> r30: 1 != 2
> r31: 1 != 2
> == FPU check routines: OK.
> == Threads: sleeper_ufps0-0 rtk0-1 rtk0-2 rtk_fp0-3 rtk_fp0-4
> rtk_fp_ufpp0-5 rtk_fp_ufpp0-6 rtup0-7 rtup0-8 rtup_ufpp0-9 rtup_ufpp0-10
> rtus0-11 rtus0-12 rtus_ufps0-13 rtus_ufps0-14 rtuo0-15 rtuo0-16
> rtuo_ufpp0-17 rtuo_ufpp0-18 rtuo_ufps0-19 rtuo_ufps0-20
> rtuo_ufpp_ufps0-21 rtuo_ufpp_ufps0-22
>
> And then it halts. dmesg shows:
>
> Xenomai: suspending kernel thread ae819678 ('rtk5/0') at nip=0x80319aa0,
> lr=0x80319a70, r1=0xafa90510 after exception #1792
>
> switchtest -n runs normally, should i use some sort of soft float flag
> in my compilations?
>
> /Jesper

-- 
Philippe.
[Xenomai-core] kernel threads crash
If i run switchtest i get the following output:

[root@slot6 /bin]# switchtest
== Testing FPU check routines...
r0: 1 != 2
r1: 1 != 2
r2: 1 != 2
r3: 1 != 2
r4: 1 != 2
r5: 1 != 2
r6: 1 != 2
r7: 1 != 2
r8: 1 != 2
r9: 1 != 2
r10: 1 != 2
r11: 1 != 2
r12: 1 != 2
r13: 1 != 2
r14: 1 != 2
r15: 1 != 2
r16: 1 != 2
r17: 1 != 2
r18: 1 != 2
r19: 1 != 2
r20: 1 != 2
r21: 1 != 2
r22: 1 != 2
r23: 1 != 2
r24: 1 != 2
r25: 1 != 2
r26: 1 != 2
r27: 1 != 2
r28: 1 != 2
r29: 1 != 2
r30: 1 != 2
r31: 1 != 2
== FPU check routines: OK.
== Threads: sleeper_ufps0-0 rtk0-1 rtk0-2 rtk_fp0-3 rtk_fp0-4
rtk_fp_ufpp0-5 rtk_fp_ufpp0-6 rtup0-7 rtup0-8 rtup_ufpp0-9 rtup_ufpp0-10
rtus0-11 rtus0-12 rtus_ufps0-13 rtus_ufps0-14 rtuo0-15 rtuo0-16
rtuo_ufpp0-17 rtuo_ufpp0-18 rtuo_ufps0-19 rtuo_ufps0-20
rtuo_ufpp_ufps0-21 rtuo_ufpp_ufps0-22

And then it halts. dmesg shows:

Xenomai: suspending kernel thread ae819678 ('rtk5/0') at nip=0x80319aa0,
lr=0x80319a70, r1=0xafa90510 after exception #1792

switchtest -n runs normally, should i use some sort of soft float flag
in my compilations?

/Jesper
Re: [Xenomai-core] kernel threads crash - possible race condition?
On 2011-04-14 16:09, Philippe Gerum wrote:
> On Thu, 2011-04-14 at 15:46 +0200, Jesper Christensen wrote:
>> Actually i have been running with CONFIG_XENO_HW_UNLOCKED_SWITCH the
>> whole time
>
> You mean enabled?

Disabled, sorry.

>> and i also raised the stack size from 4k to 8k. I do however
>> think there could be some fishiness in entry_32.S. In
>> "transfer_to_handler" SPRN_SPRG3 is used to check for stack overflow (at
>> least in my kernel 2.6.29.6), but i must admit i haven't seen any of
>> that in the kernel log.
>
> Mmm, you are right. In any case, what we want with the unmasked switch
> feature is to allow interrupts while we flush the tlb and set the new mm
> context, which may be lengthy on some low end platforms. Allowing the
> switch code to be preempted during the register swap is of no use wrt
> latency.
>
> Do you have a patch at hand which you could post that flips MSR_EE in
> rthal_thread_switch already?

This protects the whole function, but it should flip the bit inside like you suggest.

diff --git a/include/asm-powerpc/bits/pod.h b/include/asm-powerpc/bits/pod.h
old mode 100644
new mode 100755
index 6269907..e279647
--- a/include/asm-powerpc/bits/pod.h
+++ b/include/asm-powerpc/bits/pod.h
@@ -106,6 +106,7 @@ static inline void xnarch_switch_to(xnarchtcb_t *out_tcb,
     struct mm_struct *prev_mm = out_tcb->active_mm, *next_mm;
     struct task_struct *prev = out_tcb->active_task;
     struct task_struct *next = in_tcb->user_task;
+    unsigned long flags;

     if (likely(next != NULL)) {
         in_tcb->active_task = next;
@@ -156,12 +157,14 @@ static inline void xnarch_switch_to(xnarchtcb_t *out_tcb,
 #endif /* PPC32 */
 #endif /* !__IPIPE_FEATURE_HARDENED_SWITCHMM */

+    rthal_local_irq_save_hw(flags);
 #ifdef CONFIG_PPC64
     rthal_thread_switch(out_tcb->tsp, in_tcb->tsp, next == NULL);
 #else
     rthal_thread_switch(out_tcb->tsp, in_tcb->tsp);
 #endif
     barrier();
+    rthal_local_irq_restore_hw(flags);
 }

>> /Jesper
>>
>> On 2011-04-14 15:31, Philippe Gerum wrote:
>>> On Thu, 2011-04-14 at 15:04 +0200, Jesper Christensen wrote:
>>>> I wrote about some problems concerning stack corruption when running
>>>> xenomai on ppc. I have found out that if i disable hardware interrupts
>>>> while running "rthal_thread_switch" the problem seems to disappear
>>>> somewhat. I saw a crash yesterday after running for 3 hours, and i'm
>>>> currently running a test (has been running for 3 hours). Usually it
>>>> would fail after 30-40 minutes. My question is: could there be a problem
>>>> if we receive an interrupt between updating the stack pointer and the
>>>> sprg3 register with the new thread pointer?
>>>
>>> Normally, there should not be any issue (famous last words), since we
>>> would run Xenomai-only code over the preempted context, and we don't
>>> depend on SPRG3 to fetch the current phys address. In fact, at this
>>> stage we simply don't care about the linux context, only referring to
>>> the current Xenomai thread, which is obtained differently.
>>>
>>> Try switching off CONFIG_XENO_HW_UNLOCKED_SWITCH, in the "machine"
>>> config area; if this ends up being rock-solid, then this would be a hint
>>> that something may be fishy in this area. Raising your k-thread stack
>>> sizes in a separate test may be interesting to check too, if not already
>>> done.
>>>
>>>> /Jesper
Re: [Xenomai-core] kernel threads crash - possible race condition?
On Thu, 2011-04-14 at 15:46 +0200, Jesper Christensen wrote:
> Actually i have been running with CONFIG_XENO_HW_UNLOCKED_SWITCH the
> whole time

You mean enabled?

> and i also raised the stack size from 4k to 8k. I do however
> think there could be some fishiness in entry_32.S. In
> "transfer_to_handler" SPRN_SPRG3 is used to check for stack overflow (at
> least in my kernel 2.6.29.6), but i must admit i haven't seen any of
> that in the kernel log.

Mmm, you are right. In any case, what we want with the unmasked switch
feature is to allow interrupts while we flush the tlb and set the new mm
context, which may be lengthy on some low end platforms. Allowing the
switch code to be preempted during the register swap is of no use wrt
latency.

Do you have a patch at hand which you could post that flips MSR_EE in
rthal_thread_switch already?

> /Jesper
>
> On 2011-04-14 15:31, Philippe Gerum wrote:
> > On Thu, 2011-04-14 at 15:04 +0200, Jesper Christensen wrote:
> >> I wrote about some problems concerning stack corruption when running
> >> xenomai on ppc. I have found out that if i disable hardware interrupts
> >> while running "rthal_thread_switch" the problem seems to disappear
> >> somewhat. I saw a crash yesterday after running for 3 hours, and i'm
> >> currently running a test (has been running for 3 hours). Usually it
> >> would fail after 30-40 minutes. My question is: could there be a problem
> >> if we receive an interrupt between updating the stack pointer and the
> >> sprg3 register with the new thread pointer?
> >
> > Normally, there should not be any issue (famous last words), since we
> > would run Xenomai-only code over the preempted context, and we don't
> > depend on SPRG3 to fetch the current phys address. In fact, at this
> > stage we simply don't care about the linux context, only referring to
> > the current Xenomai thread, which is obtained differently.
> >
> > Try switching off CONFIG_XENO_HW_UNLOCKED_SWITCH, in the "machine"
> > config area; if this ends up being rock-solid, then this would be a hint
> > that something may be fishy in this area. Raising your k-thread stack
> > sizes in a separate test may be interesting to check too, if not already
> > done.
> >
> >> /Jesper

-- 
Philippe.
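The point made above is that interrupts only need to stay enabled during the lengthy mm switch, while the short register swap gains nothing from being preemptible. That ordering can be sketched with a user-space mock. This is only an illustration under stated assumptions: a boolean stands in for the MSR_EE bit, every `mock_*` name is hypothetical, and none of it is the real I-pipe or Xenomai code.

```c
#include <stdbool.h>

/* Mock of the hardware interrupt-enable bit (MSR_EE on PowerPC). */
static bool mock_irqs_enabled = true;

/* What each phase observed, recorded so the ordering can be inspected. */
static bool irqs_on_during_mm_switch;
static bool irqs_on_during_reg_swap;

/* Mask interrupts and return the previous state. */
static unsigned long mock_irq_save(void)
{
    unsigned long flags = mock_irqs_enabled ? 1UL : 0UL;

    mock_irqs_enabled = false;
    return flags;
}

/* Put the interrupt-enable bit back to the saved state. */
static void mock_irq_restore(unsigned long flags)
{
    mock_irqs_enabled = (flags != 0);
}

/* Potentially lengthy part (TLB flush, new mm context): left preemptible. */
static void mock_switch_mm(void)
{
    irqs_on_during_mm_switch = mock_irqs_enabled;
}

/* Short part (stack pointer and register swap): must not be interrupted. */
static void mock_register_swap(void)
{
    irqs_on_during_reg_swap = mock_irqs_enabled;
}

/* Context switch with the masked window limited to the register swap. */
static void mock_switch_to(void)
{
    unsigned long flags;

    mock_switch_mm();            /* interrupts still enabled here */

    flags = mock_irq_save();     /* close the window */
    mock_register_swap();
    mock_irq_restore(flags);     /* reopen it, restoring the prior state */
}
```

After one call to `mock_switch_to()`, `irqs_on_during_mm_switch` is true and `irqs_on_during_reg_swap` is false. Saving and restoring the previous state, rather than unconditionally re-enabling, keeps the sketch correct when the caller already runs with interrupts masked.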
Re: [Xenomai-core] kernel threads crash - possible race condition?
Actually i have been running with CONFIG_XENO_HW_UNLOCKED_SWITCH the
whole time and i also raised the stack size from 4k to 8k. I do however
think there could be some fishiness in entry_32.S. In
"transfer_to_handler" SPRN_SPRG3 is used to check for stack overflow (at
least in my kernel 2.6.29.6), but i must admit i haven't seen any of
that in the kernel log.

/Jesper

On 2011-04-14 15:31, Philippe Gerum wrote:
> On Thu, 2011-04-14 at 15:04 +0200, Jesper Christensen wrote:
>> I wrote about some problems concerning stack corruption when running
>> xenomai on ppc. I have found out that if i disable hardware interrupts
>> while running "rthal_thread_switch" the problem seems to disappear
>> somewhat. I saw a crash yesterday after running for 3 hours, and i'm
>> currently running a test (has been running for 3 hours). Usually it
>> would fail after 30-40 minutes. My question is: could there be a problem
>> if we receive an interrupt between updating the stack pointer and the
>> sprg3 register with the new thread pointer?
>
> Normally, there should not be any issue (famous last words), since we
> would run Xenomai-only code over the preempted context, and we don't
> depend on SPRG3 to fetch the current phys address. In fact, at this
> stage we simply don't care about the linux context, only referring to
> the current Xenomai thread, which is obtained differently.
>
> Try switching off CONFIG_XENO_HW_UNLOCKED_SWITCH, in the "machine"
> config area; if this ends up being rock-solid, then this would be a hint
> that something may be fishy in this area. Raising your k-thread stack
> sizes in a separate test may be interesting to check too, if not already
> done.
>
>> /Jesper
Re: [Xenomai-core] kernel threads crash - possible race condition?
On Thu, 2011-04-14 at 15:04 +0200, Jesper Christensen wrote:
> I wrote about some problems concerning stack corruption when running
> xenomai on ppc. I have found out that if i disable hardware interrupts
> while running "rthal_thread_switch" the problem seems to disappear
> somewhat. I saw a crash yesterday after running for 3 hours, and i'm
> currently running a test (has been running for 3 hours). Usually it
> would fail after 30-40 minutes. My question is: could there be a problem
> if we receive an interrupt between updating the stack pointer and the
> sprg3 register with the new thread pointer?

Normally, there should not be any issue (famous last words), since we
would run Xenomai-only code over the preempted context, and we don't
depend on SPRG3 to fetch the current phys address. In fact, at this
stage we simply don't care about the linux context, only referring to
the current Xenomai thread, which is obtained differently.

Try switching off CONFIG_XENO_HW_UNLOCKED_SWITCH, in the "machine"
config area; if this ends up being rock-solid, then this would be a hint
that something may be fishy in this area. Raising your k-thread stack
sizes in a separate test may be interesting to check too, if not already
done.

> /Jesper

-- 
Philippe.
[Xenomai-core] kernel threads crash - possible race condition?
I wrote about some problems concerning stack corruption when running
xenomai on ppc. I have found out that if i disable hardware interrupts
while running "rthal_thread_switch" the problem seems to disappear
somewhat. I saw a crash yesterday after running for 3 hours, and i'm
currently running a test (has been running for 3 hours). Usually it
would fail after 30-40 minutes. My question is: could there be a problem
if we receive an interrupt between updating the stack pointer and the
sprg3 register with the new thread pointer?

/Jesper
Re: [Xenomai-core] kernel threads crash
On 2011-04-12 17:24, Jan Kiszka wrote:
> On 2011-04-12 16:21, Jesper Christensen wrote:
>> There you go:
>>
>> ---
>>
>> #include
>> #include
>> #include
>> #include
>>
>> #include
>> #include
>>
>> static rtdm_lock_t umsg_list_lock = RTDM_LOCK_UNLOCKED;
>> static rtdm_nrtsig_t up_nrt_signal;
>> LIST_HEAD(umsg_list);
>>
>> static void up_work_queue_handler(struct work_struct *work);
>> DECLARE_WORK(wq, up_work_queue_handler);
>>
>> u32 up_user_pid = 0;
>>
>> struct up_msg_buf *up_alloc_msg_buf(int cmd, size_t psize,
>>                                     struct genl_info *info,
>>                                     up_msg_finalize fin)
>> {
>>     struct up_msg_buf *ret;
>>
>>     ret = kmalloc(sizeof(struct up_msg_buf) + psize, GFP_KERNEL);
>>     if(!ret)
>>         return ret;
>>
>>     memset(ret, 0, sizeof(struct up_msg_buf));
>
> [ kzalloc = kmalloc + memset ]

Got it.

>>     /* Initialize some fields */
>>     if(info) {
>>         ret->pid = info->snd_pid;
>>         ret->seq = info->snd_seq;
>>     }
>>     ret->cmd = cmd;
>>     ret->finalize = fin;
>>
>>     return ret;
>> }
>>
>> void up_queue_umsg(struct up_msg_buf *umsg, int op, int len, U8 upid)
>> {
>>     rtdm_lockctx_t context;
>>
>>     umsg->hdr.opcode = op;
>>     umsg->hdr.plen = len;
>>     umsg->hdr.up_id = upid;
>>
>>     rtdm_lock_get_irqsave(&umsg_list_lock, context);
>>     list_add_tail(&umsg->list_entry, &umsg_list);
>>     rtdm_lock_put_irqrestore(&umsg_list_lock, context);
>>
>>     rtdm_nrtsig_pend(&up_nrt_signal);
>
> Why this signaling here? Either the message is processed synchronously
> (/wrt dispatch_call) so that you can release it after return from the
> dispatcher, or it's handled asynchronously, but then this signal comes
> too early as the RT work is potentially still ongoing.

Well this is for dispatching messages the other way (via the work_queue).
The rt part will (by design) not modify the umsg buffer after this call.
This mechanism is also needed to pass unsolicited messages from rt to
userland.

>> }
>>
>> static int up_handle_cmd_msg_rt(struct rt_proc_call *call)
>> {
>>     struct up_cmd_param *params;
>>     struct up_msg_buf *resp;
>>
>>     params = rtpc_get_priv(call, struct up_cmd_param);
>>     resp = params->resp_buf;
>>
>>     if(resp)
>>         resp->cmd = UP_NL_C_CMD;
>>
>>     switch(params->hdr.opcode) {
>>
>>     case UP_CMD_INIT:
>>         break;
>>
>>     case UP_CMD_CREATE_REQ:
>>     {
>>         struct up_config *c = (struct up_config *)&params->msg.config;
>>         int *res = (int *)resp->payload;
>>         *res = up_create(params->hdr.up_id, c);
>
> I can only assume this function doesn't do anything crazy.

You are correct. It is only called once during startup.

>>         if(*res < 0)
>>             rtdm_printk("Error creating UP\n");
>>
>>         up_queue_umsg(resp, UP_CMD_CREATE_RES, sizeof(int), 0);
>>         break;
>>     }
>>
>>     default:
>>         rtdm_printk("Unknown cmd message: op=%d\n", params->hdr.opcode);
>>         return -ENOTSUPP;
>>
>>     }
>>
>>     return 0;
>> }
>>
>> static int up_handle_cmd_msg_nrt(struct sk_buff *skb, struct genl_info
>> *info)
>> {
>>     int ret;
>>     struct up_cmd_param params;
>>     struct up_user_hdr *uhdr = (struct up_user_hdr *)info->userhdr;
>>
>>     //TODO: Allocate response buffer based on message type
>>     switch(uhdr->opcode) {
>>
>>     case UP_CMD_INIT:
>>         up_user_pid = info->snd_pid;
>>         params.resp_buf = NULL;
>>         break;
>>
>>     default:
>>         params.resp_buf = up_alloc_msg_buf(UP_NL_C_CMD,
>>                                            sizeof(cmdMsg_t),
>>                                            info, NULL);
>>         break;
>>     };
>>
>>     memcpy(&params.hdr, info->userhdr, sizeof(struct up_user_hdr));
>>
>>     if(params.hdr.plen > sizeof(cmdMsg_t))
>>         printk("up_handle_cmd_msg_nrt(): ERROR plen=%u >
>> sizeof(cmdMsg_t)=%u\n", params.hdr.plen, sizeof(cmdMsg_t));
>>
>>     if(params.hdr.plen)
>>         nla_memcpy(&params.msg, info->attrs[UP_NL_A_MSG], params.hdr.plen);
>>
>>     ret = rtpc_dispatch_call(up_handle_cmd_msg_rt, 0, &params,
>>                              sizeof(params),
>>                              NULL, NULL, NULL);
>
> That shouldn't build (too many arguments).

That is actually a concurrency bug fix i made in rtnet (in the tar file :)
that prevented multiple userland threads from sending rt pings at the same
time.

>>     if(ret < 0)
>>         kfree(params.resp_buf);
>>
>>     return ret;
>> }
>>
>> /* netlink attribute policy */
>> static const struct nla_policy up_genl_policy[UP_NL_A_MAX + 1] = {
>>     [UP_NL_A_MSG] = { .type
Re: [Xenomai-core] kernel threads crash
On 2011-04-12 16:21, Jesper Christensen wrote:
> There you go:
>
> ---
>
> #include
> #include
> #include
> #include
>
> #include
> #include
>
> static rtdm_lock_t umsg_list_lock = RTDM_LOCK_UNLOCKED;
> static rtdm_nrtsig_t up_nrt_signal;
> LIST_HEAD(umsg_list);
>
> static void up_work_queue_handler(struct work_struct *work);
> DECLARE_WORK(wq, up_work_queue_handler);
>
> u32 up_user_pid = 0;
>
> struct up_msg_buf *up_alloc_msg_buf(int cmd, size_t psize,
>                                     struct genl_info *info,
>                                     up_msg_finalize fin)
> {
>     struct up_msg_buf *ret;
>
>     ret = kmalloc(sizeof(struct up_msg_buf) + psize, GFP_KERNEL);
>     if(!ret)
>         return ret;
>
>     memset(ret, 0, sizeof(struct up_msg_buf));

[ kzalloc = kmalloc + memset ]

>     /* Initialize some fields */
>     if(info) {
>         ret->pid = info->snd_pid;
>         ret->seq = info->snd_seq;
>     }
>     ret->cmd = cmd;
>     ret->finalize = fin;
>
>     return ret;
> }
>
> void up_queue_umsg(struct up_msg_buf *umsg, int op, int len, U8 upid)
> {
>     rtdm_lockctx_t context;
>
>     umsg->hdr.opcode = op;
>     umsg->hdr.plen = len;
>     umsg->hdr.up_id = upid;
>
>     rtdm_lock_get_irqsave(&umsg_list_lock, context);
>     list_add_tail(&umsg->list_entry, &umsg_list);
>     rtdm_lock_put_irqrestore(&umsg_list_lock, context);
>
>     rtdm_nrtsig_pend(&up_nrt_signal);

Why this signaling here? Either the message is processed synchronously
(/wrt dispatch_call) so that you can release it after return from the
dispatcher, or it's handled asynchronously, but then this signal comes
too early as the RT work is potentially still ongoing.

> }
>
> static int up_handle_cmd_msg_rt(struct rt_proc_call *call)
> {
>     struct up_cmd_param *params;
>     struct up_msg_buf *resp;
>
>     params = rtpc_get_priv(call, struct up_cmd_param);
>     resp = params->resp_buf;
>
>     if(resp)
>         resp->cmd = UP_NL_C_CMD;
>
>     switch(params->hdr.opcode) {
>
>     case UP_CMD_INIT:
>         break;
>
>     case UP_CMD_CREATE_REQ:
>     {
>         struct up_config *c = (struct up_config *)&params->msg.config;
>         int *res = (int *)resp->payload;
>         *res = up_create(params->hdr.up_id, c);

I can only assume this function doesn't do anything crazy.

>         if(*res < 0)
>             rtdm_printk("Error creating UP\n");
>
>         up_queue_umsg(resp, UP_CMD_CREATE_RES, sizeof(int), 0);
>         break;
>     }
>
>     default:
>         rtdm_printk("Unknown cmd message: op=%d\n", params->hdr.opcode);
>         return -ENOTSUPP;
>
>     }
>
>     return 0;
> }
>
> static int up_handle_cmd_msg_nrt(struct sk_buff *skb, struct genl_info
> *info)
> {
>     int ret;
>     struct up_cmd_param params;
>     struct up_user_hdr *uhdr = (struct up_user_hdr *)info->userhdr;
>
>     //TODO: Allocate response buffer based on message type
>     switch(uhdr->opcode) {
>
>     case UP_CMD_INIT:
>         up_user_pid = info->snd_pid;
>         params.resp_buf = NULL;
>         break;
>
>     default:
>         params.resp_buf = up_alloc_msg_buf(UP_NL_C_CMD,
>                                            sizeof(cmdMsg_t),
>                                            info, NULL);
>         break;
>     };
>
>     memcpy(&params.hdr, info->userhdr, sizeof(struct up_user_hdr));
>
>     if(params.hdr.plen > sizeof(cmdMsg_t))
>         printk("up_handle_cmd_msg_nrt(): ERROR plen=%u >
> sizeof(cmdMsg_t)=%u\n", params.hdr.plen, sizeof(cmdMsg_t));
>
>     if(params.hdr.plen)
>         nla_memcpy(&params.msg, info->attrs[UP_NL_A_MSG], params.hdr.plen);
>
>     ret = rtpc_dispatch_call(up_handle_cmd_msg_rt, 0, &params,
>                              sizeof(params),
>                              NULL, NULL, NULL);

That shouldn't build (too many arguments).

>     if(ret < 0)
>         kfree(params.resp_buf);
>
>     return ret;
> }
>
> /* netlink attribute policy */
> static const struct nla_policy up_genl_policy[UP_NL_A_MAX + 1] = {
>     [UP_NL_A_MSG] = { .type = NLA_BINARY }
> };
>
> /* Generic netlink event operation definition */
> static struct genl_ops up_genl_ops_cmd = {
>     .cmd = UP_NL_C_CMD,
>     .flags = 0,
>     .policy = up_genl_policy,
>     .doit = up_handle_cmd_msg_nrt,
>     .dumpit = NULL
> };
>
> /* Generic netlink family */
> static struct genl_family up_genl_family = {
>     .id = GENL_ID_GENERATE,
>     .hdrsize = UP_GENL_HDRLEN,
>     .name = "up",
>     .version = UP_NL_VERSION,
>     .maxattr = UP_NL_A_MAX
> };
>
> static void up_work_queue_handler(struct work_struct *work)
> {
>     struct up_msg_buf *msg;
>     struct sk_buff *skb;
>     struct up_user_hdr *uhdr;
>     rtdm_lockctx_t context;
>     int rc;
>
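The bracketed remark above says that the allocate-then-clear pair can be folded into one zeroing allocation. A user-space analogy, with `calloc` standing in for `kzalloc` and `malloc` for `kmalloc`, might look like the sketch below. The struct and the `demo_*` names are hypothetical and only loosely mirror the posted `up_msg_buf`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical header-plus-payload buffer, loosely modelled on the post. */
struct demo_buf {
    int cmd;
    unsigned int pid;
    unsigned int seq;
    unsigned char payload[];   /* flexible array member */
};

/* Two-step variant: allocate, then clear only the header. */
static struct demo_buf *demo_alloc_two_step(size_t psize)
{
    struct demo_buf *b = malloc(sizeof(struct demo_buf) + psize);

    if (!b)
        return NULL;
    memset(b, 0, sizeof(struct demo_buf));   /* payload stays uninitialized */
    return b;
}

/* One-step variant: a zeroing allocator clears header and payload. */
static struct demo_buf *demo_alloc_zeroed(size_t psize)
{
    return calloc(1, sizeof(struct demo_buf) + psize);
}
```

One difference worth noting: the two-step version zeroes only the header, whereas a zeroing allocator clears the whole allocation, payload included. That costs a little more for large payloads, but it means stale heap contents cannot reach the receiver if less than the full payload is later written.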
Re: [Xenomai-core] kernel threads crash
There you go:

---
#include
#include
#include
#include
#include
#include

static rtdm_lock_t umsg_list_lock = RTDM_LOCK_UNLOCKED;
static rtdm_nrtsig_t up_nrt_signal;
LIST_HEAD(umsg_list);

static void up_work_queue_handler(struct work_struct *work);
DECLARE_WORK(wq, up_work_queue_handler);

u32 up_user_pid = 0;

struct up_msg_buf *up_alloc_msg_buf(int cmd, size_t psize, struct genl_info *info, up_msg_finalize fin)
{
    struct up_msg_buf *ret;

    ret = kmalloc(sizeof(struct up_msg_buf) + psize, GFP_KERNEL);
    if(!ret)
        return ret;

    memset(ret, 0, sizeof(struct up_msg_buf));

    /* Initialize some fields */
    if(info) {
        ret->pid = info->snd_pid;
        ret->seq = info->snd_seq;
    }
    ret->cmd = cmd;
    ret->finalize = fin;

    return ret;
}

void up_queue_umsg(struct up_msg_buf *umsg, int op, int len, U8 upid)
{
    rtdm_lockctx_t context;

    umsg->hdr.opcode = op;
    umsg->hdr.plen = len;
    umsg->hdr.up_id = upid;

    rtdm_lock_get_irqsave(&umsg_list_lock, context);
    list_add_tail(&umsg->list_entry, &umsg_list);
    rtdm_lock_put_irqrestore(&umsg_list_lock, context);

    rtdm_nrtsig_pend(&up_nrt_signal);
}

static int up_handle_cmd_msg_rt(struct rt_proc_call *call)
{
    struct up_cmd_param *params;
    struct up_msg_buf *resp;

    params = rtpc_get_priv(call, struct up_cmd_param);
    resp = params->resp_buf;

    if(resp)
        resp->cmd = UP_NL_C_CMD;

    switch(params->hdr.opcode) {
    case UP_CMD_INIT:
        break;
    case UP_CMD_CREATE_REQ:
    {
        struct up_config *c = (struct up_config *)&params->msg.config;
        int *res = (int *)resp->payload;

        *res = up_create(params->hdr.up_id, c);
        if(*res < 0)
            rtdm_printk("Error creating UP\n");
        up_queue_umsg(resp, UP_CMD_CREATE_RES, sizeof(int), 0);
        break;
    }
    default:
        rtdm_printk("Unknown cmd message: op=%d\n", params->hdr.opcode);
        return -ENOTSUPP;
    }

    return 0;
}

static int up_handle_cmd_msg_nrt(struct sk_buff *skb, struct genl_info *info)
{
    int ret;
    struct up_cmd_param params;
    struct up_user_hdr *uhdr = (struct up_user_hdr *)info->userhdr;

    //TODO: Allocate response buffer based on message type
    switch(uhdr->opcode) {
    case UP_CMD_INIT:
        up_user_pid = info->snd_pid;
        params.resp_buf = NULL;
        break;
    default:
        params.resp_buf = up_alloc_msg_buf(UP_NL_C_CMD, sizeof(cmdMsg_t), info, NULL);
        break;
    };

    memcpy(&params.hdr, info->userhdr, sizeof(struct up_user_hdr));
    if(params.hdr.plen > sizeof(cmdMsg_t))
        printk("up_handle_cmd_msg_nrt(): ERROR plen=%u > sizeof(cmdMsg_t)=%u\n",
               params.hdr.plen, sizeof(cmdMsg_t));

    if(params.hdr.plen)
        nla_memcpy(&params.msg, info->attrs[UP_NL_A_MSG], params.hdr.plen);

    ret = rtpc_dispatch_call(up_handle_cmd_msg_rt, 0, &params, sizeof(params), NULL, NULL, NULL);
    if(ret < 0)
        kfree(params.resp_buf);

    return ret;
}

/* netlink attribute policy */
static const struct nla_policy up_genl_policy[UP_NL_A_MAX + 1] = {
    [UP_NL_A_MSG] = { .type = NLA_BINARY }
};

/* Generic netlink event operation definition */
static struct genl_ops up_genl_ops_cmd = {
    .cmd = UP_NL_C_CMD,
    .flags = 0,
    .policy = up_genl_policy,
    .doit = up_handle_cmd_msg_nrt,
    .dumpit = NULL
};

/* Generic netlink family */
static struct genl_family up_genl_family = {
    .id = GENL_ID_GENERATE,
    .hdrsize = UP_GENL_HDRLEN,
    .name = "up",
    .version = UP_NL_VERSION,
    .maxattr = UP_NL_A_MAX
};

static void up_work_queue_handler(struct work_struct *work)
{
    struct up_msg_buf *msg;
    struct sk_buff *skb;
    struct up_user_hdr *uhdr;
    rtdm_lockctx_t context;
    int rc;

    rtdm_lock_get_irqsave(&umsg_list_lock, context);
    while(!list_empty(&umsg_list)) {
        msg = (struct up_msg_buf *)umsg_list.next;
        list_del(&msg->list_entry);
        rtdm_lock_put_irqrestore(&umsg_list_lock, context);

        /* construct netlink message and send */
        skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
        if(!skb)
            goto failure;

        uhdr = genlmsg_put(skb, msg->pid, msg->seq, &up_genl_family, 0, msg->cmd);
        if(!uhdr)
            goto skb_failure;

        memcpy(uhdr, &msg->hdr, sizeof(struct up_user_hdr));
        rc = nla_put(skb, UP_NL_A_MSG, msg->hdr.plen, &msg->payload);
        if(rc != 0)
            goto skb_failure;

        genlmsg_end(skb, uhdr);

        rc =
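One detail in up_handle_cmd_msg_nrt() as posted: an oversized plen is only logged, and the copy into params.msg still runs with that length. As far as I know nla_memcpy() caps the copy at the attribute's own length, not at the size of the destination, so a large attribute could overrun the on-stack params. Whether this is related to the crashes is not established in this thread. A minimal userspace sketch of the check, assuming nothing beyond standard C; copy_payload_checked() is a hypothetical helper, not a kernel or RTnet function:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy a payload whose length was declared by the
 * sender into a fixed-size buffer. The copy is refused (nothing is
 * written) when the declared length does not fit the destination or
 * exceeds what the source actually holds.
 */
static int copy_payload_checked(void *dst, size_t dst_size,
                                const void *src, size_t src_len,
                                size_t declared_len)
{
    assert(dst != NULL && src != NULL);

    if (declared_len > dst_size)
        return -EMSGSIZE;   /* would overrun the destination */
    if (declared_len > src_len)
        return -EINVAL;     /* header claims more than was received */
    if (declared_len)
        memcpy(dst, src, declared_len);
    return 0;
}
```

In the handler this would mean returning an error right after the plen check instead of falling through to the copy.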
Re: [Xenomai-core] kernel threads crash
On 2011-04-12 16:09, Jesper Christensen wrote:
> Speaking of rtpc, could there be a race condition when using a
> rtdm_lock_t to synchronize between a linux thread and a xenomai thread?

Without looking at some code, I can't comment on this meaningfully.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: [Xenomai-core] kernel threads crash
Speaking of rtpc, could there be a race condition when using a rtdm_lock_t
to synchronize between a linux thread and a xenomai thread?

/Jesper

On 2011-04-12 15:40, Jan Kiszka wrote:
> On 2011-04-12 15:31, Jesper Christensen wrote:
>> I have managed to print the stack of a faulting thread:
>>
>> Xenomai: suspending kernel thread b911f518 ('rtnet-rtpc') at nip=0xb911f940, lr=0xb911f940, r1=0xaf2c4580 after exception #1792
>>
>> Xenomai: dumping stack at af2c4600
>> Xenomai: 0xaf2c45ec - 0xaf2c45fc: af2c4600 8009a334
>> Xenomai: 0xaf2c45d8 - 0xaf2c45e8: 0
>> Xenomai: 0xaf2c45c4 - 0xaf2c45d4: b911f518 0 af2c45f0 8009a364
>> Xenomai: 0xaf2c45b0 - 0xaf2c45c0: 0
>> Xenomai: 0xaf2c459c - 0xaf2c45ac: 0 b911f518 8009a334
>> Xenomai: 0xaf2c4588 - 0xaf2c4598: b911f4e0 af2c45d0 b911649c
>> Xenomai: 0xaf2c4574 - 0xaf2c4584: 805e3988 800 805a89f0 0001 b911f940
>> Xenomai: 0xaf2c4560 - 0xaf2c4570: b911f940 2000 2222 b911ebb8 0700
>> Xenomai: 0xaf2c454c - 0xaf2c455c: 805a89f0 b911f940 00029000 8009b2e4
>> Xenomai: 0xaf2c4538 - 0xaf2c4548: b911ebb8 805e50c4 805e3988 805a89f0
>> Xenomai: 0xaf2c4524 - 0xaf2c4534: 00100100 af2c4580 8000bf48
>> Xenomai: 0xaf2c4510 - 0xaf2c4520: 0 00200200
>> Xenomai: 0xaf2c44fc - 0xaf2c450c: 805e3988 2222
>>
>> Manually decoded link register words:
>> -
>> 8009a334:
>> $ powerpc-linux-gnu-addr2line -e vmlinux 0x8009a334
>> linux-2.6.29.6/arch/powerpc/include/asm/xenomai/bits/pod.h:168
>>
>> 8009a364:
>> $ powerpc-linux-gnu-addr2line -e vmlinux 0x8009a364
>> linux-2.6.29.6/arch/powerpc/include/asm/xenomai/bits/pod.h:172
>>
>> b911649c:
>> $ powerpc-linux-gnu-addr2line -e ../3rd_party/XM-Linux/rtnet_build/stack/rtnet.ko 0x249c
>> rtnet_build/stack/rtnet_rtpc.c:201
>>
>> 8000bf48:
>> $ powerpc-linux-gnu-addr2line -e vmlinux 0x8000bf48
>> linux-2.6.29.6/arch/powerpc/kernel/ipipe.c:429
>> (ipipe_trigger_irq(unsigned irq) at local_irq_restore_hw(flags);)
>>
>> -
>>
>> Notice the "r1" register in the first line i assume should point to a
>> back chain word, but the value is 0001 and the "link register" word
>> immediately after is b911f940 which points to:
>> # grep b911f940 /proc/kallsyms
>> b911f940 b pending_calls_lock [rtnet]
>>
>> I'm not sure of the significance of the stack frame after that one.
>>
> IIRC, you said that you are using rtnet-rtpc for a special use case.
> Given the fact that this interface is very well documented and highly
> intuitive to use ;), I wouldn't be too surprised if you ran into a race
> or an invalid use case.
>
> Is the rtpc-using code part of your tarball? If so, can you break it out
> and explain on it how you use rtpc?
>
> Jan
Re: [Xenomai-core] kernel threads crash
Unfortunately it's not part of the tar ball, but i might be able to post
a source file that sums it up pretty well.

/Jesper

On 2011-04-12 15:40, Jan Kiszka wrote:
> On 2011-04-12 15:31, Jesper Christensen wrote:
>> I have managed to print the stack of a faulting thread:
>>
>> Xenomai: suspending kernel thread b911f518 ('rtnet-rtpc') at nip=0xb911f940, lr=0xb911f940, r1=0xaf2c4580 after exception #1792
>>
>> Xenomai: dumping stack at af2c4600
>> Xenomai: 0xaf2c45ec - 0xaf2c45fc: af2c4600 8009a334
>> Xenomai: 0xaf2c45d8 - 0xaf2c45e8: 0
>> Xenomai: 0xaf2c45c4 - 0xaf2c45d4: b911f518 0 af2c45f0 8009a364
>> Xenomai: 0xaf2c45b0 - 0xaf2c45c0: 0
>> Xenomai: 0xaf2c459c - 0xaf2c45ac: 0 b911f518 8009a334
>> Xenomai: 0xaf2c4588 - 0xaf2c4598: b911f4e0 af2c45d0 b911649c
>> Xenomai: 0xaf2c4574 - 0xaf2c4584: 805e3988 800 805a89f0 0001 b911f940
>> Xenomai: 0xaf2c4560 - 0xaf2c4570: b911f940 2000 2222 b911ebb8 0700
>> Xenomai: 0xaf2c454c - 0xaf2c455c: 805a89f0 b911f940 00029000 8009b2e4
>> Xenomai: 0xaf2c4538 - 0xaf2c4548: b911ebb8 805e50c4 805e3988 805a89f0
>> Xenomai: 0xaf2c4524 - 0xaf2c4534: 00100100 af2c4580 8000bf48
>> Xenomai: 0xaf2c4510 - 0xaf2c4520: 0 00200200
>> Xenomai: 0xaf2c44fc - 0xaf2c450c: 805e3988 2222
>>
>> Manually decoded link register words:
>> -
>> 8009a334:
>> $ powerpc-linux-gnu-addr2line -e vmlinux 0x8009a334
>> linux-2.6.29.6/arch/powerpc/include/asm/xenomai/bits/pod.h:168
>>
>> 8009a364:
>> $ powerpc-linux-gnu-addr2line -e vmlinux 0x8009a364
>> linux-2.6.29.6/arch/powerpc/include/asm/xenomai/bits/pod.h:172
>>
>> b911649c:
>> $ powerpc-linux-gnu-addr2line -e ../3rd_party/XM-Linux/rtnet_build/stack/rtnet.ko 0x249c
>> rtnet_build/stack/rtnet_rtpc.c:201
>>
>> 8000bf48:
>> $ powerpc-linux-gnu-addr2line -e vmlinux 0x8000bf48
>> linux-2.6.29.6/arch/powerpc/kernel/ipipe.c:429
>> (ipipe_trigger_irq(unsigned irq) at local_irq_restore_hw(flags);)
>>
>> -
>>
>> Notice the "r1" register in the first line i assume should point to a
>> back chain word, but the value is 0001 and the "link register" word
>> immediately after is b911f940 which points to:
>> # grep b911f940 /proc/kallsyms
>> b911f940 b pending_calls_lock [rtnet]
>>
>> I'm not sure of the significance of the stack frame after that one.
>>
> IIRC, you said that you are using rtnet-rtpc for a special use case.
> Given the fact that this interface is very well documented and highly
> intuitive to use ;), I wouldn't be too surprised if you ran into a race
> or an invalid use case.
>
> Is the rtpc-using code part of your tarball? If so, can you break it out
> and explain on it how you use rtpc?
>
> Jan
Re: [Xenomai-core] kernel threads crash
Sorry about that, yes i merged that in right away.

/Jesper

On 2011-04-12 15:39, Gilles Chanteperdrix wrote:
> Jesper Christensen wrote:
>> I have managed to print the stack of a faulting thread:
>>
> Did you test the patch I directed you to? We may be chasing an already
> known issue here...
Re: [Xenomai-core] kernel threads crash
On 2011-04-12 15:31, Jesper Christensen wrote:
> I have managed to print the stack of a faulting thread:
>
> Xenomai: suspending kernel thread b911f518 ('rtnet-rtpc') at nip=0xb911f940, lr=0xb911f940, r1=0xaf2c4580 after exception #1792
>
> Xenomai: dumping stack at af2c4600
> Xenomai: 0xaf2c45ec - 0xaf2c45fc: af2c4600 8009a334
> Xenomai: 0xaf2c45d8 - 0xaf2c45e8: 0
> Xenomai: 0xaf2c45c4 - 0xaf2c45d4: b911f518 0 af2c45f0 8009a364
> Xenomai: 0xaf2c45b0 - 0xaf2c45c0: 0
> Xenomai: 0xaf2c459c - 0xaf2c45ac: 0 b911f518 8009a334
> Xenomai: 0xaf2c4588 - 0xaf2c4598: b911f4e0 af2c45d0 b911649c
> Xenomai: 0xaf2c4574 - 0xaf2c4584: 805e3988 800 805a89f0 0001 b911f940
> Xenomai: 0xaf2c4560 - 0xaf2c4570: b911f940 2000 2222 b911ebb8 0700
> Xenomai: 0xaf2c454c - 0xaf2c455c: 805a89f0 b911f940 00029000 8009b2e4
> Xenomai: 0xaf2c4538 - 0xaf2c4548: b911ebb8 805e50c4 805e3988 805a89f0
> Xenomai: 0xaf2c4524 - 0xaf2c4534: 00100100 af2c4580 8000bf48
> Xenomai: 0xaf2c4510 - 0xaf2c4520: 0 00200200
> Xenomai: 0xaf2c44fc - 0xaf2c450c: 805e3988 2222
>
> Manually decoded link register words:
> -
> 8009a334:
> $ powerpc-linux-gnu-addr2line -e vmlinux 0x8009a334
> linux-2.6.29.6/arch/powerpc/include/asm/xenomai/bits/pod.h:168
>
> 8009a364:
> $ powerpc-linux-gnu-addr2line -e vmlinux 0x8009a364
> linux-2.6.29.6/arch/powerpc/include/asm/xenomai/bits/pod.h:172
>
> b911649c:
> $ powerpc-linux-gnu-addr2line -e ../3rd_party/XM-Linux/rtnet_build/stack/rtnet.ko 0x249c
> rtnet_build/stack/rtnet_rtpc.c:201
>
> 8000bf48:
> $ powerpc-linux-gnu-addr2line -e vmlinux 0x8000bf48
> linux-2.6.29.6/arch/powerpc/kernel/ipipe.c:429
> (ipipe_trigger_irq(unsigned irq) at local_irq_restore_hw(flags);)
>
> -
>
> Notice the "r1" register in the first line i assume should point to a
> back chain word, but the value is 0001 and the "link register" word
> immediately after is b911f940 which points to:
> # grep b911f940 /proc/kallsyms
> b911f940 b pending_calls_lock [rtnet]
>
> I'm not sure of the significance of the stack frame after that one.
>

IIRC, you said that you are using rtnet-rtpc for a special use case.
Given the fact that this interface is very well documented and highly
intuitive to use ;), I wouldn't be too surprised if you ran into a race
or an invalid use case.

Is the rtpc-using code part of your tarball? If so, can you break it out
and explain on it how you use rtpc?

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: [Xenomai-core] kernel threads crash
Jesper Christensen wrote:
> I have managed to print the stack of a faulting thread:

Did you test the patch I directed you to? We may be chasing an already
known issue here...

--
Gilles.
Re: [Xenomai-core] kernel threads crash
I have managed to print the stack of a faulting thread:

Xenomai: suspending kernel thread b911f518 ('rtnet-rtpc') at nip=0xb911f940, lr=0xb911f940, r1=0xaf2c4580 after exception #1792

Xenomai: dumping stack at af2c4600
Xenomai: 0xaf2c45ec - 0xaf2c45fc: af2c4600 8009a334
Xenomai: 0xaf2c45d8 - 0xaf2c45e8: 0
Xenomai: 0xaf2c45c4 - 0xaf2c45d4: b911f518 0 af2c45f0 8009a364
Xenomai: 0xaf2c45b0 - 0xaf2c45c0: 0
Xenomai: 0xaf2c459c - 0xaf2c45ac: 0 b911f518 8009a334
Xenomai: 0xaf2c4588 - 0xaf2c4598: b911f4e0 af2c45d0 b911649c
Xenomai: 0xaf2c4574 - 0xaf2c4584: 805e3988 800 805a89f0 0001 b911f940
Xenomai: 0xaf2c4560 - 0xaf2c4570: b911f940 2000 2222 b911ebb8 0700
Xenomai: 0xaf2c454c - 0xaf2c455c: 805a89f0 b911f940 00029000 8009b2e4
Xenomai: 0xaf2c4538 - 0xaf2c4548: b911ebb8 805e50c4 805e3988 805a89f0
Xenomai: 0xaf2c4524 - 0xaf2c4534: 00100100 af2c4580 8000bf48
Xenomai: 0xaf2c4510 - 0xaf2c4520: 0 00200200
Xenomai: 0xaf2c44fc - 0xaf2c450c: 805e3988 2222

Manually decoded link register words:
-
8009a334:
$ powerpc-linux-gnu-addr2line -e vmlinux 0x8009a334
linux-2.6.29.6/arch/powerpc/include/asm/xenomai/bits/pod.h:168

8009a364:
$ powerpc-linux-gnu-addr2line -e vmlinux 0x8009a364
linux-2.6.29.6/arch/powerpc/include/asm/xenomai/bits/pod.h:172

b911649c:
$ powerpc-linux-gnu-addr2line -e ../3rd_party/XM-Linux/rtnet_build/stack/rtnet.ko 0x249c
rtnet_build/stack/rtnet_rtpc.c:201

8000bf48:
$ powerpc-linux-gnu-addr2line -e vmlinux 0x8000bf48
linux-2.6.29.6/arch/powerpc/kernel/ipipe.c:429
(ipipe_trigger_irq(unsigned irq) at local_irq_restore_hw(flags);)

-

Notice the "r1" register in the first line i assume should point to a
back chain word, but the value is 0001 and the "link register" word
immediately after is b911f940 which points to:
# grep b911f940 /proc/kallsyms
b911f940 b pending_calls_lock [rtnet]

I'm not sure of the significance of the stack frame after that one.

/Jesper

On 2011-04-11 17:31, Jesper Christensen wrote:
> hmm...
> > Xenomai: suspending kernel thread b911f518 ('rtnet-rtpc') at > nip=0x1088860, lr=0x1088862 after exception #1025 > > LR points to nowhere...Maybe i should do a hexdump of the stack and > manually decode it. > > /Jesper > > > On 2011-04-11 16:49, Jesper Christensen wrote: > >> I'll just give them a run and see, thanks! >> >> /Jesper >> >> >> On 2011-04-11 16:39, Philippe Gerum wrote: >> >> >>> On Mon, 2011-04-11 at 16:32 +0200, Jesper Christensen wrote: >>> >>> >>> How do i see that? >>> diff --git a/include/asm-powerpc/system.h b/include/asm-powerpc/system.h >>> index 5cc4a23..8dbc537 100644 >>> --- a/include/asm-powerpc/system.h >>> +++ b/include/asm-powerpc/system.h >>> @@ -104,7 +104,7 @@ typedef struct xnarch_fltinfo { >>> #define xnarch_fault_trap(fi) ((unsigned int)(fi)->regs->trap) >>> #define xnarch_fault_code(fi) ((fi)->regs->dar) >>> #define xnarch_fault_pc(fi) ((fi)->regs->nip) >>> -#define xnarch_fault_pc(fi) ((fi)->regs->nip) >>> +#define xnarch_fault_lr(fi) ((fi)->regs->link) >>> /* FIXME: FPU faults ignored by the nanokernel on PPC. 
*/ >>> #define xnarch_fault_fpu_p(fi) (0) >>> /* The following predicates are only usable over a regular Linux stack >>> diff --git a/ksrc/nucleus/pod.c b/ksrc/nucleus/pod.c >>> index b5ddbaa..c1722e7 100644 >>> --- a/ksrc/nucleus/pod.c >>> +++ b/ksrc/nucleus/pod.c >>> @@ -2591,8 +2591,8 @@ int xnpod_trap_fault(xnarch_fltinfo_t *fltinfo) >>> >>> if (!xnpod_userspace_p()) { >>> xnprintf >>> - ("suspending kernel thread %p ('%s') at 0x%lx after >>> exception #%u\n", >>> -thread, thread->name, xnarch_fault_pc(fltinfo), >>> + ("suspending kernel thread %p ('%s') at nip=0x%lx, lr=0x%lx >>> after exception #%u\n", >>> +thread, thread->name, xnarch_fault_pc(fltinfo), >>> xnarch_fault_lr(fltinfo), >>> xnarch_fault_trap(fltinfo)); >>> >>> xnpod_suspend_thread(thread, XNSUSP, XN_INFINITE, XN_RELATIVE, >>> NULL); >>> >>> >>> /Jesper On 2011-04-11 16:27, Philippe Gerum wrote: > On Mon, 2011-04-11 at 16:20 +0200, Jesper Christensen wrote: > > > > >> Problem is the NIP in question is the address of the thread structure as >> seen in the error message. >> >>
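The manual decoding described in this message can be sketched as a small back-chain walker. This is only an illustration, assuming the 32-bit PowerPC SysV frame layout (back chain word at offset 0 of a frame, LR save word at offset 4 of the caller's frame). The names struct stack_dump, dump_read() and walk_back_chain() are hypothetical and not part of Xenomai or the kernel, and the sample data in the usage note is synthetic because the dump in this archive has lost some of its words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Snapshot of a stack region: words[i] is the 32-bit value at base + 4*i. */
struct stack_dump {
    uint32_t base;          /* lowest address covered by the dump */
    const uint32_t *words;  /* dumped words, ascending addresses */
    size_t nwords;
};

/* Read one word from the dump; 0 on success, -1 if addr is unusable. */
static int dump_read(const struct stack_dump *d, uint32_t addr, uint32_t *out)
{
    size_t idx;

    assert(d != NULL && out != NULL);
    if (addr < d->base || (addr & 3u) != 0)
        return -1;
    idx = (addr - d->base) / 4;
    if (idx >= d->nwords)
        return -1;
    *out = d->words[idx];
    return 0;
}

/*
 * Follow the back chain from sp. For every plausible frame, store the LR
 * save word found at (back chain + 4) in lr_out[]. Stops at the first back
 * chain that is unaligned, does not point upwards (the stack grows down),
 * or leaves the dumped region. Returns the number of words collected.
 */
static size_t walk_back_chain(const struct stack_dump *d, uint32_t sp,
                              uint32_t *lr_out, size_t max)
{
    size_t n = 0;

    while (n < max) {
        uint32_t back, lr;

        if (dump_read(d, sp, &back) != 0)
            break;
        if (back <= sp || (back & 3u) != 0)
            break;
        if (dump_read(d, back + 4, &lr) != 0)
            break;
        lr_out[n++] = lr;
        sp = back;
    }
    return n;
}
```

Fed with a frame whose back chain word is 1, as reported for r1=0xaf2c4580 above, the walker collects nothing, which is one way of stating that the frame at r1 does not look like a valid frame. The collected LR words would then go through addr2line as shown in the message.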
Re: [Xenomai-core] kernel threads crash
hmm... Xenomai: suspending kernel thread b911f518 ('rtnet-rtpc') at nip=0x1088860, lr=0x1088862 after exception #1025 LR points to nowhere...Maybe i should do a hexdump of the stack and manually decode it. /Jesper On 2011-04-11 16:49, Jesper Christensen wrote: > I'll just give them a run and see, thanks! > > /Jesper > > > On 2011-04-11 16:39, Philippe Gerum wrote: > >> On Mon, 2011-04-11 at 16:32 +0200, Jesper Christensen wrote: >> >> >>> How do i see that? >>> >>> >>> >> diff --git a/include/asm-powerpc/system.h b/include/asm-powerpc/system.h >> index 5cc4a23..8dbc537 100644 >> --- a/include/asm-powerpc/system.h >> +++ b/include/asm-powerpc/system.h >> @@ -104,7 +104,7 @@ typedef struct xnarch_fltinfo { >> #define xnarch_fault_trap(fi) ((unsigned int)(fi)->regs->trap) >> #define xnarch_fault_code(fi) ((fi)->regs->dar) >> #define xnarch_fault_pc(fi) ((fi)->regs->nip) >> -#define xnarch_fault_pc(fi) ((fi)->regs->nip) >> +#define xnarch_fault_lr(fi) ((fi)->regs->link) >> /* FIXME: FPU faults ignored by the nanokernel on PPC. 
*/ >> #define xnarch_fault_fpu_p(fi) (0) >> /* The following predicates are only usable over a regular Linux stack >> diff --git a/ksrc/nucleus/pod.c b/ksrc/nucleus/pod.c >> index b5ddbaa..c1722e7 100644 >> --- a/ksrc/nucleus/pod.c >> +++ b/ksrc/nucleus/pod.c >> @@ -2591,8 +2591,8 @@ int xnpod_trap_fault(xnarch_fltinfo_t *fltinfo) >> >> if (!xnpod_userspace_p()) { >> xnprintf >> -("suspending kernel thread %p ('%s') at 0x%lx after >> exception #%u\n", >> - thread, thread->name, xnarch_fault_pc(fltinfo), >> +("suspending kernel thread %p ('%s') at nip=0x%lx, lr=0x%lx >> after exception #%u\n", >> + thread, thread->name, xnarch_fault_pc(fltinfo), >> xnarch_fault_lr(fltinfo), >> xnarch_fault_trap(fltinfo)); >> >> xnpod_suspend_thread(thread, XNSUSP, XN_INFINITE, XN_RELATIVE, >> NULL); >> >> >>> /Jesper >>> >>> >>> On 2011-04-11 16:27, Philippe Gerum wrote: >>> >>> On Mon, 2011-04-11 at 16:20 +0200, Jesper Christensen wrote: > Problem is the NIP in question is the address of the thread structure as > seen in the error message. > > > LR? > /Jesper > > > On 2011-04-11 16:18, Philippe Gerum wrote: > > > >> On Mon, 2011-04-11 at 16:13 +0200, Jesper Christensen wrote: >> >> >> >> >>> I have updated to xenomai 2.5.6, but i'm still seeing exceptions >>> (considerably less often though): >>> >>> Xenomai: suspending kernel thread b92a39d0 ('tt_upgw_0') at 0xb92a39d0 >>> after exception #1792 >>> >>> >>> >>> >> You should build your code statically into the kernel, not as a module, >> and find out which code raises the MCE. >> >> CONFIG_DEBUG_INFO=y, then objdump -dl vmlinux, looking for the NIP >> mentioned. >> >> >> >> >> >>> /Jesper >>> >>> >>> On 2011-04-08 15:12, Philippe Gerum wrote: >>> >>> >>> >>> On Fri, 2011-04-08 at 14:58 +0200, Jesper Christensen wrote: > Hi > > I'm trying to implement some gateway functionality in the kernel on a > emerson CPCI6200 board, but have run into some strange errors. The > kernel module is made up of two threads that run every 1 ms. 
I have > also > made use of the rtpc dispatcher in rtnet to dispatch control messages > from a netlink socket to the RT part of my kernel module. > > The problem is that when loaded the threads get suspended due to > exceptions: > > Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0xb929cbc0 > after exception #1792 > > or > > Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0x0 after > exception #1025 > > or > > Xenomai: suspending kernel thread b911f518 ('rtnet-rtpc') at > 0xb911f940 > after exception #1792 > > > I have ported the "gianfar" driver from linux to rtnet. > > The versions and hardware are listed below. The errors are most likely > due to faulty software on my part, but i would like to ask if there > are > any known issues with the versions or hardware i'm using. I would also > like to ask if there are any ways of further debugging the errors as i > am not getting very far wi
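On reading the exception numbers in these messages: per the patch quoted in this thread, the value printed is regs->trap in decimal. A minimal decoding sketch follows, assuming the usual 32-bit powerpc kernel convention that the low four bits of regs->trap are flags and the rest is the exception vector offset. The macro and function names are hypothetical, and the table only lists a few vectors I am reasonably sure of.

```c
#include <assert.h>
#include <string.h>

/* Assumed convention: low 4 bits of regs->trap are flags, the rest is the vector. */
#define PPC_TRAP_VECTOR(trap)  ((trap) & ~0xFu)
#define PPC_TRAP_FLAGS(trap)   ((trap) & 0xFu)

/* Map a raw trap value, as printed by the Xenomai message, to a short name. */
static const char *ppc_trap_name(unsigned int trap)
{
    switch (PPC_TRAP_VECTOR(trap)) {
    case 0x200: return "machine check";
    case 0x300: return "data storage";
    case 0x400: return "instruction storage";
    case 0x600: return "alignment";
    case 0x700: return "program check";
    case 0x800: return "FP unavailable";
    default:    return "other";
    }
}
```

Under that assumption #1792 is 0x700 (program check) and #1025 is 0x401, that is vector 0x400 (instruction storage) with a flag bit set. Both would be consistent with the CPU trying to fetch instructions from an address that does not hold code, such as 0x0 or the data symbol reported for nip.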
Re: [Xenomai-core] kernel threads crash
I'll just give them a run and see, thanks! /Jesper On 2011-04-11 16:39, Philippe Gerum wrote: > On Mon, 2011-04-11 at 16:32 +0200, Jesper Christensen wrote: > >> How do i see that? >> >> > diff --git a/include/asm-powerpc/system.h b/include/asm-powerpc/system.h > index 5cc4a23..8dbc537 100644 > --- a/include/asm-powerpc/system.h > +++ b/include/asm-powerpc/system.h > @@ -104,7 +104,7 @@ typedef struct xnarch_fltinfo { > #define xnarch_fault_trap(fi) ((unsigned int)(fi)->regs->trap) > #define xnarch_fault_code(fi) ((fi)->regs->dar) > #define xnarch_fault_pc(fi) ((fi)->regs->nip) > -#define xnarch_fault_pc(fi) ((fi)->regs->nip) > +#define xnarch_fault_lr(fi) ((fi)->regs->link) > /* FIXME: FPU faults ignored by the nanokernel on PPC. */ > #define xnarch_fault_fpu_p(fi) (0) > /* The following predicates are only usable over a regular Linux stack > diff --git a/ksrc/nucleus/pod.c b/ksrc/nucleus/pod.c > index b5ddbaa..c1722e7 100644 > --- a/ksrc/nucleus/pod.c > +++ b/ksrc/nucleus/pod.c > @@ -2591,8 +2591,8 @@ int xnpod_trap_fault(xnarch_fltinfo_t *fltinfo) > > if (!xnpod_userspace_p()) { > xnprintf > - ("suspending kernel thread %p ('%s') at 0x%lx after > exception #%u\n", > - thread, thread->name, xnarch_fault_pc(fltinfo), > + ("suspending kernel thread %p ('%s') at nip=0x%lx, lr=0x%lx > after exception #%u\n", > + thread, thread->name, xnarch_fault_pc(fltinfo), > xnarch_fault_lr(fltinfo), >xnarch_fault_trap(fltinfo)); > > xnpod_suspend_thread(thread, XNSUSP, XN_INFINITE, XN_RELATIVE, > NULL); > >> /Jesper >> >> >> On 2011-04-11 16:27, Philippe Gerum wrote: >> >>> On Mon, 2011-04-11 at 16:20 +0200, Jesper Christensen wrote: >>> >>> Problem is the NIP in question is the address of the thread structure as seen in the error message. >>> LR? 
>>> >>> >>> /Jesper On 2011-04-11 16:18, Philippe Gerum wrote: > On Mon, 2011-04-11 at 16:13 +0200, Jesper Christensen wrote: > > > >> I have updated to xenomai 2.5.6, but i'm still seeing exceptions >> (considerably less often though): >> >> Xenomai: suspending kernel thread b92a39d0 ('tt_upgw_0') at 0xb92a39d0 >> after exception #1792 >> >> >> > You should build your code statically into the kernel, not as a module, > and find out which code raises the MCE. > > CONFIG_DEBUG_INFO=y, then objdump -dl vmlinux, looking for the NIP > mentioned. > > > > >> /Jesper >> >> >> On 2011-04-08 15:12, Philippe Gerum wrote: >> >> >> >>> On Fri, 2011-04-08 at 14:58 +0200, Jesper Christensen wrote: >>> >>> >>> >>> Hi I'm trying to implement some gateway functionality in the kernel on a emerson CPCI6200 board, but have run into some strange errors. The kernel module is made up of two threads that run every 1 ms. I have also made use of the rtpc dispatcher in rtnet to dispatch control messages from a netlink socket to the RT part of my kernel module. The problem is that when loaded the threads get suspended due to exceptions: Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0xb929cbc0 after exception #1792 or Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0x0 after exception #1025 or Xenomai: suspending kernel thread b911f518 ('rtnet-rtpc') at 0xb911f940 after exception #1792 I have ported the "gianfar" driver from linux to rtnet. The versions and hardware are listed below. The errors are most likely due to faulty software on my part, but i would like to ask if there are any known issues with the versions or hardware i'm using. I would also like to ask if there are any ways of further debugging the errors as i am not getting very far with the above messages. >>> A severe bug at kthread init was fixed in the 2.5.5.2 - 2.5.6 timeframe, >>> which would cause exactly the kind of weird behavior you are seeing >>> right now. 
The bug triggered random code execution due to stack memory >>> pollution at init on powerpc for Xenomai kthreads: >>> http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=90699565cbce41f2cec193d57857bb5817efc19a >>> http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=da20c20d4b4d892d40c657ad1d32ddb6d0ceb47c >>> http://git.xenomai.or
Re: [Xenomai-core] kernel threads crash
On Mon, 2011-04-11 at 16:32 +0200, Jesper Christensen wrote:
> How do i see that?
>

diff --git a/include/asm-powerpc/system.h b/include/asm-powerpc/system.h
index 5cc4a23..8dbc537 100644
--- a/include/asm-powerpc/system.h
+++ b/include/asm-powerpc/system.h
@@ -104,7 +104,7 @@ typedef struct xnarch_fltinfo {
 #define xnarch_fault_trap(fi)   ((unsigned int)(fi)->regs->trap)
 #define xnarch_fault_code(fi)   ((fi)->regs->dar)
 #define xnarch_fault_pc(fi)     ((fi)->regs->nip)
-#define xnarch_fault_pc(fi)     ((fi)->regs->nip)
+#define xnarch_fault_lr(fi)     ((fi)->regs->link)
 /* FIXME: FPU faults ignored by the nanokernel on PPC. */
 #define xnarch_fault_fpu_p(fi)  (0)
 /* The following predicates are only usable over a regular Linux stack
diff --git a/ksrc/nucleus/pod.c b/ksrc/nucleus/pod.c
index b5ddbaa..c1722e7 100644
--- a/ksrc/nucleus/pod.c
+++ b/ksrc/nucleus/pod.c
@@ -2591,8 +2591,8 @@ int xnpod_trap_fault(xnarch_fltinfo_t *fltinfo)
 
         if (!xnpod_userspace_p()) {
                 xnprintf
-                    ("suspending kernel thread %p ('%s') at 0x%lx after exception #%u\n",
-                     thread, thread->name, xnarch_fault_pc(fltinfo),
+                    ("suspending kernel thread %p ('%s') at nip=0x%lx, lr=0x%lx after exception #%u\n",
+                     thread, thread->name, xnarch_fault_pc(fltinfo), xnarch_fault_lr(fltinfo),
                      xnarch_fault_trap(fltinfo));
 
                 xnpod_suspend_thread(thread, XNSUSP, XN_INFINITE, XN_RELATIVE, NULL);

> /Jesper
>
>
> On 2011-04-11 16:27, Philippe Gerum wrote:
> > On Mon, 2011-04-11 at 16:20 +0200, Jesper Christensen wrote:
> >
> >> Problem is the NIP in question is the address of the thread structure as
> >> seen in the error message.
> >>
> > LR?
> > > > > >> /Jesper > >> > >> > >> On 2011-04-11 16:18, Philippe Gerum wrote: > >> > >>> On Mon, 2011-04-11 at 16:13 +0200, Jesper Christensen wrote: > >>> > >>> > I have updated to xenomai 2.5.6, but i'm still seeing exceptions > (considerably less often though): > > Xenomai: suspending kernel thread b92a39d0 ('tt_upgw_0') at 0xb92a39d0 > after exception #1792 > > > >>> You should build your code statically into the kernel, not as a module, > >>> and find out which code raises the MCE. > >>> > >>> CONFIG_DEBUG_INFO=y, then objdump -dl vmlinux, looking for the NIP > >>> mentioned. > >>> > >>> > >>> > /Jesper > > > On 2011-04-08 15:12, Philippe Gerum wrote: > > > > On Fri, 2011-04-08 at 14:58 +0200, Jesper Christensen wrote: > > > > > > > >> Hi > >> > >> I'm trying to implement some gateway functionality in the kernel on a > >> emerson CPCI6200 board, but have run into some strange errors. The > >> kernel module is made up of two threads that run every 1 ms. I have > >> also > >> made use of the rtpc dispatcher in rtnet to dispatch control messages > >> from a netlink socket to the RT part of my kernel module. > >> > >> The problem is that when loaded the threads get suspended due to > >> exceptions: > >> > >> Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0xb929cbc0 > >> after exception #1792 > >> > >> or > >> > >> Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0x0 after > >> exception #1025 > >> > >> or > >> > >> Xenomai: suspending kernel thread b911f518 ('rtnet-rtpc') at 0xb911f940 > >> after exception #1792 > >> > >> > >> I have ported the "gianfar" driver from linux to rtnet. > >> > >> The versions and hardware are listed below. The errors are most likely > >> due to faulty software on my part, but i would like to ask if there are > >> any known issues with the versions or hardware i'm using. I would also > >> like to ask if there are any ways of further debugging the errors as i > >> am not getting very far with the above messages. 
> >> > >> > >> > > A severe bug at kthread init was fixed in the 2.5.5.2 - 2.5.6 timeframe, > > which would cause exactly the kind of weird behavior you are seeing > > right now. The bug triggered random code execution due to stack memory > > pollution at init on powerpc for Xenomai kthreads: > > http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=90699565cbce41f2cec193d57857bb5817efc19a > > http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=da20c20d4b4d892d40c657ad1d32ddb6d0ceb47c > > http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=a5886b354dc18f054b187b58cfbacfb60bccaf47 > > > > You need at the very least those three patches (from the top of my > > head), but it would be much better to upgrade to 2.5.6. > > > > > > > > > >> System info: > >> > >> Linux kernel: 2.6.29.6 > >> i-pi
Re: [Xenomai-core] kernel threads crash
Jesper Christensen wrote:
> I have updated to xenomai 2.5.6, but i'm still seeing exceptions
> (considerably less often though):
>
> Xenomai: suspending kernel thread b92a39d0 ('tt_upgw_0') at 0xb92a39d0
> after exception #1792

There was an alignment issue with rtnet on ARM some time ago, which was
solved by the following patch:
http://rtnet.git.sourceforge.net/git/gitweb.cgi?p=rtnet/rtnet;a=commit;h=1b38434c6137d3b4a708e00d8fef6a4b422c6593

Maybe it is related?

Regards.

--
Gilles.
Re: [Xenomai-core] kernel threads crash
Only during init:

# cat /proc/xenomai/sched
CPU  PID    CLASS  PRI      TIMEOUT     TIMEBASE   STAT       NAME
  0  0      idle    -1      -           master     R          ROOT/0
  1  0      idle    -1      -           master     R          ROOT/1
  0  0      rt      98      -           master     W          rtnet-stack
  0  0      rt       0      -           master     W          rtnet-rtpc
  0  0      rt      99      -           master     S          tt_upgw_0
  1  0      rt      99      27us        master     D          tt_upgw_1
  0  720    rt       0      -           master     X          upgmu
  0  724    rt       0      979ms599us  master     D          upgmu

/Jesper

On 2011-04-11 16:34, Philippe Gerum wrote:
> On Mon, 2011-04-11 at 16:20 +0200, Jesper Christensen wrote:
>> Problem is the NIP in question is the address of the thread structure as
>> seen in the error message.
>>
> Is your code spawning -rt kernel threads frequently/periodically, or
> only when the application initializes?
>
>> /Jesper
>>
>> On 2011-04-11 16:18, Philippe Gerum wrote:
>>> On Mon, 2011-04-11 at 16:13 +0200, Jesper Christensen wrote:
>>>> I have updated to xenomai 2.5.6, but i'm still seeing exceptions
>>>> (considerably less often though):
>>>>
>>>> Xenomai: suspending kernel thread b92a39d0 ('tt_upgw_0') at 0xb92a39d0
>>>> after exception #1792
>>>>
>>> You should build your code statically into the kernel, not as a module,
>>> and find out which code raises the MCE.
>>>
>>> CONFIG_DEBUG_INFO=y, then objdump -dl vmlinux, looking for the NIP
>>> mentioned.
>>>
>>>> /Jesper
>>>>
>>>> On 2011-04-08 15:12, Philippe Gerum wrote:
>>>>> On Fri, 2011-04-08 at 14:58 +0200, Jesper Christensen wrote:
>>>>>> Hi
>>>>>>
>>>>>> I'm trying to implement some gateway functionality in the kernel on a
>>>>>> emerson CPCI6200 board, but have run into some strange errors. The
>>>>>> kernel module is made up of two threads that run every 1 ms. I have also
>>>>>> made use of the rtpc dispatcher in rtnet to dispatch control messages
>>>>>> from a netlink socket to the RT part of my kernel module.
>>>>>>
>>>>>> The problem is that when loaded the threads get suspended due to
>>>>>> exceptions:
>>>>>>
>>>>>> Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0xb929cbc0
>>>>>> after exception #1792
>>>>>>
>>>>>> or
>>>>>>
>>>>>> Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0x0 after
>>>>>> exception #1025
>>>>>>
>>>>>> or
>>>>>>
>>>>>> Xenomai: suspending kernel thread b911f518 ('rtnet-rtpc') at 0xb911f940
>>>>>> after exception #1792
>>>>>>
>>>>>> I have ported the "gianfar" driver from linux to rtnet.
>>>>>>
>>>>>> The versions and hardware are listed below. The errors are most likely
>>>>>> due to faulty software on my part, but i would like to ask if there are
>>>>>> any known issues with the versions or hardware i'm using. I would also
>>>>>> like to ask if there are any ways of further debugging the errors as i
>>>>>> am not getting very far with the above messages.
>>>>>>
>>>>> A severe bug at kthread init was fixed in the 2.5.5.2 - 2.5.6 timeframe,
>>>>> which would cause exactly the kind of weird behavior you are seeing
>>>>> right now. The bug triggered random code execution due to stack memory
>>>>> pollution at init on powerpc for Xenomai kthreads:
>>>>> http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=90699565cbce41f2cec193d57857bb5817efc19a
>>>>> http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=da20c20d4b4d892d40c657ad1d32ddb6d0ceb47c
>>>>> http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=a5886b354dc18f054b187b58cfbacfb60bccaf47
>>>>>
>>>>> You need at the very least those three patches (from the top of my
>>>>> head), but it would be much better to upgrade to 2.5.6.
>>>>>
>>>>>> System info:
>>>>>>
>>>>>> Linux kernel: 2.6.29.6
>>>>>> i-pipe version: 2.7-04
>>>>>> processor: powerpc mpc8572
>>>>>> xenomai version: 2.5.3
>>>>>> rtnet version: 0.9.12
Re: [Xenomai-core] kernel threads crash
On Mon, 2011-04-11 at 16:20 +0200, Jesper Christensen wrote:
> Problem is the NIP in question is the address of the thread structure as
> seen in the error message.

Is your code spawning -rt kernel threads frequently/periodically, or
only when the application initializes?

--
Philippe.
Re: [Xenomai-core] kernel threads crash
How do i see that?

/Jesper

On 2011-04-11 16:27, Philippe Gerum wrote:
> On Mon, 2011-04-11 at 16:20 +0200, Jesper Christensen wrote:
>> Problem is the NIP in question is the address of the thread structure as
>> seen in the error message.
>
> LR?
Re: [Xenomai-core] kernel threads crash
On Mon, 2011-04-11 at 16:20 +0200, Jesper Christensen wrote:
> Problem is the NIP in question is the address of the thread structure as
> seen in the error message.

LR?

--
Philippe.
Re: [Xenomai-core] kernel threads crash
On Mon, 2011-04-11 at 16:18 +0200, Philippe Gerum wrote:
> On Mon, 2011-04-11 at 16:13 +0200, Jesper Christensen wrote:
> > I have updated to xenomai 2.5.6, but i'm still seeing exceptions
> > (considerably less often though):
> >
> > Xenomai: suspending kernel thread b92a39d0 ('tt_upgw_0') at 0xb92a39d0
> > after exception #1792
>
> You should build your code statically into the kernel, not as a module,
> and find out which code raises the MCE.

It's a program check exception, not a machine check, but the rest
remains applicable.

> CONFIG_DEBUG_INFO=y, then objdump -dl vmlinux, looking for the NIP
> mentioned.

--
Philippe.
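Note on the exception numbers: the values in the "suspending kernel thread" lines are printed in decimal. A minimal sketch of decoding them, assuming they are the PowerPC trap values as Linux stores them (vector offset in the upper bits, flag bits in the low nibble). The table and the helper name decode_trap are illustrative, not taken from the Xenomai sources.

```python
# Sketch: decode the decimal "exception #N" values from the Xenomai log lines.
# Assumption: N is the PowerPC trap value, where the low nibble carries flag
# bits and the remaining bits are the exception vector offset.

# Subset of the usual PowerPC vector offsets, for illustration only.
VECTORS = {
    0x200: "machine check",
    0x300: "data access",
    0x400: "instruction access",
    0x600: "alignment",
    0x700: "program check",
    0x800: "FP unavailable",
}


def decode_trap(number):
    """Return (vector, name) for a decimal exception number."""
    vector = number & ~0xF  # drop the low flag bits
    return vector, VECTORS.get(vector, "unknown")


for n in (1792, 1025):
    vector, name = decode_trap(n)
    print(f"#{n} = {n:#x} -> vector {vector:#x} ({name})")
```

Under these assumptions #1792 is 0x700, the program check mentioned above, and #1025 is 0x401, which would read as an instruction access fault and would be consistent with the "at 0x0" in that log line.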
Re: [Xenomai-core] kernel threads crash
Problem is the NIP in question is the address of the thread structure as
seen in the error message.

/Jesper

On 2011-04-11 16:18, Philippe Gerum wrote:
> On Mon, 2011-04-11 at 16:13 +0200, Jesper Christensen wrote:
>> I have updated to xenomai 2.5.6, but i'm still seeing exceptions
>> (considerably less often though):
>>
>> Xenomai: suspending kernel thread b92a39d0 ('tt_upgw_0') at 0xb92a39d0
>> after exception #1792
>
> You should build your code statically into the kernel, not as a module,
> and find out which code raises the MCE.
>
> CONFIG_DEBUG_INFO=y, then objdump -dl vmlinux, looking for the NIP
> mentioned.
Re: [Xenomai-core] kernel threads crash
On Mon, 2011-04-11 at 16:13 +0200, Jesper Christensen wrote:
> I have updated to xenomai 2.5.6, but i'm still seeing exceptions
> (considerably less often though):
>
> Xenomai: suspending kernel thread b92a39d0 ('tt_upgw_0') at 0xb92a39d0
> after exception #1792

You should build your code statically into the kernel, not as a module,
and find out which code raises the MCE.

CONFIG_DEBUG_INFO=y, then objdump -dl vmlinux, looking for the NIP
mentioned.

--
Philippe.
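Note on looking up the NIP: besides reading the objdump -dl listing, an address can be matched against a symbol table in the "address type name" format used by System.map and nm. The following is only a sketch of that lookup; the symbol names and addresses in SAMPLE are hypothetical, and load_symbols/lookup are illustrative helpers, not part of any existing tool.

```python
import bisect


def load_symbols(lines):
    """Parse 'address type name' lines (System.map / nm style)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _kind, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def lookup(symbols, address):
    """Return (name, offset) of the closest symbol at or below address, or None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, address) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, address - base


# Hypothetical sample; real input would be the System.map built with vmlinux.
SAMPLE = """\
c0008000 T example_start
c0008100 T example_handler
c0008400 T example_end
""".splitlines()

syms = load_symbols(SAMPLE)
print(lookup(syms, 0xC0008124))  # an address inside example_handler
print(lookup(syms, 0xB92A39D0))  # the reported NIP, below every sample symbol
```

With this sample table the reported NIP 0xb92a39d0 matches no symbol at all, which is the situation described in the thread: the NIP is a data address (the thread structure), so a text-symbol lookup cannot name the faulting code and another register such as LR has to be examined instead.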
Re: [Xenomai-core] kernel threads crash
I have updated to xenomai 2.5.6, but i'm still seeing exceptions
(considerably less often though):

Xenomai: suspending kernel thread b92a39d0 ('tt_upgw_0') at 0xb92a39d0
after exception #1792

/Jesper

On 2011-04-08 15:12, Philippe Gerum wrote:
> On Fri, 2011-04-08 at 14:58 +0200, Jesper Christensen wrote:
>> Hi
>>
>> I'm trying to implement some gateway functionality in the kernel on a
>> emerson CPCI6200 board, but have run into some strange errors. The
>> kernel module is made up of two threads that run every 1 ms. I have also
>> made use of the rtpc dispatcher in rtnet to dispatch control messages
>> from a netlink socket to the RT part of my kernel module.
>>
>> The problem is that when loaded the threads get suspended due to exceptions:
>>
>> Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0xb929cbc0
>> after exception #1792
>>
>> or
>>
>> Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0x0 after
>> exception #1025
>>
>> or
>>
>> Xenomai: suspending kernel thread b911f518 ('rtnet-rtpc') at 0xb911f940
>> after exception #1792
>>
>>
>> I have ported the "gianfar" driver from linux to rtnet.
>>
>> The versions and hardware are listed below. The errors are most likely
>> due to faulty software on my part, but i would like to ask if there are
>> any known issues with the versions or hardware i'm using. I would also
>> like to ask if there are any ways of further debugging the errors as i
>> am not getting very far with the above messages.
>>
> A severe bug at kthread init was fixed in the 2.5.5.2 - 2.5.6 timeframe,
> which would cause exactly the kind of weird behavior you are seeing
> right now. The bug triggered random code execution due to stack memory
> pollution at init on powerpc for Xenomai kthreads:
> http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=90699565cbce41f2cec193d57857bb5817efc19a
> http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=da20c20d4b4d892d40c657ad1d32ddb6d0ceb47c
> http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=a5886b354dc18f054b187b58cfbacfb60bccaf47
>
> You need at the very least those three patches (from the top of my
> head), but it would be much better to upgrade to 2.5.6.
>
>> System info:
>>
>> Linux kernel: 2.6.29.6
>> i-pipe version: 2.7-04
>> processor: powerpc mpc8572
>> xenomai version: 2.5.3
>> rtnet version: 0.9.12
Re: [Xenomai-core] kernel threads crash
Yeah we've been stupid enough not to use the version from the git
repository, but i'll see what i can do over the next few days (snowed
under at work :( ). I should be able to do some diffs against the git
repository.

/Jesper

On 2011-04-11 11:18, Jan Kiszka wrote:
> On 2011-04-11 08:59, Jesper Christensen wrote:
>> There you go.
>
> Even better would be individual patches (one for each topic) so that we
> can more easily review/merge your improvements into RTnet. If you are
> not familiar with this process, maybe you can convince Richard to do the
> break-up... :)
>
> Also, please follow up on RTnet-list(s) about RTnet topics.
>
> TIA,
> Jan
Re: [Xenomai-core] kernel threads crash
On 2011-04-11 08:59, Jesper Christensen wrote:
> There you go.

Even better would be individual patches (one for each topic) so that we
can more easily review/merge your improvements into RTnet. If you are
not familiar with this process, maybe you can convince Richard to do the
break-up... :)

Also, please follow up on RTnet-list(s) about RTnet topics.

TIA,
Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: [Xenomai-core] kernel threads crash
On Mon, Apr 11, 2011 at 08:52:04AM +0200, Jesper Christensen wrote:
> I have made a number of changes to rtnet (scatter gather support, icmp
> fixes etc.) so maybe i could send you a tar ball of the entire directory?

Yes, please do.

thanks,
Richard
Re: [Xenomai-core] kernel threads crash
I have made a number of changes to rtnet (scatter gather support, icmp
fixes etc.) so maybe i could send you a tar ball of the entire directory?

/Jesper

On 2011-04-08 21:15, Richard Cochran wrote:
> On Fri, Apr 08, 2011 at 02:58:33PM +0200, Jesper Christensen wrote:
>> I have ported the "gianfar" driver from linux to rtnet.
>
> Can you publish this driver? I have been wanting to make a rtnet
> gianfar myself.
>
> Thanks,
>
> Richard
Re: [Xenomai-core] kernel threads crash
On Fri, Apr 08, 2011 at 02:58:33PM +0200, Jesper Christensen wrote:
> I have ported the "gianfar" driver from linux to rtnet.

Can you publish this driver? I have been wanting to make a rtnet
gianfar myself.

Thanks,

Richard
Re: [Xenomai-core] kernel threads crash
With the risk of jinx'ing it the 2.5.6 version seems to have done the
trick. Huge thanks to you Philippe, you have saved my head from
decapitation :)

/Jesper

On 2011-04-08 15:39, Philippe Gerum wrote:
> On Fri, 2011-04-08 at 15:20 +0200, Jesper Christensen wrote:
>> Thanks i'll give 2.5.6 a shot.
>>
>> Also it has come to my attention that there is some source files
>> (arch/powerpc/platforms/85xx/cpci6200.c,
>> arch/powerpc/platforms/85xx/cpci6200.h,
>> arch/powerpc/platforms/85xx/cpci6200_timer.c) that are probably not
>> covered by the adeos patch. Am i correct in assuming these need some
>> work to support i-pipe?
>
> I can't tell since I have no access to them, this is probably not a
> mainline port.
>
> In any case, if any of those files implements the support for the
> programmable interrupt controller, hw timer, gpios and/or any form of
> cascaded interrupt handling, this is correct: they should be made I-pipe
> aware.
Re: [Xenomai-core] kernel threads crash
On Fri, 2011-04-08 at 15:20 +0200, Jesper Christensen wrote:
> Thanks i'll give 2.5.6 a shot.
>
> Also it has come to my attention that there is some source files
> (arch/powerpc/platforms/85xx/cpci6200.c,
> arch/powerpc/platforms/85xx/cpci6200.h,
> arch/powerpc/platforms/85xx/cpci6200_timer.c) that are probably not
> covered by the adeos patch. Am i correct in assuming these need some
> work to support i-pipe?

I can't tell since I have no access to them, this is probably not a
mainline port.

In any case, if any of those files implements the support for the
programmable interrupt controller, hw timer, gpios and/or any form of
cascaded interrupt handling, this is correct: they should be made I-pipe
aware.

--
Philippe.
Re: [Xenomai-core] kernel threads crash
Thanks, I'll give 2.5.6 a shot.

Also it has come to my attention that there are some source files
(arch/powerpc/platforms/85xx/cpci6200.c,
arch/powerpc/platforms/85xx/cpci6200.h,
arch/powerpc/platforms/85xx/cpci6200_timer.c) that are probably not
covered by the adeos patch. Am I correct in assuming these need some
work to support i-pipe?

/Jesper

On 2011-04-08 15:12, Philippe Gerum wrote:
> On Fri, 2011-04-08 at 14:58 +0200, Jesper Christensen wrote:
>
>> Hi
>>
>> I'm trying to implement some gateway functionality in the kernel on a
>> emerson CPCI6200 board, but have run into some strange errors. The
>> kernel module is made up of two threads that run every 1 ms. I have also
>> made use of the rtpc dispatcher in rtnet to dispatch control messages
>> from a netlink socket to the RT part of my kernel module.
>>
>> The problem is that when loaded the threads get suspended due to exceptions:
>>
>> Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0xb929cbc0
>> after exception #1792
>>
>> or
>>
>> Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0x0 after
>> exception #1025
>>
>> or
>>
>> Xenomai: suspending kernel thread b911f518 ('rtnet-rtpc') at 0xb911f940
>> after exception #1792
>>
>> I have ported the "gianfar" driver from linux to rtnet.
>>
>> The versions and hardware are listed below. The errors are most likely
>> due to faulty software on my part, but i would like to ask if there are
>> any known issues with the versions or hardware i'm using. I would also
>> like to ask if there are any ways of further debugging the errors as i
>> am not getting very far with the above messages.
>>
> A severe bug at kthread init was fixed in the 2.5.5.2 - 2.5.6 timeframe,
> which would cause exactly the kind of weird behavior you are seeing
> right now. The bug triggered random code execution due to stack memory
> pollution at init on powerpc for Xenomai kthreads:
> http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=90699565cbce41f2cec193d57857bb5817efc19a
> http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=da20c20d4b4d892d40c657ad1d32ddb6d0ceb47c
> http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=a5886b354dc18f054b187b58cfbacfb60bccaf47
>
> You need at the very least those three patches (from the top of my
> head), but it would be much better to upgrade to 2.5.6.
>
>>
>> System info:
>>
>> Linux kernel: 2.6.29.6
>> i-pipe version: 2.7-04
>> processor: powerpc mpc8572
>> xenomai version: 2.5.3
>> rtnet version: 0.9.12
>>
>
Re: [Xenomai-core] kernel threads crash
On Fri, 2011-04-08 at 14:58 +0200, Jesper Christensen wrote:
> Hi
>
> I'm trying to implement some gateway functionality in the kernel on a
> emerson CPCI6200 board, but have run into some strange errors. The
> kernel module is made up of two threads that run every 1 ms. I have also
> made use of the rtpc dispatcher in rtnet to dispatch control messages
> from a netlink socket to the RT part of my kernel module.
>
> The problem is that when loaded the threads get suspended due to exceptions:
>
> Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0xb929cbc0
> after exception #1792
>
> or
>
> Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0x0 after
> exception #1025
>
> or
>
> Xenomai: suspending kernel thread b911f518 ('rtnet-rtpc') at 0xb911f940
> after exception #1792
>
> I have ported the "gianfar" driver from linux to rtnet.
>
> The versions and hardware are listed below. The errors are most likely
> due to faulty software on my part, but i would like to ask if there are
> any known issues with the versions or hardware i'm using. I would also
> like to ask if there are any ways of further debugging the errors as i
> am not getting very far with the above messages.

A severe bug at kthread init, fixed in the 2.5.5.2 - 2.5.6 timeframe,
would cause exactly the kind of weird behavior you are seeing right
now. The bug triggered random code execution due to stack memory
pollution at init on powerpc for Xenomai kthreads:
http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=90699565cbce41f2cec193d57857bb5817efc19a
http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=da20c20d4b4d892d40c657ad1d32ddb6d0ceb47c
http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=a5886b354dc18f054b187b58cfbacfb60bccaf47

You need at the very least those three patches (off the top of my
head), but it would be much better to upgrade to 2.5.6.

>
> System info:
>
> Linux kernel: 2.6.29.6
> i-pipe version: 2.7-04
> processor: powerpc mpc8572
> xenomai version: 2.5.3
> rtnet version: 0.9.12
>

--
Philippe.
[Xenomai-core] kernel threads crash
Hi

I'm trying to implement some gateway functionality in the kernel on an
Emerson CPCI6200 board, but have run into some strange errors. The
kernel module is made up of two threads that run every 1 ms. I have also
made use of the rtpc dispatcher in rtnet to dispatch control messages
from a netlink socket to the RT part of my kernel module.

The problem is that, when the module is loaded, the threads get
suspended due to exceptions:

Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0xb929cbc0
after exception #1792

or

Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0x0 after
exception #1025

or

Xenomai: suspending kernel thread b911f518 ('rtnet-rtpc') at 0xb911f940
after exception #1792

I have ported the "gianfar" driver from Linux to rtnet.

The versions and hardware are listed below. The errors are most likely
due to faulty software on my part, but I would like to ask if there are
any known issues with the versions or hardware I'm using. I would also
like to ask if there are any ways of further debugging the errors, as I
am not getting very far with the above messages.

System info:

Linux kernel: 2.6.29.6
i-pipe version: 2.7-04
processor: powerpc mpc8572
xenomai version: 2.5.3
rtnet version: 0.9.12

--
/Jesper
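As a small aid for reading the "suspending kernel thread" messages above, the sketch below decodes them. It rests on two assumptions that should be checked against the kernel and Xenomai sources in use: that the number after "exception #" is the raw PowerPC trap value printed in decimal, and that its low four bits are flags rather than part of the vector (as with the TRAP() macro in 32-bit powerpc Linux). The vector-name table is partial and illustrative, and the function name is hypothetical.

```python
import re

# ASSUMPTION: low four bits of the trap value are flags, not vector bits.
FLAG_MASK = 0xF

# ASSUMPTION: partial list of classic PowerPC vector offsets, for illustration.
VECTOR_NAMES = {
    0x300: "data storage (bad data access)",
    0x400: "instruction storage (bad instruction fetch)",
    0x600: "alignment",
    0x700: "program (illegal instruction or trap)",
}

# Tolerates the line wrapping seen in the mails by allowing any whitespace.
LOG_RE = re.compile(
    r"suspending kernel thread\s+([0-9a-f]+)\s+\('([^']+)'\)\s+"
    r"at\s+(0x[0-9a-f]+)\s+after\s+exception\s+#(\d+)"
)


def decode(line):
    """Decode one 'suspending kernel thread' message, or return None."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    thread, name, pc, trap = m.groups()
    trap = int(trap)
    vector = trap & ~FLAG_MASK
    return {
        "name": name,
        "thread": int(thread, 16),   # thread descriptor address
        "pc": int(pc, 16),           # address reported after "at"
        "trap": trap,                # raw decimal value from the log
        "vector": vector,            # trap with the flag bits cleared
        "flags": trap & FLAG_MASK,
        "kind": VECTOR_NAMES.get(vector, "unknown"),
    }


if __name__ == "__main__":
    logs = [
        "Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0xb929cbc0 after exception #1792",
        "Xenomai: suspending kernel thread b929cbc0 ('tt_upgw_0') at 0x0 after exception #1025",
        "Xenomai: suspending kernel thread b911f518 ('rtnet-rtpc') at 0xb911f940 after exception #1792",
    ]
    for line in logs:
        d = decode(line)
        print(d["name"], hex(d["pc"]), hex(d["vector"]), d["kind"])
```

Under those assumptions, the three messages decode to vector 0x700 twice and vector 0x400 once, the latter at address 0x0. In the first message the reported address equals the thread descriptor address. Both observations would at least be consistent with a thread jumping through a corrupted pointer.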