Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash
2011/5/2 Jean-Michel Hautbois jhautb...@gmail.com:
> 2011/5/2 Philippe Gerum r...@xenomai.org:
> > What does more stable mean? Do you have lockups, any issue reported in the kernel log? Any weird Xenomai behavior?
>
> Well, my applications work very well with 2.6.36, compared to 2.6.35, where they can crash without any informative message. I can't say much more because I don't have much more :).

OK, I will give some more details, and correct myself, BTW:

- 2.6.35.11 with the 2.6.35.7-powerpc-2.11.02 patch shows the badness
- 2.6.35.11 with the 2.6.35.7-powerpc-2.12.01 patch shows the badness
- 2.6.35.11 with the 2.6.35.7-powerpc-2.12.02 patch does NOT show the badness

So the problem is solved from my point of view, sorry for the noise...

JM
Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash
2011/4/30 Philippe Gerum r...@xenomai.org:
> Does 2.6.35.7-2.12-02 exhibit the issue as well?

It doesn't seem to exhibit the issue... I didn't test it for long, though...

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash
2011/5/2 Jean-Michel Hautbois jhautb...@gmail.com:
> 2011/4/30 Philippe Gerum r...@xenomai.org:
> > Does 2.6.35.7-2.12-02 exhibit the issue as well?
>
> It doesn't seem to exhibit the issue... I didn't test it for long, though...

OK, the badness disappears, but the 2.6.36 kernel seems more stable than 2.6.35 with this patch.
Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash
2011/5/2 Philippe Gerum r...@xenomai.org:
> On Mon, 2011-05-02 at 11:56 +0200, Jean-Michel Hautbois wrote:
> > OK, the badness disappears, but the 2.6.36 kernel seems more stable than 2.6.35 with this patch.
>
> What does more stable mean? Do you have lockups, any issue reported in the kernel log? Any weird Xenomai behavior?

Well, my applications work very well with 2.6.36, compared to 2.6.35, where they can crash without any informative message. I can't say much more because I don't have much more :).

JM
Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash
On Fri, 2011-04-29 at 18:08 +0200, Jean-Michel Hautbois wrote:
> 2011/4/29 Philippe Gerum r...@xenomai.org:
> > - Please check that mmu_context_nohash.c does contain the fix above as it should
>
> It is ok, I have the fix.

Does 2.6.35.7-2.12-02 exhibit the issue as well?

> > - Please try Richard's suggestion, i.e. moving to 2.6.36, which may give us more hints.
>
> It is better. I don't have the badness on mmu context anymore. This gives some hints ;).

Yes and no. The mmu management code involved was untouched between 2.6.35 and 2.6.36, so I still don't get why this activity counter gets trashed.

> > Disable CONFIG_TRACE_IRQFLAGS.
>
> Yes, but I *want* to have CONFIG_TRACE_IRQFLAGS on. I just wanted to report the problem, to be sure it is known ;).

Sure, but one issue at a time.

--
Philippe.
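As a reading aid only: the "activity counter" mentioned above seems to be a per-address-space count of the CPUs currently running on it, and the warning at mmu_context_nohash.c:209 looks like the sanity check on that count. The toy model below is not kernel code. The names `Mm`, `switch_context` and `demo` are hypothetical, and the bookkeeping (activate the incoming mm, then check and release the outgoing one) is an assumption about how such a counter is meant to behave. It only illustrates why a switch made with a stale idea of the outgoing mm would trip the check.

```python
class Mm:
    """Stand-in for an address space with an activity counter (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.active = 0  # how many CPUs are assumed to be running on this mm


def switch_context(prev, nxt, warnings):
    """Toy context switch: mark nxt active, then release prev."""
    nxt.active += 1
    if prev is not None:
        if prev.active < 1:
            # The invariant is broken: prev was not accounted as active.
            # The toy only records the violation instead of warning.
            warnings.append(f"badness: {prev.name}.active={prev.active}")
        else:
            prev.active -= 1


def demo():
    warnings = []
    a, b = Mm("A"), Mm("B")
    switch_context(None, a, warnings)  # the CPU starts running A
    switch_context(a, b, warnings)     # consistent switch A -> B
    # A second scheduler that still believes A is current switches again:
    switch_context(a, b, warnings)     # stale prev: A.active is already 0
    return warnings, a.active, b.active


if __name__ == "__main__":
    print(demo())
```

In this sketch the third call is the only one that records a violation, because the first two keep the counts balanced. Whether the real failure follows this pattern is not established anywhere in the thread.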
Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash
On Thu, 2011-04-28 at 10:33 +0200, Jean-Michel Hautbois wrote:
> 2011/4/27 Philippe Gerum r...@xenomai.org:
> > Does disabling CONFIG_XENO_HW_UNLOCKED_SWITCH clear this issue?
>
> Well, yes and no. It starts well, but when booting the kernel I get :

The mm switch issue was specifically addressed by this patch, which is part of 2.12-01:
http://git.denx.de/?p=ipipe-2.6.git;a=commit;h=c14a47630d62d0328de1957636dceb1d498f7048

However, the last 2.6.35 patch issued was based on 2.6.35.7, not 2.6.35.11, so there is still the possibility that something went wrong while you forward-ported this code.

- Please check that mmu_context_nohash.c does contain the fix above, as it should.
- Please try Richard's suggestion, i.e. moving to 2.6.36, which may give us more hints.

> Badness at kernel/lockdep.c:2327
> NIP: c006e554 LR: c006e53c CTR: 000186a0

Adeos sometimes conflicts with the vanilla IRQ state tracer. I'll have a look at this. Disable CONFIG_TRACE_IRQFLAGS.

--
Philippe.
Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash
2011/4/29 Philippe Gerum r...@xenomai.org:
> - Please check that mmu_context_nohash.c does contain the fix above as it should

It is ok, I have the fix.

> - Please try Richard's suggestion, i.e. moving to 2.6.36, which may give us more hints.

It is better. I don't have the badness on mmu context anymore. This gives some hints ;).

> Adeos sometimes conflicts with the vanilla IRQ state tracer. I'll have a look at this. Disable CONFIG_TRACE_IRQFLAGS.

Yes, but I *want* to have CONFIG_TRACE_IRQFLAGS on. I just wanted to report the problem, to be sure it is known ;).

JM
Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash
2011/4/27 Philippe Gerum r...@xenomai.org:
> In short, we have a serious problem with the sharing of the MMU context between the Linux and Xenomai schedulers in the SMP case on powerpc.

OK, good to know that it is a known issue. If there is a thread with some thoughts about it, I am interested ;).

> Does disabling CONFIG_XENO_HW_UNLOCKED_SWITCH clear this issue?

Well, yes and no. It starts well, but when booting the kernel I get:

Badness at kernel/lockdep.c:2327
NIP: c006e554 LR: c006e53c CTR: 000186a0
REGS: effe9e00 TRAP: 0700 Not tainted (2.6.35.11)
MSR: 00021000 ME,CE CR: 24242022 XER:
TASK = c0508398[0] 'swapper' THREAD: c052e000 CPU: 0
GPR00: effe9eb0 c0508398 0001 80021000 ea50 0060 0003
GPR08: c0501e80 c053 c051 0001 44242028 100488d8 3ff91200
GPR16: 3ff85950 3ff85950 3ffb1254 c0a446e0 c0a44700 c053 c0a44704
GPR24: c0537084 0010 c0539838 80029000 c001444c c053 c0508398
NIP [c006e554] trace_hardirqs_on_caller+0x148/0x18c
LR [c006e53c] trace_hardirqs_on_caller+0x130/0x18c
Call Trace:
[effe9eb0] [c006e50c] trace_hardirqs_on_caller+0x100/0x18c (unreliable)
[effe9ed0] [c001444c] restore+0x10/0x64
[effe9f90] [0010] 0x10
[effe9fb0] [c001c568] mpic_unmask_irq+0x84/0xb8
[effe9fd0] [c00816f4] handle_fasteoi_irq+0xe4/0x138
[effe9ff0] [c0013628] call_handle_irq+0x18/0x28
[c052fdc0] [c0004fe0] handle_one_irq+0x94/0x100
[c052fde0] [c000b120] __ipipe_do_IRQ+0x78/0xa8
[c052fe10] [c0085234] __ipipe_sync_stage+0x1b0/0x33c
[c052fe50] [c000a404] __ipipe_handle_irq+0x20c/0x260
[c052fe90] [c000a614] __ipipe_grab_irq+0x4c/0x188
[c052fec0] [c0014a60] __ipipe_ret_from_except+0x0/0xc
[c052ff80] [c0009224] cpu_idle+0x64/0xe0
[c052ffa0] [c000234c] rest_init+0xd0/0xe4
[c052ffc0] [c04c9ae0] start_kernel+0x2b0/0x348
[c052fff0] [c3c4] skpinv+0x2dc/0x318
Instruction dump:
2f80 409eff5c 800205d8 2f80 419eff50 481b8281 2f83 41beff00
3d20c053 8009732c 2f80 40befef0 0fe0 4bfffee8 7fe3fb78 3881

And after starting my applications, it gets really bad:

Badness at arch/powerpc/mm/mmu_context_nohash.c:209
NIP: c0017958 LR: c039f560 CTR: c00347f8
REGS: ed38bdd0 TRAP: 0700 Tainted: G W (2.6.35.11)
MSR: 00021000 ME,CE CR: 24000828 XER:
TASK = ecb2c260[405] 'ethctl' THREAD: ed38a000 CPU: 1
GPR00: 0001 ed38be80 ecb2c260 ec70f200 ec89a200 0025 c161ccc0 0003
GPR08: ec7fb000 0001 ec89a3c0 c052 0002a106 101110d0 c04e9cc0 c04e9cc0
GPR16: c04e9cc0 c04e9cc0 c04e92e0 c04e9cc0 ed38a000 c051b084 c051b084 c04e9cc0
GPR24: ecb2c4d0 c0519b2c 0001 c04e92e0 c161ccc0 ecb2c260 ec89a200 ec70f200
NIP [c0017958] switch_mmu_context+0x54/0x3d8
LR [c039f560] schedule+0x780/0x824
Call Trace:
[ed38be80] [24000822] 0x24000822 (unreliable)
[ed38bed0] [c039f560] schedule+0x780/0x824
[ed38bf40] [c00133c0] recheck+0x0/0x24
Instruction dump:
7f49b02e 7ca6 54008ffe 0f00 812401c8 2f83 39290001 912401c8
419e0020 800301c8 7c34 5400d97e 0f00 812301c8 3929 912301c8
[ cut here ]
Badness at arch/powerpc/mm/mmu_context_nohash.c:209
NIP: c0017958 LR: c039f560 CTR: c0034a20
REGS: ed38bdd0 TRAP: 0700 Tainted: G W (2.6.35.11)
MSR: 00021000 ME,CE CR: 24000888 XER:
TASK = ecb2c260[405] 'ethctl' THREAD: ed38a000 CPU: 1
GPR00: 0001 ed38be80 ecb2c260
Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash
On Wed, Apr 27, 2011 at 11:53:32PM +0200, Philippe Gerum wrote:
> Yes, but that can't be easily summarized here. In short, we have a serious problem with the sharing of the MMU context between the Linux and Xenomai schedulers in the SMP case on powerpc.

BTW, I have been running xenomai 2.5 on ipipe 2.6.36 on p2020ds and p2020rdb for several weeks now. Mostly, it seems stable. At least it runs better than in that report. There are a few rough edges, but I have not had the time to work on them or report them yet.

In any case, I suggest trying 2.6.36.

HTH,
Richard

(Here is the exact version I have been using...)

commit 0f88f18483390d8c4c9ccf7615120e83193fd3c8
Author: Philippe Gerum r...@xenomai.org
Date: Tue Mar 8 06:52:33 2011 +0100

    ipipe-2.6.36-powerpc-2.12-03

commit ec97ca753f417eb56973111573d367395b676333
Author: Philippe Gerum r...@xenomai.org
Date: Tue Mar 8 06:51:10 2011 +0100

    powerpc/ipipe: sanitize IRQ cascading with uic

commit edb402799e8d65639fddcd03c2b7b78615fd4ef3
Author: Philippe Gerum r...@xenomai.org
Date: Mon Mar 7 17:46:31 2011 +0100

    powerpc/ipipe: fix IRQ cascading with fsl_msi chips
[Xenomai-core] [PowerPC]Badness at mmu_context_nohash
Hi list,

I am currently using a Xenomai port on a Linux 2.6.35.11 kernel with adeos-ipipe-2.6.35.7-powerpc-2.12-01.patch. I am facing a scheduling issue on a P2020 (dual-core PowerPC), and I get the following message:

Badness at arch/powerpc/mm/mmu_context_nohash.c:209
NIP: c0018d20 LR: c039b94c CTR: c00343e4
REGS: ecfadce0 TRAP: 0700 Tainted: G W (2.6.35.11)
MSR: 00021000 ME,CE CR: 24000488 XER:
TASK = ec5220d0[496] 'sipaq' THREAD: ecfac000 CPU: 1
GPR00: 0001 ecfadd90 ec5220d0 ec5df340 ec58a700 0003
GPR08: c04a2d98 0007 c04a2d98 0067e000 0002f385 1007f1f8 c04a5b40 ecfac040
GPR16: c04a5b40 c04deb80 c04a2120 c04a2d98 c04a5b40 c04d008c ecfac000 00029000
GPR24: c04d c04d1e6c 0001 ec58a700 eceaf390 c04d1e78 c0b23b40 ec5df340
NIP [c0018d20] switch_mmu_context+0x80/0x438
LR [c039b94c] schedule+0x774/0x7dc
Call Trace:
[ecfadd90] [44000484] 0x44000484 (unreliable)
[ecfadde0] [c039b94c] schedule+0x774/0x7dc
[ecfade50] [c039cb98] do_nanosleep+0xc8/0x114
[ecfade80] [c0059bf8] hrtimer_nanosleep+0xd8/0x158
[ecfadf10] [c0059d48] sys_nanosleep+0xd0/0xd4
[ecfadf40] [c0013c0c] ret_from_syscall+0x0/0x3c
--- Exception: c01 at 0xffa6cc4
    LR = 0xffa6cb0
Instruction dump:
40a2fff0 4c00012c 2f80 409e0128 813b018c 2f83 39290001 913b018c
419e0020 8003018c 7c34 5400d97e 0f00 8123018c 3929 9123018c

Do you have a clue on how to start debugging it? It is happening quite randomly... :).

Thanks in advance!
JM
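When several such reports have to be compared (different tasks, CPUs, or offsets inside switch_mmu_context), it can help to extract the key fields mechanically. The following is a minimal sketch, assuming only the field layout visible in the report above. The function name `parse_badness` and the keys of the returned dictionary are invented for illustration.

```python
import re


def parse_badness(report: str) -> dict:
    """Extract a few fields from a PowerPC 'Badness at' report (illustrative)."""
    fields = {}

    # Source file and line of the failed check.
    m = re.search(r"Badness at (\S+):(\d+)", report)
    if m:
        fields["file"], fields["line"] = m.group(1), int(m.group(2))

    # PID and command name of the task that was running.
    m = re.search(r"TASK = \w+\[(\d+)\] '([^']+)'", report)
    if m:
        fields["pid"], fields["comm"] = int(m.group(1)), m.group(2)

    # CPU on which the warning fired.
    m = re.search(r"CPU: (\d+)", report)
    if m:
        fields["cpu"] = int(m.group(1))

    # Faulting symbol, offset into it, and symbol size.
    m = re.search(r"NIP \[(\w+)\] (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", report)
    if m:
        fields["nip_symbol"] = m.group(2)
        fields["nip_offset"] = int(m.group(3), 16)
        fields["nip_size"] = int(m.group(4), 16)

    return fields


# Three lines copied from the report above.
sample = """\
Badness at arch/powerpc/mm/mmu_context_nohash.c:209
TASK = ec5220d0[496] 'sipaq' THREAD: ecfac000 CPU: 1
NIP [c0018d20] switch_mmu_context+0x80/0x438
"""

if __name__ == "__main__":
    print(parse_badness(sample))
```

The patterns use `re.search` rather than line-anchored matching, so they should also cope with reports whose line breaks were lost in an archive.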
Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash
On Wed, 2011-04-27 at 20:42 +0200, Jean-Michel Hautbois wrote:
> Do you have a clue on how to start debugging it ?

Yes, but that can't be easily summarized here. In short, we have a serious problem with the sharing of the MMU context between the Linux and Xenomai schedulers in the SMP case on powerpc.

> It is happening quite randomly... :).

Does disabling CONFIG_XENO_HW_UNLOCKED_SWITCH clear this issue?

--
Philippe.
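Before rebuilding, it may be worth confirming what the kernel configuration actually contains for the two options discussed in this thread. Below is a small sketch that reads the text of a `.config` file and reports the state of an option. The helper name `kconfig_state` and the sample configuration are made up for illustration. The only format assumption is the usual `.config` convention of `CONFIG_FOO=value` for set options and `# CONFIG_FOO is not set` for disabled ones.

```python
def kconfig_state(config_text: str, option: str) -> str:
    """Return the value assigned to option, or 'n' if it is unset or absent."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "n"
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return "n"


# Hypothetical excerpt of a .config, not taken from the reporter's build.
sample_config = """\
CONFIG_SMP=y
# CONFIG_XENO_HW_UNLOCKED_SWITCH is not set
CONFIG_TRACE_IRQFLAGS=y
"""

if __name__ == "__main__":
    for opt in ("CONFIG_XENO_HW_UNLOCKED_SWITCH", "CONFIG_TRACE_IRQFLAGS"):
        print(opt, "=", kconfig_state(sample_config, opt))
```

To use it on a real tree, pass the contents of the build directory's `.config`, for example `kconfig_state(open(".config").read(), "CONFIG_TRACE_IRQFLAGS")`.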