Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash

2011-05-04 Thread Jean-Michel Hautbois
2011/5/2 Jean-Michel Hautbois jhautb...@gmail.com

 2011/5/2 Philippe Gerum r...@xenomai.org:
  On Mon, 2011-05-02 at 11:56 +0200, Jean-Michel Hautbois wrote:
  2011/5/2 Jean-Michel Hautbois jhautb...@gmail.com:
   2011/4/30 Philippe Gerum r...@xenomai.org:
   On Fri, 2011-04-29 at 18:08 +0200, Jean-Michel Hautbois wrote:
   2011/4/29 Philippe Gerum r...@xenomai.org:
On Thu, 2011-04-28 at 10:33 +0200, Jean-Michel Hautbois wrote:
2011/4/27 Philippe Gerum r...@xenomai.org:
 On Wed, 2011-04-27 at 20:42 +0200, Jean-Michel Hautbois wrote:
 Hi list,

 I am currently using a Xenomai port on a linux 2.6.35.11 linux
 kernel
 and the adeos-ipipe-2.6.35.7-powerpc-2.12-01.patch.
 I am facing a scheduling issue on a P2020 (dual core PowerPC),
 and I
 get the following message :

 Badness at arch/powerpc/mm/mmu_context_nohash.c:209
 NIP: c0018d20 LR: c039b94c CTR: c00343e4
 REGS: ecfadce0 TRAP: 0700   Tainted: G        W    (2.6.35.11)
 MSR: 00021000 ME,CE  CR: 24000488  XER: 
 TASK = ec5220d0[496] 'sipaq' THREAD: ecfac000 CPU: 1
 GPR00: 0001 ecfadd90 ec5220d0 ec5df340 ec58a700 
  0003
 GPR08: c04a2d98 0007 c04a2d98 0067e000 0002f385 1007f1f8
 c04a5b40 ecfac040
 GPR16: c04a5b40 c04deb80 c04a2120 c04a2d98 c04a5b40 c04d008c
 ecfac000 00029000
 GPR24: c04d c04d1e6c 0001 ec58a700 eceaf390 c04d1e78
 c0b23b40 ec5df340
 NIP [c0018d20] switch_mmu_context+0x80/0x438
 LR [c039b94c] schedule+0x774/0x7dc
 Call Trace:
 [ecfadd90] [44000484] 0x44000484 (unreliable)
 [ecfadde0] [c039b94c] schedule+0x774/0x7dc
 [ecfade50] [c039cb98] do_nanosleep+0xc8/0x114
 [ecfade80] [c0059bf8] hrtimer_nanosleep+0xd8/0x158
 [ecfadf10] [c0059d48] sys_nanosleep+0xd0/0xd4
 [ecfadf40] [c0013c0c] ret_from_syscall+0x0/0x3c
 --- Exception: c01 at 0xffa6cc4
LR = 0xffa6cb0
 Instruction dump:
 40a2fff0 4c00012c 2f80 409e0128 813b018c 2f83 39290001
 913b018c
 419e0020 8003018c 7c34 5400d97e 0f00 8123018c
 3929 9123018c

 Do you have a clue on how to start debugging it ?

 Yes, but that can't be easily summarized here. In short, we
 have a
 serious problem with the sharing of the MMU context between the
 Linux
 and Xenomai schedulers in the SMP case on powerpc.
   
OK, good to know that it is a known issue. If there is a thread
 with
some thoughts about it, I am interested ;).
   
 It is happening quite randomly... :).

 Does disabling CONFIG_XENO_HW_UNLOCKED_SWITCH clear this issue?

   
Well, yes and no. It starts well, but when booting the kernel I
 get :
   
   
The mm switch issue was specifically addressed by this patch,
 which is
part of 2.12-01:
   
 http://git.denx.de/?p=ipipe-2.6.git;a=commit;h=c14a47630d62d0328de1957636dceb1d498f7048
   
However, the last 2.6.35 patch issued was based on 2.6.35.7,
 not
2.6.35.11, so there is still the possibility that something went
 wrong
while you forward ported this code.
   
- Please check that mmu_context_nohash.c does contain the fix
 above as
it should
  
   It is ok, I have the fix.
  
   Does 2.6.35.7-2.12-02 exhibit the issue as well?
  
   It doesn't seem to exhibit the issue... I didn't try during a long
   time though...
  
  
- Please try Richard's suggestion, i.e. moving to 2.6.36, which
 may give
us more hints.
  
   It is better. I don't have the badness on mmu context anymore.
   This gives some hints ;).
  
  
   Yes and no. The mmu management code involved was untouched between
   2.6.35 and 2.6.36, so I still don't get why this activity counter
 gets
   trashed yet.
  
Badness at kernel/lockdep.c:2327
NIP: c006e554 LR: c006e53c CTR: 000186a0
   
Adeos sometimes conflicts with the vanilla IRQ state tracer. I'll
 have a
look at this. Disable CONFIG_TRACE_IRQFLAGS.
  
   Yes, but I *want* to have the CONFIG_TRACE_IRQFLAGS on. I just
 wanted
   to tell that I had the problem, in order to be sure it is known ;).
  
  
   Sure, but one issue at a time.
  
   JM
  
   --
   Philippe.
  
 
  OK, the badness disappears, but the 2.6.36 kernel seems more stable
  than 2.6.35 with this patch.
 
  What does more stable mean? Do you have lockups, any issue reported in
  the kernel log? Any weird Xenomai behavior?
 

 Well, my applications work very well with 2.6.36, compared to
 2.6.35, where it can crash without any informative message.
 I can't say much more because I don't have much more :).

 JM



OK, I will give some more details, and correct myself, BTW:
- 2.6.35.11 with the 2.6.35.7-powerpc-2.11-02 patch shows the badness
- 2.6.35.11 with the 2.6.35.7-powerpc-2.12-01 patch shows the badness
- 2.6.35.11 with the 2.6.35.7-powerpc-2.12-02 patch does NOT show the badness

So, the problem is solved from my point of view, sorry for the noise...
JM
___

Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash

2011-05-02 Thread Jean-Michel Hautbois
2011/4/30 Philippe Gerum r...@xenomai.org:
 On Fri, 2011-04-29 at 18:08 +0200, Jean-Michel Hautbois wrote:
 2011/4/29 Philippe Gerum r...@xenomai.org:
  On Thu, 2011-04-28 at 10:33 +0200, Jean-Michel Hautbois wrote:
  2011/4/27 Philippe Gerum r...@xenomai.org:
   On Wed, 2011-04-27 at 20:42 +0200, Jean-Michel Hautbois wrote:
   Hi list,
  
   I am currently using a Xenomai port on a linux 2.6.35.11 linux kernel
   and the adeos-ipipe-2.6.35.7-powerpc-2.12-01.patch.
   I am facing a scheduling issue on a P2020 (dual core PowerPC), and I
   get the following message :
  
   Badness at arch/powerpc/mm/mmu_context_nohash.c:209
   NIP: c0018d20 LR: c039b94c CTR: c00343e4
   REGS: ecfadce0 TRAP: 0700   Tainted: G        W    (2.6.35.11)
   MSR: 00021000 ME,CE  CR: 24000488  XER: 
   TASK = ec5220d0[496] 'sipaq' THREAD: ecfac000 CPU: 1
   GPR00: 0001 ecfadd90 ec5220d0 ec5df340 ec58a700   
   0003
   GPR08: c04a2d98 0007 c04a2d98 0067e000 0002f385 1007f1f8 c04a5b40 
   ecfac040
   GPR16: c04a5b40 c04deb80 c04a2120 c04a2d98 c04a5b40 c04d008c ecfac000 
   00029000
   GPR24: c04d c04d1e6c 0001 ec58a700 eceaf390 c04d1e78 c0b23b40 
   ec5df340
   NIP [c0018d20] switch_mmu_context+0x80/0x438
   LR [c039b94c] schedule+0x774/0x7dc
   Call Trace:
   [ecfadd90] [44000484] 0x44000484 (unreliable)
   [ecfadde0] [c039b94c] schedule+0x774/0x7dc
   [ecfade50] [c039cb98] do_nanosleep+0xc8/0x114
   [ecfade80] [c0059bf8] hrtimer_nanosleep+0xd8/0x158
   [ecfadf10] [c0059d48] sys_nanosleep+0xd0/0xd4
   [ecfadf40] [c0013c0c] ret_from_syscall+0x0/0x3c
   --- Exception: c01 at 0xffa6cc4
      LR = 0xffa6cb0
   Instruction dump:
   40a2fff0 4c00012c 2f80 409e0128 813b018c 2f83 39290001 913b018c
   419e0020 8003018c 7c34 5400d97e 0f00 8123018c 3929 
   9123018c
  
   Do you have a clue on how to start debugging it ?
  
   Yes, but that can't be easily summarized here. In short, we have a
   serious problem with the sharing of the MMU context between the Linux
   and Xenomai schedulers in the SMP case on powerpc.
 
  OK, good to know that it is a known issue. If there is a thread with
  some thoughts about it, I am interested ;).
 
   It is happening quite randomly... :).
  
   Does disabling CONFIG_XENO_HW_UNLOCKED_SWITCH clear this issue?
  
 
  Well, yes and no. It starts well, but when booting the kernel I get :
 
 
  The mm switch issue was specifically addressed by this patch, which is
  part of 2.12-01:
  http://git.denx.de/?p=ipipe-2.6.git;a=commit;h=c14a47630d62d0328de1957636dceb1d498f7048
 
  However, the last 2.6.35 patch issued was based on 2.6.35.7, not
  2.6.35.11, so there is still the possibility that something went wrong
  while you forward ported this code.
 
  - Please check that mmu_context_nohash.c does contain the fix above as
  it should

 It is ok, I have the fix.

 Does 2.6.35.7-2.12-02 exhibit the issue as well?

It doesn't seem to exhibit the issue... I didn't test it for long,
though...


  - Please try Richard's suggestion, i.e. moving to 2.6.36, which may give
  us more hints.

 It is better. I don't have the badness on mmu context anymore.
 This gives some hints ;).


 Yes and no. The mmu management code involved was untouched between
 2.6.35 and 2.6.36, so I still don't get why this activity counter gets
 trashed yet.

  Badness at kernel/lockdep.c:2327
  NIP: c006e554 LR: c006e53c CTR: 000186a0
 
  Adeos sometimes conflicts with the vanilla IRQ state tracer. I'll have a
  look at this. Disable CONFIG_TRACE_IRQFLAGS.

 Yes, but I *want* to have the CONFIG_TRACE_IRQFLAGS on. I just wanted
 to tell that I had the problem, in order to be sure it is known ;).


 Sure, but one issue at a time.

 JM

 --
 Philippe.




___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash

2011-05-02 Thread Jean-Michel Hautbois
2011/5/2 Jean-Michel Hautbois jhautb...@gmail.com:
 2011/4/30 Philippe Gerum r...@xenomai.org:
 On Fri, 2011-04-29 at 18:08 +0200, Jean-Michel Hautbois wrote:
 2011/4/29 Philippe Gerum r...@xenomai.org:
  On Thu, 2011-04-28 at 10:33 +0200, Jean-Michel Hautbois wrote:
  2011/4/27 Philippe Gerum r...@xenomai.org:
   On Wed, 2011-04-27 at 20:42 +0200, Jean-Michel Hautbois wrote:
   Hi list,
  
   I am currently using a Xenomai port on a linux 2.6.35.11 linux kernel
   and the adeos-ipipe-2.6.35.7-powerpc-2.12-01.patch.
   I am facing a scheduling issue on a P2020 (dual core PowerPC), and I
   get the following message :
  
   Badness at arch/powerpc/mm/mmu_context_nohash.c:209
   NIP: c0018d20 LR: c039b94c CTR: c00343e4
   REGS: ecfadce0 TRAP: 0700   Tainted: G        W    (2.6.35.11)
   MSR: 00021000 ME,CE  CR: 24000488  XER: 
   TASK = ec5220d0[496] 'sipaq' THREAD: ecfac000 CPU: 1
   GPR00: 0001 ecfadd90 ec5220d0 ec5df340 ec58a700   
   0003
   GPR08: c04a2d98 0007 c04a2d98 0067e000 0002f385 1007f1f8 c04a5b40 
   ecfac040
   GPR16: c04a5b40 c04deb80 c04a2120 c04a2d98 c04a5b40 c04d008c ecfac000 
   00029000
   GPR24: c04d c04d1e6c 0001 ec58a700 eceaf390 c04d1e78 c0b23b40 
   ec5df340
   NIP [c0018d20] switch_mmu_context+0x80/0x438
   LR [c039b94c] schedule+0x774/0x7dc
   Call Trace:
   [ecfadd90] [44000484] 0x44000484 (unreliable)
   [ecfadde0] [c039b94c] schedule+0x774/0x7dc
   [ecfade50] [c039cb98] do_nanosleep+0xc8/0x114
   [ecfade80] [c0059bf8] hrtimer_nanosleep+0xd8/0x158
   [ecfadf10] [c0059d48] sys_nanosleep+0xd0/0xd4
   [ecfadf40] [c0013c0c] ret_from_syscall+0x0/0x3c
   --- Exception: c01 at 0xffa6cc4
      LR = 0xffa6cb0
   Instruction dump:
   40a2fff0 4c00012c 2f80 409e0128 813b018c 2f83 39290001 
   913b018c
   419e0020 8003018c 7c34 5400d97e 0f00 8123018c 3929 
   9123018c
  
   Do you have a clue on how to start debugging it ?
  
   Yes, but that can't be easily summarized here. In short, we have a
   serious problem with the sharing of the MMU context between the Linux
   and Xenomai schedulers in the SMP case on powerpc.
 
  OK, good to know that it is a known issue. If there is a thread with
  some thoughts about it, I am interested ;).
 
   It is happening quite randomly... :).
  
   Does disabling CONFIG_XENO_HW_UNLOCKED_SWITCH clear this issue?
  
 
  Well, yes and no. It starts well, but when booting the kernel I get :
 
 
  The mm switch issue was specifically addressed by this patch, which is
  part of 2.12-01:
  http://git.denx.de/?p=ipipe-2.6.git;a=commit;h=c14a47630d62d0328de1957636dceb1d498f7048
 
  However, the last 2.6.35 patch issued was based on 2.6.35.7, not
  2.6.35.11, so there is still the possibility that something went wrong
  while you forward ported this code.
 
  - Please check that mmu_context_nohash.c does contain the fix above as
  it should

 It is ok, I have the fix.

 Does 2.6.35.7-2.12-02 exhibit the issue as well?

 It doesn't seem to exhibit the issue... I didn't try during a long
 time though...


  - Please try Richard's suggestion, i.e. moving to 2.6.36, which may give
  us more hints.

 It is better. I don't have the badness on mmu context anymore.
 This gives some hints ;).


 Yes and no. The mmu management code involved was untouched between
 2.6.35 and 2.6.36, so I still don't get why this activity counter gets
 trashed yet.

  Badness at kernel/lockdep.c:2327
  NIP: c006e554 LR: c006e53c CTR: 000186a0
 
  Adeos sometimes conflicts with the vanilla IRQ state tracer. I'll have a
  look at this. Disable CONFIG_TRACE_IRQFLAGS.

 Yes, but I *want* to have the CONFIG_TRACE_IRQFLAGS on. I just wanted
 to tell that I had the problem, in order to be sure it is known ;).


 Sure, but one issue at a time.

 JM

 --
 Philippe.


OK, the badness disappears, but the 2.6.36 kernel seems more stable
than 2.6.35 with this patch.



Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash

2011-05-02 Thread Jean-Michel Hautbois
2011/5/2 Philippe Gerum r...@xenomai.org:
 On Mon, 2011-05-02 at 11:56 +0200, Jean-Michel Hautbois wrote:
 2011/5/2 Jean-Michel Hautbois jhautb...@gmail.com:
  2011/4/30 Philippe Gerum r...@xenomai.org:
  On Fri, 2011-04-29 at 18:08 +0200, Jean-Michel Hautbois wrote:
  2011/4/29 Philippe Gerum r...@xenomai.org:
   On Thu, 2011-04-28 at 10:33 +0200, Jean-Michel Hautbois wrote:
   2011/4/27 Philippe Gerum r...@xenomai.org:
On Wed, 2011-04-27 at 20:42 +0200, Jean-Michel Hautbois wrote:
Hi list,
   
I am currently using a Xenomai port on a linux 2.6.35.11 linux 
kernel
and the adeos-ipipe-2.6.35.7-powerpc-2.12-01.patch.
I am facing a scheduling issue on a P2020 (dual core PowerPC), and 
I
get the following message :
   
Badness at arch/powerpc/mm/mmu_context_nohash.c:209
NIP: c0018d20 LR: c039b94c CTR: c00343e4
REGS: ecfadce0 TRAP: 0700   Tainted: G        W    (2.6.35.11)
MSR: 00021000 ME,CE  CR: 24000488  XER: 
TASK = ec5220d0[496] 'sipaq' THREAD: ecfac000 CPU: 1
GPR00: 0001 ecfadd90 ec5220d0 ec5df340 ec58a700  
 0003
GPR08: c04a2d98 0007 c04a2d98 0067e000 0002f385 1007f1f8 
c04a5b40 ecfac040
GPR16: c04a5b40 c04deb80 c04a2120 c04a2d98 c04a5b40 c04d008c 
ecfac000 00029000
GPR24: c04d c04d1e6c 0001 ec58a700 eceaf390 c04d1e78 
c0b23b40 ec5df340
NIP [c0018d20] switch_mmu_context+0x80/0x438
LR [c039b94c] schedule+0x774/0x7dc
Call Trace:
[ecfadd90] [44000484] 0x44000484 (unreliable)
[ecfadde0] [c039b94c] schedule+0x774/0x7dc
[ecfade50] [c039cb98] do_nanosleep+0xc8/0x114
[ecfade80] [c0059bf8] hrtimer_nanosleep+0xd8/0x158
[ecfadf10] [c0059d48] sys_nanosleep+0xd0/0xd4
[ecfadf40] [c0013c0c] ret_from_syscall+0x0/0x3c
--- Exception: c01 at 0xffa6cc4
   LR = 0xffa6cb0
Instruction dump:
40a2fff0 4c00012c 2f80 409e0128 813b018c 2f83 39290001 
913b018c
419e0020 8003018c 7c34 5400d97e 0f00 8123018c 3929 
9123018c
   
Do you have a clue on how to start debugging it ?
   
Yes, but that can't be easily summarized here. In short, we have a
serious problem with the sharing of the MMU context between the 
Linux
and Xenomai schedulers in the SMP case on powerpc.
  
   OK, good to know that it is a known issue. If there is a thread with
   some thoughts about it, I am interested ;).
  
It is happening quite randomly... :).
   
Does disabling CONFIG_XENO_HW_UNLOCKED_SWITCH clear this issue?
   
  
   Well, yes and no. It starts well, but when booting the kernel I get :
  
  
   The mm switch issue was specifically addressed by this patch, which is
   part of 2.12-01:
   http://git.denx.de/?p=ipipe-2.6.git;a=commit;h=c14a47630d62d0328de1957636dceb1d498f7048
  
   However, the last 2.6.35 patch issued was based on 2.6.35.7, not
   2.6.35.11, so there is still the possibility that something went wrong
   while you forward ported this code.
  
   - Please check that mmu_context_nohash.c does contain the fix above as
   it should
 
  It is ok, I have the fix.
 
  Does 2.6.35.7-2.12-02 exhibit the issue as well?
 
  It doesn't seem to exhibit the issue... I didn't try during a long
  time though...
 
 
   - Please try Richard's suggestion, i.e. moving to 2.6.36, which may 
   give
   us more hints.
 
  It is better. I don't have the badness on mmu context anymore.
  This gives some hints ;).
 
 
  Yes and no. The mmu management code involved was untouched between
  2.6.35 and 2.6.36, so I still don't get why this activity counter gets
  trashed yet.
 
   Badness at kernel/lockdep.c:2327
   NIP: c006e554 LR: c006e53c CTR: 000186a0
  
   Adeos sometimes conflicts with the vanilla IRQ state tracer. I'll have 
   a
   look at this. Disable CONFIG_TRACE_IRQFLAGS.
 
  Yes, but I *want* to have the CONFIG_TRACE_IRQFLAGS on. I just wanted
  to tell that I had the problem, in order to be sure it is known ;).
 
 
  Sure, but one issue at a time.
 
  JM
 
  --
  Philippe.
 

 OK, the badness disappears, but the 2.6.36 kernel seems more stable
 than 2.6.35 with this patch.

 What does more stable mean? Do you have lockups, any issue reported in
 the kernel log? Any weird Xenomai behavior?


Well, my applications work very well with 2.6.36, compared to
2.6.35, where it can crash without any informative message.
I can't say much more because I don't have much more :).

JM



Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash

2011-04-30 Thread Philippe Gerum
On Fri, 2011-04-29 at 18:08 +0200, Jean-Michel Hautbois wrote:
 2011/4/29 Philippe Gerum r...@xenomai.org:
  On Thu, 2011-04-28 at 10:33 +0200, Jean-Michel Hautbois wrote:
  2011/4/27 Philippe Gerum r...@xenomai.org:
   On Wed, 2011-04-27 at 20:42 +0200, Jean-Michel Hautbois wrote:
   Hi list,
  
   I am currently using a Xenomai port on a linux 2.6.35.11 linux kernel
   and the adeos-ipipe-2.6.35.7-powerpc-2.12-01.patch.
   I am facing a scheduling issue on a P2020 (dual core PowerPC), and I
   get the following message :
  
   Badness at arch/powerpc/mm/mmu_context_nohash.c:209
   NIP: c0018d20 LR: c039b94c CTR: c00343e4
   REGS: ecfadce0 TRAP: 0700   Tainted: G        W    (2.6.35.11)
   MSR: 00021000 ME,CE  CR: 24000488  XER: 
   TASK = ec5220d0[496] 'sipaq' THREAD: ecfac000 CPU: 1
   GPR00: 0001 ecfadd90 ec5220d0 ec5df340 ec58a700   
   0003
   GPR08: c04a2d98 0007 c04a2d98 0067e000 0002f385 1007f1f8 c04a5b40 
   ecfac040
   GPR16: c04a5b40 c04deb80 c04a2120 c04a2d98 c04a5b40 c04d008c ecfac000 
   00029000
   GPR24: c04d c04d1e6c 0001 ec58a700 eceaf390 c04d1e78 c0b23b40 
   ec5df340
   NIP [c0018d20] switch_mmu_context+0x80/0x438
   LR [c039b94c] schedule+0x774/0x7dc
   Call Trace:
   [ecfadd90] [44000484] 0x44000484 (unreliable)
   [ecfadde0] [c039b94c] schedule+0x774/0x7dc
   [ecfade50] [c039cb98] do_nanosleep+0xc8/0x114
   [ecfade80] [c0059bf8] hrtimer_nanosleep+0xd8/0x158
   [ecfadf10] [c0059d48] sys_nanosleep+0xd0/0xd4
   [ecfadf40] [c0013c0c] ret_from_syscall+0x0/0x3c
   --- Exception: c01 at 0xffa6cc4
  LR = 0xffa6cb0
   Instruction dump:
   40a2fff0 4c00012c 2f80 409e0128 813b018c 2f83 39290001 913b018c
   419e0020 8003018c 7c34 5400d97e 0f00 8123018c 3929 
   9123018c
  
   Do you have a clue on how to start debugging it ?
  
   Yes, but that can't be easily summarized here. In short, we have a
   serious problem with the sharing of the MMU context between the Linux
   and Xenomai schedulers in the SMP case on powerpc.
 
  OK, good to know that it is a known issue. If there is a thread with
  some thoughts about it, I am interested ;).
 
   It is happening quite randomly... :).
  
   Does disabling CONFIG_XENO_HW_UNLOCKED_SWITCH clear this issue?
  
 
  Well, yes and no. It starts well, but when booting the kernel I get :
 
 
  The mm switch issue was specifically addressed by this patch, which is
  part of 2.12-01:
  http://git.denx.de/?p=ipipe-2.6.git;a=commit;h=c14a47630d62d0328de1957636dceb1d498f7048
 
  However, the last 2.6.35 patch issued was based on 2.6.35.7, not
  2.6.35.11, so there is still the possibility that something went wrong
  while you forward ported this code.
 
  - Please check that mmu_context_nohash.c does contain the fix above as
  it should
 
 It is ok, I have the fix.

Does 2.6.35.7-2.12-02 exhibit the issue as well?

 
  - Please try Richard's suggestion, i.e. moving to 2.6.36, which may give
  us more hints.
 
 It is better. I don't have the badness on mmu context anymore.
 This gives some hints ;).
 

Yes and no. The MMU management code involved was untouched between
2.6.35 and 2.6.36, so I still don't see why this activity counter gets
trashed.
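
As a rough illustration of what a "trashed" activity counter could look like, here is a toy model in Python. It is not kernel code: it assumes the warning is a sanity check on a per-context counter that is incremented on switch-in and decremented on switch-out, and the names MMContext, switch_context and demo are hypothetical.

```python
class MMContext:
    """Toy stand-in for an mm's MMU context (hypothetical, not kernel code)."""

    def __init__(self, name):
        self.name = name
        self.active = 0  # how many switch-ins are not yet matched by a switch-out


def switch_context(prev, nxt, warnings):
    """Mark nxt active and prev inactive; record a warning if prev was not active."""
    nxt.active += 1
    if prev is not None:
        if prev.active < 1:
            # Assumed analogue of the sanity check that prints "Badness at ..."
            warnings.append(f"badness: {prev.name}.active={prev.active}")
        prev.active -= 1


def demo():
    warnings = []
    a, b = MMContext("A"), MMContext("B")
    switch_context(None, a, warnings)  # start running on A
    switch_context(a, b, warnings)     # balanced switch: A -> B
    # A second, unsynchronized switch away from A without a matching switch-in
    switch_context(a, b, warnings)
    return warnings, a.active, b.active
```

In this model a single unbalanced switch-out is enough to trip the check, which fits a fault that shows up only occasionally when two schedulers share the same bookkeeping.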

  Badness at kernel/lockdep.c:2327
  NIP: c006e554 LR: c006e53c CTR: 000186a0
 
  Adeos sometimes conflicts with the vanilla IRQ state tracer. I'll have a
  look at this. Disable CONFIG_TRACE_IRQFLAGS.
 
 Yes, but I *want* to have the CONFIG_TRACE_IRQFLAGS on. I just wanted
 to tell that I had the problem, in order to be sure it is known ;).
 

Sure, but one issue at a time.

 JM

-- 
Philippe.





Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash

2011-04-29 Thread Philippe Gerum
On Thu, 2011-04-28 at 10:33 +0200, Jean-Michel Hautbois wrote:
 2011/4/27 Philippe Gerum r...@xenomai.org:
  On Wed, 2011-04-27 at 20:42 +0200, Jean-Michel Hautbois wrote:
  Hi list,
 
  I am currently using a Xenomai port on a linux 2.6.35.11 linux kernel
  and the adeos-ipipe-2.6.35.7-powerpc-2.12-01.patch.
  I am facing a scheduling issue on a P2020 (dual core PowerPC), and I
  get the following message :
 
  Badness at arch/powerpc/mm/mmu_context_nohash.c:209
  NIP: c0018d20 LR: c039b94c CTR: c00343e4
  REGS: ecfadce0 TRAP: 0700   Tainted: G        W    (2.6.35.11)
  MSR: 00021000 ME,CE  CR: 24000488  XER: 
  TASK = ec5220d0[496] 'sipaq' THREAD: ecfac000 CPU: 1
  GPR00: 0001 ecfadd90 ec5220d0 ec5df340 ec58a700   
  0003
  GPR08: c04a2d98 0007 c04a2d98 0067e000 0002f385 1007f1f8 c04a5b40 
  ecfac040
  GPR16: c04a5b40 c04deb80 c04a2120 c04a2d98 c04a5b40 c04d008c ecfac000 
  00029000
  GPR24: c04d c04d1e6c 0001 ec58a700 eceaf390 c04d1e78 c0b23b40 
  ec5df340
  NIP [c0018d20] switch_mmu_context+0x80/0x438
  LR [c039b94c] schedule+0x774/0x7dc
  Call Trace:
  [ecfadd90] [44000484] 0x44000484 (unreliable)
  [ecfadde0] [c039b94c] schedule+0x774/0x7dc
  [ecfade50] [c039cb98] do_nanosleep+0xc8/0x114
  [ecfade80] [c0059bf8] hrtimer_nanosleep+0xd8/0x158
  [ecfadf10] [c0059d48] sys_nanosleep+0xd0/0xd4
  [ecfadf40] [c0013c0c] ret_from_syscall+0x0/0x3c
  --- Exception: c01 at 0xffa6cc4
 LR = 0xffa6cb0
  Instruction dump:
  40a2fff0 4c00012c 2f80 409e0128 813b018c 2f83 39290001 913b018c
  419e0020 8003018c 7c34 5400d97e 0f00 8123018c 3929 9123018c
 
  Do you have a clue on how to start debugging it ?
 
  Yes, but that can't be easily summarized here. In short, we have a
  serious problem with the sharing of the MMU context between the Linux
  and Xenomai schedulers in the SMP case on powerpc.
 
 OK, good to know that it is a known issue. If there is a thread with
 some thoughts about it, I am interested ;).
 
  It is happening quite randomly... :).
 
  Does disabling CONFIG_XENO_HW_UNLOCKED_SWITCH clear this issue?
 
 
 Well, yes and no. It starts well, but when booting the kernel I get :


The mm switch issue was specifically addressed by this patch, which is
part of 2.12-01:
http://git.denx.de/?p=ipipe-2.6.git;a=commit;h=c14a47630d62d0328de1957636dceb1d498f7048

However, the last 2.6.35 patch issued was based on 2.6.35.7, not
2.6.35.11, so there is still the possibility that something went wrong
while you forward ported this code.

- Please check that mmu_context_nohash.c does contain the fix above as
it should
- Please try Richard's suggestion, i.e. moving to 2.6.36, which may give
us more hints.

 Badness at kernel/lockdep.c:2327
 NIP: c006e554 LR: c006e53c CTR: 000186a0

Adeos sometimes conflicts with the vanilla IRQ state tracer. I'll have a
look at this. Disable CONFIG_TRACE_IRQFLAGS.
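
One way to confirm which of the two options discussed in this thread are set in a given build is to read the kernel .config. The sketch below is a minimal example in Python; the helper names parse_kconfig and enabled are made up for illustration, and it relies only on the usual .config conventions ("CONFIG_FOO=y" and "# CONFIG_FOO is not set").

```python
import re


def parse_kconfig(text):
    """Map option name -> value; options marked 'is not set' map to 'n'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if m:
            options[m.group(1)] = "n"
    return options


def enabled(options, name):
    """True only if the option is built in ('y'); absent options count as off."""
    return options.get(name, "n") == "y"
```

For example, calling parse_kconfig(open(".config").read()) and then enabled(opts, "CONFIG_TRACE_IRQFLAGS") reports whether the IRQ state tracer is still compiled in after a rebuild.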

-- 
Philippe.





Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash

2011-04-29 Thread Jean-Michel Hautbois
2011/4/29 Philippe Gerum r...@xenomai.org:
 On Thu, 2011-04-28 at 10:33 +0200, Jean-Michel Hautbois wrote:
 2011/4/27 Philippe Gerum r...@xenomai.org:
  On Wed, 2011-04-27 at 20:42 +0200, Jean-Michel Hautbois wrote:
  Hi list,
 
  I am currently using a Xenomai port on a linux 2.6.35.11 linux kernel
  and the adeos-ipipe-2.6.35.7-powerpc-2.12-01.patch.
  I am facing a scheduling issue on a P2020 (dual core PowerPC), and I
  get the following message :
 
  Badness at arch/powerpc/mm/mmu_context_nohash.c:209
  NIP: c0018d20 LR: c039b94c CTR: c00343e4
  REGS: ecfadce0 TRAP: 0700   Tainted: G        W    (2.6.35.11)
  MSR: 00021000 ME,CE  CR: 24000488  XER: 
  TASK = ec5220d0[496] 'sipaq' THREAD: ecfac000 CPU: 1
  GPR00: 0001 ecfadd90 ec5220d0 ec5df340 ec58a700   
  0003
  GPR08: c04a2d98 0007 c04a2d98 0067e000 0002f385 1007f1f8 c04a5b40 
  ecfac040
  GPR16: c04a5b40 c04deb80 c04a2120 c04a2d98 c04a5b40 c04d008c ecfac000 
  00029000
  GPR24: c04d c04d1e6c 0001 ec58a700 eceaf390 c04d1e78 c0b23b40 
  ec5df340
  NIP [c0018d20] switch_mmu_context+0x80/0x438
  LR [c039b94c] schedule+0x774/0x7dc
  Call Trace:
  [ecfadd90] [44000484] 0x44000484 (unreliable)
  [ecfadde0] [c039b94c] schedule+0x774/0x7dc
  [ecfade50] [c039cb98] do_nanosleep+0xc8/0x114
  [ecfade80] [c0059bf8] hrtimer_nanosleep+0xd8/0x158
  [ecfadf10] [c0059d48] sys_nanosleep+0xd0/0xd4
  [ecfadf40] [c0013c0c] ret_from_syscall+0x0/0x3c
  --- Exception: c01 at 0xffa6cc4
     LR = 0xffa6cb0
  Instruction dump:
  40a2fff0 4c00012c 2f80 409e0128 813b018c 2f83 39290001 913b018c
  419e0020 8003018c 7c34 5400d97e 0f00 8123018c 3929 9123018c
 
  Do you have a clue on how to start debugging it ?
 
  Yes, but that can't be easily summarized here. In short, we have a
  serious problem with the sharing of the MMU context between the Linux
  and Xenomai schedulers in the SMP case on powerpc.

 OK, good to know that it is a known issue. If there is a thread with
 some thoughts about it, I am interested ;).

  It is happening quite randomly... :).
 
  Does disabling CONFIG_XENO_HW_UNLOCKED_SWITCH clear this issue?
 

 Well, yes and no. It starts well, but when booting the kernel I get :


 The mm switch issue was specifically addressed by this patch, which is
 part of 2.12-01:
 http://git.denx.de/?p=ipipe-2.6.git;a=commit;h=c14a47630d62d0328de1957636dceb1d498f7048

 However, the last 2.6.35 patch issued was based on 2.6.35.7, not
 2.6.35.11, so there is still the possibility that something went wrong
 while you forward ported this code.

 - Please check that mmu_context_nohash.c does contain the fix above as
 it should

It is ok, I have the fix.

 - Please try Richard's suggestion, i.e. moving to 2.6.36, which may give
 us more hints.

It is better. I don't have the badness on mmu context anymore.
This gives some hints ;).

 Badness at kernel/lockdep.c:2327
 NIP: c006e554 LR: c006e53c CTR: 000186a0

 Adeos sometimes conflicts with the vanilla IRQ state tracer. I'll have a
 look at this. Disable CONFIG_TRACE_IRQFLAGS.

Yes, but I *want* to keep CONFIG_TRACE_IRQFLAGS on. I just wanted
to report the problem, to be sure it is known ;).

JM



Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash

2011-04-28 Thread Jean-Michel Hautbois
2011/4/27 Philippe Gerum r...@xenomai.org:
 On Wed, 2011-04-27 at 20:42 +0200, Jean-Michel Hautbois wrote:
 Hi list,

 I am currently using a Xenomai port on a linux 2.6.35.11 linux kernel
 and the adeos-ipipe-2.6.35.7-powerpc-2.12-01.patch.
 I am facing a scheduling issue on a P2020 (dual core PowerPC), and I
 get the following message :

 Badness at arch/powerpc/mm/mmu_context_nohash.c:209
 NIP: c0018d20 LR: c039b94c CTR: c00343e4
 REGS: ecfadce0 TRAP: 0700   Tainted: G        W    (2.6.35.11)
 MSR: 00021000 ME,CE  CR: 24000488  XER: 
 TASK = ec5220d0[496] 'sipaq' THREAD: ecfac000 CPU: 1
 GPR00: 0001 ecfadd90 ec5220d0 ec5df340 ec58a700   
 0003
 GPR08: c04a2d98 0007 c04a2d98 0067e000 0002f385 1007f1f8 c04a5b40 
 ecfac040
 GPR16: c04a5b40 c04deb80 c04a2120 c04a2d98 c04a5b40 c04d008c ecfac000 
 00029000
 GPR24: c04d c04d1e6c 0001 ec58a700 eceaf390 c04d1e78 c0b23b40 
 ec5df340
 NIP [c0018d20] switch_mmu_context+0x80/0x438
 LR [c039b94c] schedule+0x774/0x7dc
 Call Trace:
 [ecfadd90] [44000484] 0x44000484 (unreliable)
 [ecfadde0] [c039b94c] schedule+0x774/0x7dc
 [ecfade50] [c039cb98] do_nanosleep+0xc8/0x114
 [ecfade80] [c0059bf8] hrtimer_nanosleep+0xd8/0x158
 [ecfadf10] [c0059d48] sys_nanosleep+0xd0/0xd4
 [ecfadf40] [c0013c0c] ret_from_syscall+0x0/0x3c
 --- Exception: c01 at 0xffa6cc4
    LR = 0xffa6cb0
 Instruction dump:
 40a2fff0 4c00012c 2f80 409e0128 813b018c 2f83 39290001 913b018c
 419e0020 8003018c 7c34 5400d97e 0f00 8123018c 3929 9123018c

 Do you have a clue on how to start debugging it ?

 Yes, but that can't be easily summarized here. In short, we have a
 serious problem with the sharing of the MMU context between the Linux
 and Xenomai schedulers in the SMP case on powerpc.

OK, good to know that it is a known issue. If there is a thread with
some thoughts about it, I am interested ;).

 It is happening quite randomly... :).

 Does disabling CONFIG_XENO_HW_UNLOCKED_SWITCH clear this issue?


Well, yes and no. It starts well, but when booting the kernel I get :

Badness at kernel/lockdep.c:2327
NIP: c006e554 LR: c006e53c CTR: 000186a0
REGS: effe9e00 TRAP: 0700   Not tainted  (2.6.35.11)
MSR: 00021000 ME,CE  CR: 24242022  XER: 
TASK = c0508398[0] 'swapper' THREAD: c052e000 CPU: 0
GPR00:  effe9eb0 c0508398 0001 80021000 ea50 0060 0003
GPR08: c0501e80 c053 c051 0001 44242028 100488d8 3ff91200 
GPR16:  3ff85950 3ff85950 3ffb1254 c0a446e0 c0a44700 c053 c0a44704
GPR24: c0537084 0010  c0539838 80029000 c001444c c053 c0508398
NIP [c006e554] trace_hardirqs_on_caller+0x148/0x18c
LR [c006e53c] trace_hardirqs_on_caller+0x130/0x18c
Call Trace:
[effe9eb0] [c006e50c] trace_hardirqs_on_caller+0x100/0x18c (unreliable)
[effe9ed0] [c001444c] restore+0x10/0x64
[effe9f90] [0010] 0x10
[effe9fb0] [c001c568] mpic_unmask_irq+0x84/0xb8
[effe9fd0] [c00816f4] handle_fasteoi_irq+0xe4/0x138
[effe9ff0] [c0013628] call_handle_irq+0x18/0x28
[c052fdc0] [c0004fe0] handle_one_irq+0x94/0x100
[c052fde0] [c000b120] __ipipe_do_IRQ+0x78/0xa8
[c052fe10] [c0085234] __ipipe_sync_stage+0x1b0/0x33c
[c052fe50] [c000a404] __ipipe_handle_irq+0x20c/0x260
[c052fe90] [c000a614] __ipipe_grab_irq+0x4c/0x188
[c052fec0] [c0014a60] __ipipe_ret_from_except+0x0/0xc
[c052ff80] [c0009224] cpu_idle+0x64/0xe0
[c052ffa0] [c000234c] rest_init+0xd0/0xe4
[c052ffc0] [c04c9ae0] start_kernel+0x2b0/0x348
[c052fff0] [c3c4] skpinv+0x2dc/0x318
Instruction dump:
2f80 409eff5c 800205d8 2f80 419eff50 481b8281 2f83 41beff00
3d20c053 8009732c 2f80 40befef0 0fe0 4bfffee8 7fe3fb78 3881
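
When comparing several reports like the one above, it can help to pull out just the warning site and the symbolized call-trace frames. The following Python sketch does that for text laid out like this log; parse_oops and the two regular expressions are illustrative names, and the layout is assumed from the report itself rather than from any formal specification.

```python
import re

# "Badness at <file>:<line>"
BADNESS_RE = re.compile(r"^Badness at (\S+):(\d+)$")
# "[<stack>] [<addr>] <symbol>+0x<offset>/0x<size>"
FRAME_RE = re.compile(
    r"^\[([0-9a-f]+)\] \[([0-9a-f]+)\] (\S+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_oops(text):
    """Return ((file, line) or None, [(symbol, offset), ...]) for one report."""
    site = None
    frames = []
    for raw in text.splitlines():
        line = raw.strip()
        m = BADNESS_RE.match(line)
        if m:
            site = (m.group(1), int(m.group(2)))
            continue
        m = FRAME_RE.match(line)
        if m:
            # Frames without a symbol (bare addresses) do not match and are skipped
            frames.append((m.group(3), int(m.group(4), 16)))
    return site, frames
```

Applied to the trace above, this gives kernel/lockdep.c line 2327 as the warning site, with trace_hardirqs_on_caller as the first symbolized frame.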

And after starting my applications, it gets really bad :

Badness at arch/powerpc/mm/mmu_context_nohash.c:209
NIP: c0017958 LR: c039f560 CTR: c00347f8
REGS: ed38bdd0 TRAP: 0700   Tainted: G        W    (2.6.35.11)
MSR: 00021000 ME,CE  CR: 24000828  XER: 
TASK = ecb2c260[405] 'ethctl' THREAD: ed38a000 CPU: 1
GPR00: 0001 ed38be80 ecb2c260 ec70f200 ec89a200 0025 c161ccc0 0003
GPR08: ec7fb000 0001 ec89a3c0 c052 0002a106 101110d0 c04e9cc0 c04e9cc0
GPR16: c04e9cc0 c04e9cc0 c04e92e0 c04e9cc0 ed38a000 c051b084 c051b084 c04e9cc0
GPR24: ecb2c4d0 c0519b2c 0001 c04e92e0 c161ccc0 ecb2c260 ec89a200 ec70f200
NIP [c0017958] switch_mmu_context+0x54/0x3d8
LR [c039f560] schedule+0x780/0x824
Call Trace:
[ed38be80] [24000822] 0x24000822 (unreliable)
[ed38bed0] [c039f560] schedule+0x780/0x824
[ed38bf40] [c00133c0] recheck+0x0/0x24
Instruction dump:
7f49b02e 7ca6 54008ffe 0f00 812401c8 2f83 39290001 912401c8
419e0020 800301c8 7c34 5400d97e 0f00 812301c8 3929 912301c8
[ cut here ]
Badness at arch/powerpc/mm/mmu_context_nohash.c:209
NIP: c0017958 LR: c039f560 CTR: c0034a20
REGS: ed38bdd0 TRAP: 0700   Tainted: G        W    (2.6.35.11)
MSR: 00021000 ME,CE  CR: 24000888  XER: 
TASK = ecb2c260[405] 'ethctl' THREAD: ed38a000 CPU: 1
GPR00: 0001 ed38be80 ecb2c260 

Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash

2011-04-28 Thread Richard Cochran
On Wed, Apr 27, 2011 at 11:53:32PM +0200, Philippe Gerum wrote:
> Yes, but that can't be easily summarized here. In short, we have a
> serious problem with the sharing of the MMU context between the Linux
> and Xenomai schedulers in the SMP case on powerpc.

BTW, I have been running Xenomai 2.5 on ipipe 2.6.36 on P2020DS and
P2020RDB boards for several weeks now. Mostly, it seems stable. At least
it runs better than in that report. There are a few rough edges, but I
have not had the time to work on them or report them yet.

In any case, I suggest trying 2.6.36.

HTH,

Richard

(Here is the exact version I have been using...)

commit 0f88f18483390d8c4c9ccf7615120e83193fd3c8
Author: Philippe Gerum r...@xenomai.org
Date:   Tue Mar 8 06:52:33 2011 +0100

    ipipe-2.6.36-powerpc-2.12-03

commit ec97ca753f417eb56973111573d367395b676333
Author: Philippe Gerum r...@xenomai.org
Date:   Tue Mar 8 06:51:10 2011 +0100

    powerpc/ipipe: sanitize IRQ cascading with uic

commit edb402799e8d65639fddcd03c2b7b78615fd4ef3
Author: Philippe Gerum r...@xenomai.org
Date:   Mon Mar 7 17:46:31 2011 +0100

    powerpc/ipipe: fix IRQ cascading with fsl_msi chips

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


[Xenomai-core] [PowerPC]Badness at mmu_context_nohash

2011-04-27 Thread Jean-Michel Hautbois
Hi list,

I am currently using a Xenomai port on a Linux 2.6.35.11 kernel
with the adeos-ipipe-2.6.35.7-powerpc-2.12-01.patch.
I am facing a scheduling issue on a P2020 (dual-core PowerPC), and I
get the following message:

Badness at arch/powerpc/mm/mmu_context_nohash.c:209
NIP: c0018d20 LR: c039b94c CTR: c00343e4
REGS: ecfadce0 TRAP: 0700   Tainted: G        W    (2.6.35.11)
MSR: 00021000 ME,CE  CR: 24000488  XER: 
TASK = ec5220d0[496] 'sipaq' THREAD: ecfac000 CPU: 1
GPR00: 0001 ecfadd90 ec5220d0 ec5df340 ec58a700   0003
GPR08: c04a2d98 0007 c04a2d98 0067e000 0002f385 1007f1f8 c04a5b40 ecfac040
GPR16: c04a5b40 c04deb80 c04a2120 c04a2d98 c04a5b40 c04d008c ecfac000 00029000
GPR24: c04d c04d1e6c 0001 ec58a700 eceaf390 c04d1e78 c0b23b40 ec5df340
NIP [c0018d20] switch_mmu_context+0x80/0x438
LR [c039b94c] schedule+0x774/0x7dc
Call Trace:
[ecfadd90] [44000484] 0x44000484 (unreliable)
[ecfadde0] [c039b94c] schedule+0x774/0x7dc
[ecfade50] [c039cb98] do_nanosleep+0xc8/0x114
[ecfade80] [c0059bf8] hrtimer_nanosleep+0xd8/0x158
[ecfadf10] [c0059d48] sys_nanosleep+0xd0/0xd4
[ecfadf40] [c0013c0c] ret_from_syscall+0x0/0x3c
--- Exception: c01 at 0xffa6cc4
   LR = 0xffa6cb0
Instruction dump:
40a2fff0 4c00012c 2f80 409e0128 813b018c 2f83 39290001 913b018c
419e0020 8003018c 7c34 5400d97e 0f00 8123018c 3929 9123018c

Do you have a clue on how to start debugging it?
It is happening quite randomly... :)

Thanks in advance!
JM



Re: [Xenomai-core] [PowerPC]Badness at mmu_context_nohash

2011-04-27 Thread Philippe Gerum
On Wed, 2011-04-27 at 20:42 +0200, Jean-Michel Hautbois wrote:
> Hi list,
>
> I am currently using a Xenomai port on a Linux 2.6.35.11 kernel
> with the adeos-ipipe-2.6.35.7-powerpc-2.12-01.patch.
> I am facing a scheduling issue on a P2020 (dual-core PowerPC), and I
> get the following message:
>
> Badness at arch/powerpc/mm/mmu_context_nohash.c:209
> NIP: c0018d20 LR: c039b94c CTR: c00343e4
> REGS: ecfadce0 TRAP: 0700   Tainted: G        W    (2.6.35.11)
> MSR: 00021000 ME,CE  CR: 24000488  XER: 
> TASK = ec5220d0[496] 'sipaq' THREAD: ecfac000 CPU: 1
> GPR00: 0001 ecfadd90 ec5220d0 ec5df340 ec58a700   0003
> GPR08: c04a2d98 0007 c04a2d98 0067e000 0002f385 1007f1f8 c04a5b40 ecfac040
> GPR16: c04a5b40 c04deb80 c04a2120 c04a2d98 c04a5b40 c04d008c ecfac000 00029000
> GPR24: c04d c04d1e6c 0001 ec58a700 eceaf390 c04d1e78 c0b23b40 ec5df340
> NIP [c0018d20] switch_mmu_context+0x80/0x438
> LR [c039b94c] schedule+0x774/0x7dc
> Call Trace:
> [ecfadd90] [44000484] 0x44000484 (unreliable)
> [ecfadde0] [c039b94c] schedule+0x774/0x7dc
> [ecfade50] [c039cb98] do_nanosleep+0xc8/0x114
> [ecfade80] [c0059bf8] hrtimer_nanosleep+0xd8/0x158
> [ecfadf10] [c0059d48] sys_nanosleep+0xd0/0xd4
> [ecfadf40] [c0013c0c] ret_from_syscall+0x0/0x3c
> --- Exception: c01 at 0xffa6cc4
>    LR = 0xffa6cb0
> Instruction dump:
> 40a2fff0 4c00012c 2f80 409e0128 813b018c 2f83 39290001 913b018c
> 419e0020 8003018c 7c34 5400d97e 0f00 8123018c 3929 9123018c
>
> Do you have a clue on how to start debugging it?

Yes, but that can't be easily summarized here. In short, we have a
serious problem with the sharing of the MMU context between the Linux
and Xenomai schedulers in the SMP case on powerpc.

> It is happening quite randomly... :)

Does disabling CONFIG_XENO_HW_UNLOCKED_SWITCH clear this issue?
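
For anyone trying this suggestion, it may help to confirm that the option is really off in the .config used for the build. The following is only a minimal sketch in Python, assuming a standard Kconfig-style .config file; the helper name option_state and the sample text are illustrative and are not part of Xenomai or the kernel tree.

```python
import re

# Option named in the reply above; everything else here is illustrative.
OPTION = "CONFIG_XENO_HW_UNLOCKED_SWITCH"


def option_state(config_text, option=OPTION):
    """Return the option's value (e.g. 'y'), 'unset', or 'absent'."""
    for line in config_text.splitlines():
        line = line.strip()
        # Kconfig records a disabled boolean as a comment of this exact form.
        if line == f"# {option} is not set":
            return "unset"
        match = re.fullmatch(rf"{re.escape(option)}=(.*)", line)
        if match:
            return match.group(1)
    return "absent"


# Hypothetical .config excerpt with the option disabled.
sample = """\
CONFIG_SMP=y
# CONFIG_XENO_HW_UNLOCKED_SWITCH is not set
"""

print(option_state(sample))
```

In practice the text would come from the .config at the top of the kernel build directory, for example via open(".config").read(). A result of 'unset' or 'absent' means the unlocked switch code is not built in.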

 
> Thanks in advance!
> JM
 

-- 
Philippe.


