[Xenomai-core] [BUG] Interrupt problem on powerpc

2006-01-30 Thread Anders Blomdell
On a PrPMC800 (PPC 7410 processor) with Xenomai-2.1-rc2, I get the following if 
the interrupt handler takes too long (i.e. next interrupt gets generated before 
the previous one has finished)


[   42.543765]  [c00c2008] spin_bug+0xa8/0xc4
[   42.597617]  [c00c22d4] _raw_spin_lock+0x180/0x184
[   42.660637]  [c000f388] __ipipe_ack_irq+0x88/0x130
[   42.723657]  [c000efe4] __ipipe_handle_irq+0x140/0x268
[   42.791259]  [c000f144] __ipipe_grab_irq+0x38/0xa4
[   42.854279]  [c0005058] __ipipe_ret_from_except+0x0/0xc
[   42.923029]  [] 0x0
[   42.959695]  [c0038348] __do_IRQ+0x134/0x164
[   43.015839]  [c000ed04] __ipipe_do_IRQ+0x2c/0x44
[   43.076567]  [c000eb08] __ipipe_sync_stage+0x1ec/0x228
[   43.144170]  [c0039420] ipipe_suspend_domain+0x7c/0xc4
[   43.211774]  [c000f0b0] __ipipe_handle_irq+0x20c/0x268
[   43.279377]  [c000f144] __ipipe_grab_irq+0x38/0xa4
[   43.342396]  [c0005058] __ipipe_ret_from_except+0x0/0xc
[   43.411145]  [c0006524] default_idle+0x10/0x60


Any ideas of where to look?

Regards

Anders Blomdell



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] [BUG] Interrupt problem on powerpc

2006-01-30 Thread Anders Blomdell

Jan Kiszka wrote:

Anders Blomdell wrote:


On a PrPMC800 (PPC 7410 processor) with Xenomai-2.1-rc2, I get the
following if the interrupt handler takes too long (i.e. next interrupt
gets generated before the previous one has finished)

[   42.543765]  [c00c2008] spin_bug+0xa8/0xc4
[   42.597617]  [c00c22d4] _raw_spin_lock+0x180/0x184
[   42.660637]  [c000f388] __ipipe_ack_irq+0x88/0x130
[   42.723657]  [c000efe4] __ipipe_handle_irq+0x140/0x268
[   42.791259]  [c000f144] __ipipe_grab_irq+0x38/0xa4
[   42.854279]  [c0005058] __ipipe_ret_from_except+0x0/0xc
[   42.923029]  [] 0x0
[   42.959695]  [c0038348] __do_IRQ+0x134/0x164
[   43.015839]  [c000ed04] __ipipe_do_IRQ+0x2c/0x44
[   43.076567]  [c000eb08] __ipipe_sync_stage+0x1ec/0x228
[   43.144170]  [c0039420] ipipe_suspend_domain+0x7c/0xc4
[   43.211774]  [c000f0b0] __ipipe_handle_irq+0x20c/0x268
[   43.279377]  [c000f144] __ipipe_grab_irq+0x38/0xa4
[   43.342396]  [c0005058] __ipipe_ret_from_except+0x0/0xc
[   43.411145]  [c0006524] default_idle+0x10/0x60




I think some probably important information is missing above this
back-trace. 

You are so right!

 What does the kernel state before these lines?

[   42.346643] BUG: spinlock recursion on CPU#0, swapper/0
[   42.415438]  lock: c01c943c, .magic: dead4ead, .owner: swapper/0, .owner_cpu: 0
[   42.511681] Call trace:
[   42.543765]  [c00c2008] spin_bug+0xa8/0xc4
[   42.597617]  [c00c22d4] _raw_spin_lock+0x180/0x184
[   42.660637]  [c000f388] __ipipe_ack_irq+0x88/0x130
[   42.723657]  [c000efe4] __ipipe_handle_irq+0x140/0x268
[   42.791259]  [c000f144] __ipipe_grab_irq+0x38/0xa4
[   42.854279]  [c0005058] __ipipe_ret_from_except+0x0/0xc
[   42.923029]  [] 0x0
[   42.959695]  [c0038348] __do_IRQ+0x134/0x164
[   43.015839]  [c000ed04] __ipipe_do_IRQ+0x2c/0x44
[   43.076567]  [c000eb08] __ipipe_sync_stage+0x1ec/0x228
[   43.144170]  [c0039420] ipipe_suspend_domain+0x7c/0xc4
[   43.211774]  [c000f0b0] __ipipe_handle_irq+0x20c/0x268
[   43.279377]  [c000f144] __ipipe_grab_irq+0x38/0xa4
[   43.342396]  [c0005058] __ipipe_ret_from_except+0x0/0xc
[   43.411145]  [c0006524] default_idle+0x10/0x60


It might be that the problem is related to the fact that the interrupt is a 
shared one (Harrier chip, Functional Exception) that is used for both 
message-passing (should be RT) and UART (Linux, i.e. non-RT). My current IRQ 
handler always pends the interrupt to the Linux domain (RTDM_IRQ_PROPAGATE), 
because all other attempts (RTDM_IRQ_ENABLE when it wasn't a UART interrupt) have 
left the interrupts turned off.


What I believe should be done, is

  1. When UART interrupt is received, disable further non-RT interrupts
 on this IRQ-line, pend interrupt to Linux.
  2. Handle RT interrupts on this IRQ line
  3. When Linux has finished the pended interrupt, reenable non-RT interrupts.

but I have neither been able to achieve this, nor to verify that it is the right 
thing to do...


Regards

Anders Blomdell




Re: [Xenomai-core] [BUG] Interrupt problem on powerpc

2006-01-30 Thread Jan Kiszka
Anders Blomdell wrote:
 Jan Kiszka wrote:
 Anders Blomdell wrote:

 On a PrPMC800 (PPC 7410 processor) with Xenomai-2.1-rc2, I get the
 following if the interrupt handler takes too long (i.e. next interrupt
 gets generated before the previous one has finished)

 [   42.543765]  [c00c2008] spin_bug+0xa8/0xc4
 [   42.597617]  [c00c22d4] _raw_spin_lock+0x180/0x184
 [   42.660637]  [c000f388] __ipipe_ack_irq+0x88/0x130
 [   42.723657]  [c000efe4] __ipipe_handle_irq+0x140/0x268
 [   42.791259]  [c000f144] __ipipe_grab_irq+0x38/0xa4
 [   42.854279]  [c0005058] __ipipe_ret_from_except+0x0/0xc
 [   42.923029]  [] 0x0
 [   42.959695]  [c0038348] __do_IRQ+0x134/0x164
 [   43.015839]  [c000ed04] __ipipe_do_IRQ+0x2c/0x44
 [   43.076567]  [c000eb08] __ipipe_sync_stage+0x1ec/0x228
 [   43.144170]  [c0039420] ipipe_suspend_domain+0x7c/0xc4
 [   43.211774]  [c000f0b0] __ipipe_handle_irq+0x20c/0x268
 [   43.279377]  [c000f144] __ipipe_grab_irq+0x38/0xa4
 [   43.342396]  [c0005058] __ipipe_ret_from_except+0x0/0xc
 [   43.411145]  [c0006524] default_idle+0x10/0x60



 I think some probably important information is missing above this
 back-trace. 
 You are so right!
 
 What does the kernel state before these lines?
 
 [   42.346643] BUG: spinlock recursion on CPU#0, swapper/0
 [   42.415438]  lock: c01c943c, .magic: dead4ead, .owner: swapper/0, .owner_cpu: 0
 [   42.511681] Call trace:
 [   42.543765]  [c00c2008] spin_bug+0xa8/0xc4
 [   42.597617]  [c00c22d4] _raw_spin_lock+0x180/0x184
 [   42.660637]  [c000f388] __ipipe_ack_irq+0x88/0x130
 [   42.723657]  [c000efe4] __ipipe_handle_irq+0x140/0x268
 [   42.791259]  [c000f144] __ipipe_grab_irq+0x38/0xa4
 [   42.854279]  [c0005058] __ipipe_ret_from_except+0x0/0xc
 [   42.923029]  [] 0x0
 [   42.959695]  [c0038348] __do_IRQ+0x134/0x164
 [   43.015839]  [c000ed04] __ipipe_do_IRQ+0x2c/0x44
 [   43.076567]  [c000eb08] __ipipe_sync_stage+0x1ec/0x228
 [   43.144170]  [c0039420] ipipe_suspend_domain+0x7c/0xc4
 [   43.211774]  [c000f0b0] __ipipe_handle_irq+0x20c/0x268
 [   43.279377]  [c000f144] __ipipe_grab_irq+0x38/0xa4
 [   43.342396]  [c0005058] __ipipe_ret_from_except+0x0/0xc
 [   43.411145]  [c0006524] default_idle+0x10/0x60
 
 
 It might be that the problem is related to the fact that the interrupt
 is a shared one (Harrier chip, Functional Exception), that is used for
 both message-passing (should be RT) and UART (Linux, i.e. non-RT), my
 current IRQ handler always pends the interrupt to the linux domain
 (RTDM_IRQ_PROPAGATE), because all other attempts (RTDM_IRQ_ENABLE when
 it wasn't a UART interrupt) have left the interrupts turned off.
 
 What I believe should be done, is
 
   1. When UART interrupt is received, disable further non-RT interrupts
  on this IRQ-line, pend interrupt to Linux.
   2. Handle RT interrupts on this IRQ line
   3. When Linux has finished the pended interrupt, reenable non-RT
 interrupts.
 
 but I have neither been able to achieve this, nor to verify that it is
 the right thing to do...

Your approach is basically what I proposed some years back on rtai-dev
for handling unresolvable shared RT/NRT IRQs. I once successfully tested
such a setup with two network cards, one RT, the other Linux.

So when you are really doomed and cannot change the IRQ line of your RT
device, this is a kind of emergency workaround. Not nice and generic
(you have to write the stub for disabling the NRT IRQ source), but it
should work.
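
The three-step scheme discussed above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not RTDM or Harrier driver code: every name in it (toy_chip, toy_rt_handler, toy_linux_uart_handler, TOY_IRQ_HANDLED, TOY_IRQ_PROPAGATE) is hypothetical, and it assumes the chip offers a per-source mask bit for the UART and that the Linux-side handler is the place where the UART source gets unmasked again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical return codes; they only mirror the idea of "done" versus
 * "pass the interrupt on to Linux" and are not real RTDM constants. */
enum toy_irq_result { TOY_IRQ_HANDLED, TOY_IRQ_PROPAGATE };

/* Toy model of one chip driving a single shared IRQ line. */
struct toy_chip {
    bool rt_pending;        /* message-passing (RT) source raised     */
    bool uart_pending;      /* UART (non-RT) source raised            */
    bool uart_irq_enabled;  /* assumed per-source mask bit in the chip */
    int  rt_handled;        /* number of RT events serviced           */
};

/* The shared line is asserted while any unmasked source is pending. */
static bool toy_line_asserted(const struct toy_chip *c)
{
    return c->rt_pending || (c->uart_pending && c->uart_irq_enabled);
}

/* RT-domain handler for the shared line. */
static enum toy_irq_result toy_rt_handler(struct toy_chip *c)
{
    enum toy_irq_result res = TOY_IRQ_HANDLED;

    /* Step 2: service the RT source directly. */
    if (c->rt_pending) {
        c->rt_pending = false;
        c->rt_handled++;
    }

    /* Step 1: mask the non-RT source so it stops asserting the line,
     * then ask for the interrupt to be pended to Linux. */
    if (c->uart_pending && c->uart_irq_enabled) {
        c->uart_irq_enabled = false;
        res = TOY_IRQ_PROPAGATE;
    }
    return res;
}

/* Linux-domain handler, run later when Linux gets the pended interrupt. */
static void toy_linux_uart_handler(struct toy_chip *c)
{
    c->uart_pending = false;     /* service the UART                   */
    c->uart_irq_enabled = true;  /* Step 3: unmask the non-RT source   */
}
```

In this model the line is free for further RT interrupts as soon as the RT handler returns, even though the UART event has not been serviced yet. Where the unmasking of step 3 should really live in a Linux driver is left open here.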


Anyway, I do not understand what made your spinlock recurse. This shared
IRQ scenario should only cause indeterminism to the RT driver (by
blocking the line until the Linux handler can release it), but it must
not trigger this bug.

Jan





Re: [Xenomai-core] [BUG] Interrupt problem on powerpc

2006-01-30 Thread Anders Blomdell

Jan Kiszka wrote:

Anders Blomdell wrote:


Jan Kiszka wrote:


Anders Blomdell wrote:



On a PrPMC800 (PPC 7410 processor) with Xenomai-2.1-rc2, I get the
following if the interrupt handler takes too long (i.e. next interrupt
gets generated before the previous one has finished)

[   42.543765]  [c00c2008] spin_bug+0xa8/0xc4
[   42.597617]  [c00c22d4] _raw_spin_lock+0x180/0x184
[   42.660637]  [c000f388] __ipipe_ack_irq+0x88/0x130
[   42.723657]  [c000efe4] __ipipe_handle_irq+0x140/0x268
[   42.791259]  [c000f144] __ipipe_grab_irq+0x38/0xa4
[   42.854279]  [c0005058] __ipipe_ret_from_except+0x0/0xc
[   42.923029]  [] 0x0
[   42.959695]  [c0038348] __do_IRQ+0x134/0x164
[   43.015839]  [c000ed04] __ipipe_do_IRQ+0x2c/0x44
[   43.076567]  [c000eb08] __ipipe_sync_stage+0x1ec/0x228
[   43.144170]  [c0039420] ipipe_suspend_domain+0x7c/0xc4
[   43.211774]  [c000f0b0] __ipipe_handle_irq+0x20c/0x268
[   43.279377]  [c000f144] __ipipe_grab_irq+0x38/0xa4
[   43.342396]  [c0005058] __ipipe_ret_from_except+0x0/0xc
[   43.411145]  [c0006524] default_idle+0x10/0x60




I think some probably important information is missing above this
back-trace. 


You are so right!



What does the kernel state before these lines?


[   42.346643] BUG: spinlock recursion on CPU#0, swapper/0
[   42.415438]  lock: c01c943c, .magic: dead4ead, .owner: swapper/0, .owner_cpu: 0
[   42.511681] Call trace:
[   42.543765]  [c00c2008] spin_bug+0xa8/0xc4
[   42.597617]  [c00c22d4] _raw_spin_lock+0x180/0x184
[   42.660637]  [c000f388] __ipipe_ack_irq+0x88/0x130
[   42.723657]  [c000efe4] __ipipe_handle_irq+0x140/0x268
[   42.791259]  [c000f144] __ipipe_grab_irq+0x38/0xa4
[   42.854279]  [c0005058] __ipipe_ret_from_except+0x0/0xc
[   42.923029]  [] 0x0
[   42.959695]  [c0038348] __do_IRQ+0x134/0x164
[   43.015839]  [c000ed04] __ipipe_do_IRQ+0x2c/0x44
[   43.076567]  [c000eb08] __ipipe_sync_stage+0x1ec/0x228
[   43.144170]  [c0039420] ipipe_suspend_domain+0x7c/0xc4
[   43.211774]  [c000f0b0] __ipipe_handle_irq+0x20c/0x268
[   43.279377]  [c000f144] __ipipe_grab_irq+0x38/0xa4
[   43.342396]  [c0005058] __ipipe_ret_from_except+0x0/0xc
[   43.411145]  [c0006524] default_idle+0x10/0x60


It might be that the problem is related to the fact that the interrupt
is a shared one (Harrier chip, Functional Exception), that is used for
both message-passing (should be RT) and UART (Linux, i.e. non-RT), my
current IRQ handler always pends the interrupt to the linux domain
(RTDM_IRQ_PROPAGATE), because all other attempts (RTDM_IRQ_ENABLE when
it wasn't a UART interrupt) have left the interrupts turned off.

What I believe should be done, is

 1. When UART interrupt is received, disable further non-RT interrupts
on this IRQ-line, pend interrupt to Linux.
 2. Handle RT interrupts on this IRQ line
 3. When Linux has finished the pended interrupt, reenable non-RT
interrupts.

but I have neither been able to achieve this, nor to verify that it is
the right thing to do...



Your approach is basically what I proposed some years back on rtai-dev
for handling unresolvable shared RT/NRT IRQs. I once successfully tested
such a setup with two network cards, one RT, the other Linux.

So when you are really doomed and cannot change the IRQ line of your RT
device, this is a kind of emergency workaround. Not nice and generic
(you have to write the stub for disabling the NRT IRQ source), but it
should work.

I'm doomed, the interrupts live in the same chip...
The problem is that I have not found any good place to reenable the non-RT 
interrupts.



Anyway, I do not understand what made your spinlock recurse. This shared
IRQ scenario should only cause indeterminism to the RT driver (by
blocking the line until the Linux handler can release it), but it must
not trigger this bug.

OK, seems like I have two problems then. I'll try to hunt it down.


/Anders



Re: [Xenomai-core] [BUG] Interrupt problem on powerpc

2006-01-30 Thread Philippe Gerum

Anders Blomdell wrote:

Philippe Gerum wrote:


Anders Blomdell wrote:

On a PrPMC800 (PPC 7410 processor) with Xenomai-2.1-rc2, I get the 
following if the interrupt handler takes too long (i.e. next 
interrupt gets generated before the previous one has finished)


[   42.543765]  [c00c2008] spin_bug+0xa8/0xc4
[   42.597617]  [c00c22d4] _raw_spin_lock+0x180/0x184




Someone (in arch/ppc64/kernel/*.c?) is spinlocking+irqsave desc->lock 


more likely arch/ppc/kernel/*.c :-)



Gah... looks like I'm still confused by ia64 issues I'm chasing right now. (Why on 
earth do we need so many bits on our CPUs that only serve the purpose of raising 
so many problems?)


for any given IRQ without using the Adeos *_hw() spinlock variant that 
masks the interrupt at hw level. So we seem to have:


spin_lock_irqsave(desc->lock)
hw IRQ
__ipipe_grab_irq
__ipipe_handle_irq
__ipipe_ack_irq
spin_lock...(desc->lock)
deadlock.

The point is that spin_lock_irqsave() only _virtually_ masks the 
interrupts, by preventing their associated Linux handlers from being 
called. Despite this, Adeos still acquires and acknowledges the 
incoming hw events before logging them, even if their associated 
actions are postponed until spin_unlock_irqrestore() is called.


To solve this, all spinlocks potentially touched by the ipipe's 
primary IRQ handler and/or the code it calls indirectly, _must_ be 
operated using the _hw() call variant all over the kernel, so that no 
hw IRQ can be taken while those spinlocks are held by Linux. Usually, 
only the spinlock(s) protecting the interrupt descriptors or the PIC 
hardware are concerned.
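
The difference between virtual and hardware masking described above can be illustrated with a small user-space model. It is only a sketch and not I-pipe or kernel code: all names (toy_cpu, toy_lock, toy_ack_irq, toy_raise_hw_irq, toy_critical_section) are made up for the illustration, and the real *_hw() primitives are not reproduced here. Instead of spinning forever, the model records the recursion in a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy, non-recursive lock standing in for the descriptor lock. */
struct toy_lock { bool held; };

struct toy_cpu {
    bool virt_masked;  /* Linux handlers are postponed (virtual masking) */
    bool hw_masked;    /* the CPU really refuses hardware interrupts     */
    bool irq_logged;   /* an incoming event was acknowledged and logged  */
    bool recursion;    /* ack path found the lock already held           */
};

static struct toy_lock desc_lock;

/* Stand-in for the acknowledge path of the primary IRQ handler,
 * which needs the descriptor lock. */
static void toy_ack_irq(struct toy_cpu *cpu)
{
    if (desc_lock.held) {
        /* On a real single CPU this would spin forever; just flag it. */
        cpu->recursion = true;
        return;
    }
    desc_lock.held = true;
    cpu->irq_logged = true;
    desc_lock.held = false;
}

/* A hardware interrupt arrives. Virtual masking does not stop the
 * acknowledge path; only hardware masking does. */
static void toy_raise_hw_irq(struct toy_cpu *cpu)
{
    if (cpu->hw_masked)
        return;        /* event stays pending in the interrupt controller */
    toy_ack_irq(cpu);  /* runs even when virt_masked is set */
}

/* Hold the descriptor lock while an interrupt fires.
 * Returns true if the ack path recursed on the lock. */
static bool toy_critical_section(bool use_hw_variant)
{
    struct toy_cpu cpu = {0};

    desc_lock.held = false;
    if (use_hw_variant)
        cpu.hw_masked = true;    /* models the hardware-masking variant */
    else
        cpu.virt_masked = true;  /* models the plain irqsave variant    */
    desc_lock.held = true;

    toy_raise_hw_irq(&cpu);      /* interrupt hits inside the section   */

    desc_lock.held = false;
    cpu.hw_masked = false;
    cpu.virt_masked = false;
    return cpu.recursion;
}
```

With the plain variant the model reports a recursion on the lock, which matches the call chain above; with the hardware-masking variant the interrupt is simply held back until the lock is released.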


So you will expect an addition to the ipipe patch then?



Yep. We first need to find out who's grabbing the shared spinlock using the 
vanilla Linux primitives.



/Anders




--

Philippe.



Re: [Xenomai-core] [BUG] Interrupt problem on powerpc

2006-01-30 Thread Jan Kiszka
Anders Blomdell wrote:
 On a PrPMC800 (PPC 7410 processor) with Xenomai-2.1-rc2, I get the
 following if the interrupt handler takes too long (i.e. next interrupt
 gets generated before the previous one has finished)
 
 [   42.543765]  [c00c2008] spin_bug+0xa8/0xc4
 [   42.597617]  [c00c22d4] _raw_spin_lock+0x180/0x184
 [   42.660637]  [c000f388] __ipipe_ack_irq+0x88/0x130
 [   42.723657]  [c000efe4] __ipipe_handle_irq+0x140/0x268
 [   42.791259]  [c000f144] __ipipe_grab_irq+0x38/0xa4
 [   42.854279]  [c0005058] __ipipe_ret_from_except+0x0/0xc
 [   42.923029]  [] 0x0
 [   42.959695]  [c0038348] __do_IRQ+0x134/0x164
 [   43.015839]  [c000ed04] __ipipe_do_IRQ+0x2c/0x44
 [   43.076567]  [c000eb08] __ipipe_sync_stage+0x1ec/0x228
 [   43.144170]  [c0039420] ipipe_suspend_domain+0x7c/0xc4
 [   43.211774]  [c000f0b0] __ipipe_handle_irq+0x20c/0x268
 [   43.279377]  [c000f144] __ipipe_grab_irq+0x38/0xa4
 [   43.342396]  [c0005058] __ipipe_ret_from_except+0x0/0xc
 [   43.411145]  [c0006524] default_idle+0x10/0x60
 

I think some probably important information is missing above this
back-trace. What does the kernel state before these lines?

Jan

