Hi,

It turns out the root cause of the problem is a race condition in the apic
code.

I attach a patch to fix this problem. It works on my machine.

Haohui


On Mon, Apr 5, 2010 at 12:15 AM, Haohui Mai <haohui....@gmail.com> wrote:

> Hello,
>
> I'm playing around with L4:Pistachio on a 8-core machine a little bit, and
> it seems that I'm having a race condition in interrupt handling.
>
> at src/api/v4/interrupt.cc, around Line 423:
>
>            current->dequeue_send(handler_tcb);
>            handler_tcb->set_tag(msg_tag_t::irq_tag());
>            handler_tcb->set_partner(current->get_global_id());
>            handler_tcb->unlock();
>
> The routine overrides the handler_tcb directly. However, the handler might
> be in the middle of IPC (where it's in locked_waiting state in my machine),
> thus the handler enters an undefined state.
>
> So what happens to my machine is that the program runs on L4:Pistachio, it
> can process the mouse interrupt for a while, then the interrupt controller
> goes into an inconsistent state:
>
>
>    IRQ 12: IOAPIC 8, Line 12: vec 80, phys, high,  edge, masked dest 0
>         redir entries mismatch hw        0      50 != soft        0   10050
>            hw: vec 80, phys, high,  edge, unmasked dest 0
>
> I have no idea of how to fix it. Any suggestions are highly appreciated.
>
> Cheers,
>
> Haohui
>
Index: kernel/src/platform/generic/intctrl-apic.h
===================================================================
--- kernel/src/platform/generic/intctrl-apic.h	(revision 1653)
+++ kernel/src/platform/generic/intctrl-apic.h	(working copy)
@@ -79,6 +79,7 @@
 	ioapic_redir_t entry;
 	ioapic_t* ioapic;
 	word_t line;
+        spinlock_t lock;
 	bool pending;
     };
     
Index: kernel/src/platform/generic/intctrl-apic.cc
===================================================================
--- kernel/src/platform/generic/intctrl-apic.cc	(revision 1653)
+++ kernel/src/platform/generic/intctrl-apic.cc	(working copy)
@@ -490,11 +490,13 @@
 #endif
     ASSERT(redir[irq].is_valid());
 
+    redir[irq].lock.lock();
     if (redir[irq].entry.is_edge_triggered())
     {
 	if (redir[irq].pending)
 	{
 	    redir[irq].pending = false;
+            redir[irq].lock.unlock();
 	    return true; // leave IRQ masked, since there was another pending
 	}
 	redir[irq].entry.unmask_irq();
@@ -504,6 +506,7 @@
 	redir[irq].entry.unmask_irq();
 	sync_redir_entry(&redir[irq], sync_low);
     }
+    redir[irq].lock.unlock();
     return false;    
     
 }
@@ -580,6 +583,7 @@
 {
     bool deliver = true;
 
+    redir[irq].lock.lock();
     // edge triggered IRQs are marked as pending if masked
     if ( redir[irq].entry.is_edge_triggered() &&
 	 redir[irq].entry.x.mask )
@@ -587,6 +591,7 @@
 	redir[irq].pending = true;
 	deliver = false;
     }
+    redir[irq].lock.unlock();
 
     mask(irq);
     local_apic.EOI();

Reply via email to