Re: [XenPPC] IPI problems
On Fri, 2007-01-12 at 19:41 -0500, Amos Waterland wrote:
> On Fri, Jan 12, 2007 at 05:45:03PM -0600, Hollis Blanchard wrote:
> > We seem to have an IPI problem, which causes vcpu_pause() to hang the
> > system. The following patch, tested on JS20 and JS21, illustrates it.
> > Before dom0 starts, IPIs work fine. After Linux's mpic_init(), IPIs (as
> > triggered by the 'I' keyhandler) lock the machine. Actually, it looks
> > like a message is trying to get out, because after a while we see a '('
> > emitted (presumably the first character in (XEN)).
>
> No, this is almost certainly our code that checks that the IPI start is
> acked. If you run with `sync_console' you should see periodic messages
> about start stalls.
>
> > (When I comment out mpic_init() in dom0, Xen IPIs continue to work but
> > real IRQs (e.g. the IDE controller) fail in dom0.)
>
> Make sure you did not merge this out:
>
> http://lists.xensource.com/archives/html/xen-ppc-devel/2006-11/msg00149.html

Hmmm. I had never pulled this (Linux) changeset. However, now that I
have, it doesn't seem to be helping.

--
Hollis Blanchard
IBM Linux Technology Center

___
Xen-ppc-devel mailing list
Xen-ppc-devel@lists.xensource.com
http://lists.xensource.com/xen-ppc-devel
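
[Editorial sketch] The acknowledgement check Amos mentions can be pictured roughly as follows. This is a minimal illustration in plain C, not the Xen code: `wait_for_ack`, `poll_ack`, `struct fake_cpu`, and `STALL_INTERVAL` are hypothetical names, and the remote CPU is simulated by a counter. It only shows the general shape of a sender that spins on an ack and prints a periodic stall message when none arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical: report a stall every this many polls without an ack. */
#define STALL_INTERVAL 1000u

/* Simulated remote CPU: acks after `ack_after` polls, or never if 0. */
struct fake_cpu {
    unsigned polls;
    unsigned ack_after;
};

/* One poll of the simulated ack flag. */
static bool poll_ack(struct fake_cpu *cpu)
{
    cpu->polls++;
    return cpu->ack_after != 0 && cpu->polls >= cpu->ack_after;
}

/*
 * Spin until the target acks or `max_polls` is exhausted.
 * Returns true on ack; *stalls receives the number of stall reports printed.
 */
static bool wait_for_ack(struct fake_cpu *cpu, unsigned max_polls,
                         unsigned *stalls)
{
    assert(cpu != NULL && stalls != NULL);
    *stalls = 0;
    for (unsigned i = 1; i <= max_polls; i++) {
        if (poll_ack(cpu))
            return true;
        if (i % STALL_INTERVAL == 0) {
            (*stalls)++;
            printf("IPI start stall: %u polls without ack\n", i);
        }
    }
    return false;
}
```

With a target that never acks, the only console traffic is the periodic stall report, which would be consistent with seeing a stray fragment of output on an otherwise hung machine.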
Re: [XenPPC] IPI problems
On Fri, 2007-01-12 at 20:34 -0500, Jimi Xenidis wrote:
> I just built clean xenppc-unstable.hg (assuming it has the issues you
> state below) and all IPI ^A*3 tests (esp 't' and 'd') work just fine on
> my maple

What about xm destroy? I can boot fine and start a domU, but xm destroy
locks my system spinning in vcpu_pause().

--
Hollis Blanchard
IBM Linux Technology Center
Re: [XenPPC] IPI problems
On Mon, 2007-01-15 at 11:23 -0600, Hollis Blanchard wrote:
> On Fri, 2007-01-12 at 19:41 -0500, Amos Waterland wrote:
> > On Fri, Jan 12, 2007 at 05:45:03PM -0600, Hollis Blanchard wrote:
> > > We seem to have an IPI problem, which causes vcpu_pause() to hang the
> > > system. The following patch, tested on JS20 and JS21, illustrates it.
> > > Before dom0 starts, IPIs work fine. After Linux's mpic_init(), IPIs
> > > (as triggered by the 'I' keyhandler) lock the machine. Actually, it
> > > looks like a message is trying to get out, because after a while we
> > > see a '(' emitted (presumably the first character in (XEN)).
> >
> > No, this is almost certainly our code that checks that the IPI start is
> > acked. If you run with `sync_console' you should see periodic messages
> > about start stalls.
> >
> > > (When I comment out mpic_init() in dom0, Xen IPIs continue to work
> > > but real IRQs (e.g. the IDE controller) fail in dom0.)
> >
> > Make sure you did not merge this out:
> >
> > http://lists.xensource.com/archives/html/xen-ppc-devel/2006-11/msg00149.html
>
> Hmmm. I had never pulled this (Linux) changeset. However, now that I
> have, it doesn't seem to be helping.

Correction: the ^A commands work, so IPIs seem to be working. I still
have a hang under xm destroy that I'm debugging.

--
Hollis Blanchard
IBM Linux Technology Center
[XenPPC] IPI problems
I mentioned that I accidentally pushed an upstream merge to
xenppc-unstable while it's still broken. There are a couple broken
things.

First, DomU console stops mid-string early in boot. Could be an event
channel problem with the ring buffer; haven't investigated.

We seem to have an IPI problem, which causes vcpu_pause() to hang the
system. The following patch, tested on JS20 and JS21, illustrates it.
Before dom0 starts, IPIs work fine. After Linux's mpic_init(), IPIs (as
triggered by the 'I' keyhandler) lock the machine. Actually, it looks
like a message is trying to get out, because after a while we see a '('
emitted (presumably the first character in (XEN)).

(When I comment out mpic_init() in dom0, Xen IPIs continue to work but
real IRQs (e.g. the IDE controller) fail in dom0.)

Why is this problem occurring only after an upstream merge? I don't
know. It's possible that some common IRQ code has changed to no longer
call the same arch-specific code, but I'm just speculating.

diff -r d6481755ade6 xen/arch/powerpc/setup.c
--- a/xen/arch/powerpc/setup.c  Thu Jan 11 13:39:27 2007 -0600
+++ b/xen/arch/powerpc/setup.c  Fri Jan 12 17:12:27 2007 -0600
@@ -438,7 +438,9 @@ static void __init __start_xen(multiboot
     domain_unpause_by_systemcontroller(dom0);
 #ifdef DEBUG_IPI
-    ipi_torture_test();
+    //ipi_torture_test();
+    extern void do_ipi_test(char c);
+    do_ipi_test(0);
 #endif
     startup_cpu_idle_loop();
 }
diff -r d6481755ade6 xen/common/keyhandler.c
--- a/xen/common/keyhandler.c  Thu Jan 11 13:39:27 2007 -0600
+++ b/xen/common/keyhandler.c  Fri Jan 12 17:44:46 2007 -0600
@@ -260,6 +260,16 @@ static void do_debug_key(unsigned char k
                              bit. */
 }
+static void got_ipi(void *info)
+{
+    printk("CPU %u got IPI\n", smp_processor_id());
+}
+
+void do_ipi_test(unsigned char key)
+{
+    smp_call_function(got_ipi, NULL, 0, 0);
+}
+
 void initialize_keytable(void)
 {
     open_softirq(KEYPRESS_SOFTIRQ, keypress_softirq);
@@ -286,6 +296,8 @@ void initialize_keytable(void)
 #endif
     register_irq_keyhandler('%', do_debug_key, "Trap to xendbg");
+
+    register_keyhandler('I', do_ipi_test, "IPI test");
 }
 /*
diff -r d6481755ade6 xen/drivers/char/console.c
--- a/xen/drivers/char/console.c  Thu Jan 11 13:39:27 2007 -0600
+++ b/xen/drivers/char/console.c  Fri Jan 12 17:09:01 2007 -0600
@@ -246,7 +246,7 @@ static void sercon_puts(const char *s)
 /* CTRL-switch_char switches input direction between Xen and DOM0. */
 #define SWITCH_CODE (opt_conswitch[0]-'a'+1)
-static int xen_rx = 1; /* FALSE = serial input passed to domain 0. */
+static int xen_rx = 0; /* FALSE = serial input passed to domain 0. */
 static void switch_serial_input(void)
 {
diff -r d6481755ade6 xen/include/asm-powerpc/smp.h
--- a/xen/include/asm-powerpc/smp.h  Thu Jan 11 13:39:27 2007 -0600
+++ b/xen/include/asm-powerpc/smp.h  Fri Jan 12 17:03:59 2007 -0600
@@ -52,7 +52,7 @@ void smp_event_check_interrupt(void);
 void smp_event_check_interrupt(void);
 void send_IPI_mask(cpumask_t mask, int vector);
-#undef DEBUG_IPI
+#define DEBUG_IPI
 #ifdef DEBUG_IPI
 void ipi_torture_test(void);
 #endif

--
Hollis Blanchard
IBM Linux Technology Center
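
[Editorial sketch] The test in the patch relies on a cross-call: the CPU that handles the 'I' key asks every other CPU to run got_ipi(). The behaviour a working system should show can be modelled in plain C as below. This is a single-threaded simulation under stated assumptions, not hypervisor code: `call_on_others`, `got_ipi_sim`, `handled`, and `current_cpu` are hypothetical names, and no real interrupt is sent.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_CPUS 8

/* Handler signature mirrors got_ipi() in the patch: one opaque argument. */
typedef void (*ipi_fn)(void *info);

/* Per-CPU count of handler invocations (simulation only). */
static unsigned handled[MAX_CPUS];
static unsigned current_cpu;   /* which simulated CPU is "executing" */

static void got_ipi_sim(void *info)
{
    (void)info;
    handled[current_cpu]++;
    printf("CPU %u got IPI\n", current_cpu);
}

/*
 * Hypothetical stand-in for a cross-call: run fn once on every simulated
 * CPU except the sender. Returns how many CPUs ran it.
 */
static unsigned call_on_others(unsigned sender, unsigned ncpus,
                               ipi_fn fn, void *info)
{
    assert(ncpus <= MAX_CPUS && sender < ncpus);
    unsigned ran = 0;
    for (unsigned cpu = 0; cpu < ncpus; cpu++) {
        if (cpu == sender)
            continue;          /* the sender does not interrupt itself */
        current_cpu = cpu;
        fn(info);
        ran++;
    }
    current_cpu = sender;
    return ran;
}
```

On a four-CPU model with CPU 0 as the sender, the handler runs on CPUs 1 to 3 only. On real hardware, a missing "got IPI" line from any of the other CPUs would point at delivery rather than at the handler.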
Re: [XenPPC] IPI problems
Please check if your Linux kernel is up to date.

I just built clean xenppc-unstable.hg (assuming it has the issues you
state below) and all IPI ^A*3 tests (esp 't' and 'd') work just fine on
my maple.

I created an NFS domain and did get:

(XEN) Assertion '!cpu_isset(nxt, cpu_core_map[cpu])' failed, line 465, file schc
(XEN) BUG at sched_credit.c:465
(XEN) [ Xen-3.0-unstable ]
(XEN) CPU: 0001 DOMID: 0001
(XEN) pc c003c3d0 msr 80009032
(XEN) lr c0045f54 ctr c0045f40
(XEN) srr0 srr1
(XEN) r00: 2488 c0673cc0 c066df00
(XEN) r04: 0001 2482 c00100a8
(XEN) r08: c0670080 c0045f40 c067 c0045e78
(XEN) r12: c11ceb90 c0546100
(XEN) r16:
(XEN) r20:
(XEN) r24: 4000 c000
(XEN) r28: c06c4fc8 c05552e8 0001
(XEN)
(XEN)
(XEN) Panic on CPU 1:
(XEN) BUG at sched_credit.c:465
(XEN)
(XEN)
(XEN) Reboot in five seconds...
(XEN) [ Xen-3.0-unstable ]
(XEN) CPU: 0001 DOMID: 0001
(XEN) pc c003c3d0 msr 80009032
(XEN) lr c0045f54 ctr c0045f40
(XEN) srr0 srr1
(XEN) r00: 2488 c0673cc0 c066df00
(XEN) r04: 0001 2482 c00100a8
(XEN) r08: c0670080 c0045f40 c067 c0045e78
(XEN) r12: c11ceb90 c0546100
(XEN) r16:
(XEN) r20:
(XEN) r24: 4000 c000
(XEN) r28: c06c4fc8 c05552e8 0001
(XEN) [0033B6F0] 00435364 .debugger_trap_immediate+0x18/0x38
(XEN) [0033B770] 004352C8 .panic+0xe8/0x16c
(XEN) [0033B890] 0043544C .__bug+0x5c/0x6c
(XEN) [0033B910] 0041E3A0 .csched_cpu_pick+0x328/0x458
(XEN) [0033B9E0] 0041ED70 .csched_vcpu_acct+0x144/0x1dc
(XEN) [0033BA70] 00421170 .csched_tick+0x48/0xe8
(XEN) [0033BB10] 00429DD8 .t_timer_fn+0xec/0x164
(XEN) [0033BBC0] 0042DBA4 .timer_softirq_action+0xd0/0x1b8
(XEN) [0033BC90] 0042A758 .do_softirq+0xc4/0xec
(XEN) [0033BD20] 00455AC4 test_all_events+0x5c/0x64
(XEN) [0043EDE0] 80010001FBE1FFF8
(XEN) SP (60004bd8) is not in xen space

On Jan 12, 2007, at 6:45 PM, Hollis Blanchard wrote:
> I mentioned that I accidentally pushed an upstream merge to
> xenppc-unstable while it's still broken. There are a couple broken
> things.
>
> First, DomU console stops mid-string early in boot. Could be an event
> channel problem with the ring buffer; haven't investigated.
>
> We seem to have an IPI problem, which causes vcpu_pause() to hang the
> system. The following patch, tested on JS20 and JS21, illustrates it.
> Before dom0 starts, IPIs work fine. After Linux's mpic_init(), IPIs (as
> triggered by the 'I' keyhandler) lock the machine. Actually, it looks
> like a message is trying to get out, because after a while we see a '('
> emitted (presumably the first character in (XEN)).
>
> (When I comment out mpic_init() in dom0, Xen IPIs continue to work but
> real IRQs (e.g. the IDE controller) fail in dom0.)
>
> Why is this problem occurring only after an upstream merge? I don't
> know. It's possible that some common IRQ code has changed to no longer
> call the same arch-specific code, but I'm just speculating.
>
> diff -r d6481755ade6 xen/arch/powerpc/setup.c
> --- a/xen/arch/powerpc/setup.c  Thu Jan 11 13:39:27 2007 -0600
> +++ b/xen/arch/powerpc/setup.c  Fri Jan 12 17:12:27 2007 -0600
> @@ -438,7 +438,9 @@ static void __init __start_xen(multiboot
>      domain_unpause_by_systemcontroller(dom0);
>  #ifdef DEBUG_IPI
> -    ipi_torture_test();
> +    //ipi_torture_test();
> +    extern void do_ipi_test(char c);
> +    do_ipi_test(0);
>  #endif
>      startup_cpu_idle_loop();
>  }
> diff -r d6481755ade6 xen/common/keyhandler.c
> --- a/xen/common/keyhandler.c  Thu Jan 11 13:39:27 2007 -0600
> +++ b/xen/common/keyhandler.c  Fri Jan 12 17:44:46 2007 -0600
> @@ -260,6 +260,16 @@ static void do_debug_key(unsigned char k
>                               bit. */
>  }
> +static void got_ipi(void *info)
> +{
> +    printk("CPU %u got IPI\n", smp_processor_id());
> +}
> +
> +void do_ipi_test(unsigned char key)
> +{
> +