Re: FYI: Re: [XenPPC] problem of ssh to domU on js21
Hi Jimi,

It's great you can reproduce the problem. I do not recall seeing those lines. I will check in 30 minutes; right now the environment is being rebuilt.

Thanks,
Hao

Jimi Xenidis <[EMAIL PROTECTED] .com>
11/01/2006 03:19 PM
To: Hao Yu/Watson/[EMAIL PROTECTED]
cc: xen-ppc-devel@lists.xensource.com
Subject: Re: FYI: Re: [XenPPC] problem of ssh to domU on js21

Thanks Hao, I'm able to reproduce this.
The only reason we can actually use anything is because networking is so forgiving.
I think I have the solution to this; it will require some rewriting, which should get done by the end of the week.

BTW: you should have gotten lines like:
  (XEN) (file=grant_table.c, line=356) Bad handle (2).
  (XEN) (file=grant_table.c, line=356) Bad handle (13).
  (XEN) (file=grant_table.c, line=356) Bad handle (6).
  gnt_unmap: -2
  (XEN) (file=grant_table.c, line=356) Bad handle (16).
  gnt_unmap: -2
  (XEN) (file=grant_table.c, line=356) Bad handle (2).
out of the machine console. Do you see those as well?

Thanks.
-JX

On Nov 1, 2006, at 12:18 PM, Hao Yu wrote:
> Hi, Here is the backtrace message from the 0:mon> console
>
> 0:mon> t
> [c06ab530] c02dbf7c .network_tx_buf_gc+0x11c/0x2f0 (unreliable)
> [c06ab610] c02deed4 .netif_int+0x54/0x120
> [c06ab6b0] c008b824 .handle_IRQ_event+0x84/0x100
> [c06ab750] c008ba70 .__do_IRQ+0x1d0/0x2b0
> [c06ab810] c02c777c .evtchn_do_upcall+0x11c/0x170
> [c06ab8d0] c00445a0 .xen_get_irq+0x10/0x30
> [c06ab950] c000bf10 .do_IRQ+0x70/0x100
> [c06ab9d0] c00041ec hardware_interrupt_entry+0xc/0x10
> --- Exception: 501 (Hardware Interrupt) at c003bcc0 .plpar_hcall_norets+0x10/0x1c
> [link register ] c0045534 .HYPERVISOR_sched_op+0x124/0x150
> [c06abcc0] c05a06e0 (unreliable)
> [c06abd70] c004609c .xen_power_save+0x7c/0xa0
> [c06abdf0] c0012060 .cpu_idle+0xe0/0x150
> [c06abe70] c00095dc .rest_init+0x3c/0x60
> [c06abef0] c052d958 .start_kernel+0x278/0x2e0
> [c06abf90] c00084fc .start_here_common+0x50/0x54
> 0:mon> X
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=32
> Modules linked in:
> NIP: C02DBFB0 LR: C02DBF8C CTR: C02DEE80
> REGS: c06ab2b0 TRAP: 0300 Not tainted (2.6.17-Xen)
> MSR: 80001432 CR: 2882 XER: 6005
> DAR: 00C2, DSISR: 4000
> TASK = c05748a0[0] 'swapper' THREAD: c06a8000 CPU: 0
> GPR00: C06AB530 C06A9718 C0003FFB9678
> GPR04: C071B5D8 C06AB510
> GPR08: 0003 D8008000 00C2
> GPR12: 80009032 C0575100
> GPR16: C0003FFB8000
> GPR20: C0568100 C0003FFB8670 0004 004C
> GPR24: 0048 0020 0002 004B
> GPR28: 0004 C0003FFB9680 C05FB020 C0003FFB8500
> NIP [C02DBFB0] .network_tx_buf_gc+0x150/0x2f0
> LR [C02DBF8C] .network_tx_buf_gc+0x12c/0x2f0
> Call Trace:
> [C06AB530] [C02DBF7C] .network_tx_buf_gc+0x11c/0x2f0 (unreliable)
> [C06AB610] [C02DEED4] .netif_int+0x54/0x120
> [C06AB6B0] [C008B824] .handle_IRQ_event+0x84/0x100
> [C06AB750] [C008BA70] .__do_IRQ+0x1d0/0x2b0
> [C06AB810] [C02C777C] .evtchn_do_upcall+0x11c/0x170
> [C06AB8D0
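
For anyone checking a saved console capture for the "Bad handle" messages Jimi quotes above, a minimal sketch follows. It assumes the machine console output has been saved as plain text and that the lines look exactly like the ones quoted in this thread. The function name `count_bad_handles` and the `sample` text are hypothetical illustrations, not part of Xen or any Xen tool.

```python
import re
from collections import Counter

# Matches hypervisor lines of the form quoted in this thread, e.g.
#   (XEN) (file=grant_table.c, line=356) Bad handle (2).
# The exact format is an assumption taken from the quoted messages.
BAD_HANDLE = re.compile(
    r"\(XEN\) \(file=grant_table\.c, line=\d+\) Bad handle \((\d+)\)\."
)

def count_bad_handles(console_text):
    """Return a Counter mapping grant handle number to occurrence count."""
    return Counter(int(m.group(1)) for m in BAD_HANDLE.finditer(console_text))

# Hypothetical sample standing in for a saved console capture.
sample = """(XEN) (file=grant_table.c, line=356) Bad handle (2).
(XEN) (file=grant_table.c, line=356) Bad handle (13).
gnt_unmap: -2
(XEN) (file=grant_table.c, line=356) Bad handle (2).
"""

counts = count_bad_handles(sample)
for handle, n in sorted(counts.items()):
    print(f"handle {handle}: {n} time(s)")
```

An empty result means no matching lines were found in the capture, which may also mean the console output was not being logged at the time.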
Re: FYI: Re: [XenPPC] problem of ssh to domU on js21
Thanks Hao, I'm able to reproduce this.
The only reason we can actually use anything is because networking is so forgiving.
I think I have the solution to this; it will require some rewriting, which should get done by the end of the week.

BTW: you should have gotten lines like:
  (XEN) (file=grant_table.c, line=356) Bad handle (2).
  (XEN) (file=grant_table.c, line=356) Bad handle (13).
  (XEN) (file=grant_table.c, line=356) Bad handle (6).
  gnt_unmap: -2
  (XEN) (file=grant_table.c, line=356) Bad handle (16).
  gnt_unmap: -2
  (XEN) (file=grant_table.c, line=356) Bad handle (2).
out of the machine console. Do you see those as well?

Thanks.
-JX

On Nov 1, 2006, at 12:18 PM, Hao Yu wrote:

Hi, Here is the backtrace message from the 0:mon> console

0:mon> t
[c06ab530] c02dbf7c .network_tx_buf_gc+0x11c/0x2f0 (unreliable)
[c06ab610] c02deed4 .netif_int+0x54/0x120
[c06ab6b0] c008b824 .handle_IRQ_event+0x84/0x100
[c06ab750] c008ba70 .__do_IRQ+0x1d0/0x2b0
[c06ab810] c02c777c .evtchn_do_upcall+0x11c/0x170
[c06ab8d0] c00445a0 .xen_get_irq+0x10/0x30
[c06ab950] c000bf10 .do_IRQ+0x70/0x100
[c06ab9d0] c00041ec hardware_interrupt_entry+0xc/0x10
--- Exception: 501 (Hardware Interrupt) at c003bcc0 .plpar_hcall_norets+0x10/0x1c
[link register ] c0045534 .HYPERVISOR_sched_op+0x124/0x150
[c06abcc0] c05a06e0 (unreliable)
[c06abd70] c004609c .xen_power_save+0x7c/0xa0
[c06abdf0] c0012060 .cpu_idle+0xe0/0x150
[c06abe70] c00095dc .rest_init+0x3c/0x60
[c06abef0] c052d958 .start_kernel+0x278/0x2e0
[c06abf90] c00084fc .start_here_common+0x50/0x54
0:mon> X
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=32
Modules linked in:
NIP: C02DBFB0 LR: C02DBF8C CTR: C02DEE80
REGS: c06ab2b0 TRAP: 0300 Not tainted (2.6.17-Xen)
MSR: 80001432 CR: 2882 XER: 6005
DAR: 00C2, DSISR: 4000
TASK = c05748a0[0] 'swapper' THREAD: c06a8000 CPU: 0
GPR00: C06AB530 C06A9718 C0003FFB9678
GPR04: C071B5D8 C06AB510
GPR08: 0003 D8008000 00C2
GPR12: 80009032 C0575100
GPR16: C0003FFB8000
GPR20: C0568100 C0003FFB8670 0004 004C
GPR24: 0048 0020 0002 004B
GPR28: 0004 C0003FFB9680 C05FB020 C0003FFB8500
NIP [C02DBFB0] .network_tx_buf_gc+0x150/0x2f0
LR [C02DBF8C] .network_tx_buf_gc+0x12c/0x2f0
Call Trace:
[C06AB530] [C02DBF7C] .network_tx_buf_gc+0x11c/0x2f0 (unreliable)
[C06AB610] [C02DEED4] .netif_int+0x54/0x120
[C06AB6B0] [C008B824] .handle_IRQ_event+0x84/0x100
[C06AB750] [C008BA70] .__do_IRQ+0x1d0/0x2b0
[C06AB810] [C02C777C] .evtchn_do_upcall+0x11c/0x170
[C06AB8D0] [C00445A0] .xen_get_irq+0x10/0x30
[C06AB950] [C000BF10] .do_IRQ+0x70/0x100
[C06AB9D0] [C00041EC] hardware_interrupt_entry+0xc/0x10
--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c
    LR = .HYPERVISOR_sched_op+0x124/0x150
[C06ABCC0] [C05A06E0] 0xc05a06e0 (unreliable)
[C06ABD70] [C004609C] .xen_power_save+0x7c/0xa0
[C06ABDF0] [C0012060] .cpu_idle+0xe0/0x150
[C06ABE70] [C00095DC] .rest_init+0x3c/0x60
[C06ABEF0] [C052D958] .start_kernel+0x278/0x2e0
[C06ABF90] [C00084FC] .start_here_common+0x50/0x54
Instruction dump:
809d000c 387f1178 4bfec5c9 6000 3800 397a00c0 901d000c 6000
e93f0170 7d35c92a fb9f0170 7c2004ac <7c005828> 3000 7c00592d 40c2fff4
<0>Kernel panic - not syncing: Fatal exception in interrupt
<0>Rebooting in 180 seconds..

Hao Yu

Jimi Xenidis <[EMAIL PROTECTED] .com>
11/01/2006 12:01 PM
To: Hao Yu/Watson/[EMAIL PROTECTED]
cc: xen-ppc- [EMAIL PROTECTED]
Subject: FYI: Re: [XenPPC] problem of ssh to domU on js21

Thank you Hao for posting the issue to the list!
I'll be posting this highly experimental patch to the list shortly.
-JX

On Nov 1, 2006, at 11:55 AM, Hao Yu wrote:

With Jimi's new patch, the ping works fine, lasting forever (stay
Re: FYI: Re: [XenPPC] problem of ssh to domU on js21
Hi Jimi,

I just found out the problem is not limited to 'ssh'. dom0 breaks when I try to invoke java in dom0. Here are the console messages and backtraces. I hope the info helps with the potentially broken Xen you mentioned when you put out the two patches.

Thanks,
Hao

cso83:/ # cpu 0x0: Vector: 400 (Instruction Access) at [c00031873090]
pc: : .__start+0x4000/0x8
lr: c00b5e10: .get_vma_policy+0x60/0xd0
sp: c00031873310
msr: 800040009032
current = 0xc00032626040
paca= 0xc058c100
pid = 7287, comm = java
enter ? for help
0:mon>
0:mon> t
[link register ] c00b5e10 .get_vma_policy+0x60/0xd0
[c00031873310] d02a0fd8 .cxiTraceExit+0x78/0x98 [mmfslinux] (unreliable)
[c00031873390] c00b73f4 .alloc_page_vma+0x34/0x140
[c00031873430] c00a577c .__handle_mm_fault+0xb5c/0xf90
[c00031873540] c0032868 .do_page_fault+0x558/0x830
[c00031873720] c00048e0 .handle_page_fault+0x20/0x54
--- Exception: 301 (Data Access) at c0037830 .__clear_user+0x14/0x7c
[link register ] c0118b28 .padzero+0x88/0x130
[c00031873a10] c0118b00 .padzero+0x60/0x130 (unreliable)
[c00031873aa0] c011a40c .load_elf_binary+0x8fc/0x1c70
[c00031873c30] c00dc4c8 .search_binary_handler+0xe8/0x3f0
[c00031873ce0] c0110870 .compat_do_execve+0x1a0/0x2c0
[c00031873d90] c0017724 .compat_sys_execve+0x74/0x100
[c00031873e30] c000861c syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 0fe10c68
SP (ffde07f0) is in userspace
0:mon> X
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=32 NUMA
Modules linked in: mmfs mmfslinux tracedev
NIP: LR: C00B5E10 CTR:
REGS: c00031873090 TRAP: 0400 Tainted: PF (2.6.17-Xen)
MSR: 800040009032 CR: 84224488 XER:
TASK = c00032626040[7287] 'java' THREAD: c0003187 CPU: 0
GPR00: C00031873310 C00075CE59E8
GPR04: 10024316 10024316 C00AFFFC C00075CE59E8
GPR08: D02E5E58
GPR12: D02B97E8 C058C100 6DB6DB6DB6DB6DB7
GPR16: C000318734A0 100214E8 C000
GPR20: C00075DB7000 C00075CE59E8 C000756A0998 10024316
GPR24: 10024000 C00031ADE210 0120 10024316
GPR28: 000200D2 C00031AD8120 C05BB350 C00075CE59E8
NIP [] .__start+0x4000/0x8
LR [C00B5E10] .get_vma_policy+0x60/0xd0
Call Trace:
[C00031873310] [D02A0FD8] .cxiTraceExit+0x78/0x98 [mmfslinux] (unreliable)
[C00031873390] [C00B73F4] .alloc_page_vma+0x34/0x140
[C00031873430] [C00A577C] .__handle_mm_fault+0xb5c/0xf90
[C00031873540] [C0032868] .do_page_fault+0x558/0x830
[C00031873720] [C00048E0] .handle_page_fault+0x20/0x54
--- Exception: 301 at .__clear_user+0x14/0x7c
    LR = .padzero+0x88/0x130
[C00031873A10] [C0118B00] .padzero+0x60/0x130 (unreliable)
[C00031873AA0] [C011A40C] .load_elf_binary+0x8fc/0x1c70
[C00031873C30] [C00DC4C8] .search_binary_handler+0xe8/0x3f0
[C00031873CE0] [C0110870] .compat_do_execve+0x1a0/0x2c0
[C00031873D90] [C0017724] .compat_sys_execve+0x74/0x100
[C00031873E30] [C000861C] syscall_exit+0x0/0x40
Instruction dump:
<3>Badness in __mutex_unlock_slowpath at /root/xen-maria-latest/linux/linux-xen-ppc/kernel/mutex.c:209
Call Trace:
[C00031B7F240] [C000F864] .show_stack+0x54/0x1f0 (unreliable)
[C00031B7F2F0] [C00278E8] .program_check_exception+0x508/0x6a0
[C00031B7F3D0] [C00044EC] program_check_common+0xec/0x100
--- Exception: 700 at .__mutex_unlock_slowpath+0x21c/0x230
    LR = .gpfs_fill_super+0x768/0x818 [mmfslinux]
[C00031B7F6C0] [C00031B7F760] 0xc00031b7f760 (unreliable)
[C00031B7F760] [D02AD5B4] .gpfs_fill_super+0x768/0x818 [mmfslinux]
[C00031B7F870] [C00D6F68] .get_sb_nodev+0x88/0x150
[C00031B7F910] [D02AD6BC] .gpfs_get_sb+0x58/0xd0 [mmfslinux]
[C00031B7F9A0] [C00D66EC] .vfs_kern_mount+0x7c/0x170
[C00031B7FA40] [C00D683C] .do_kern_mount+0x4c/0x80
[C00031B7FAE0] [C00F795C] .do_mount+0x31c/0x8f0
[C00031B7FD70] [C0110A68] .compat_sys_mount+0xd8/0x2b0
[C00031B7FE30] [C000861C] syscall_exit+0x0/0x40

Jimi Xenidis <[EMAIL PROTECTED]
Re: FYI: Re: [XenPPC] problem of ssh to domU on js21
Hi, Here is the backtrace message from the 0:mon> console

0:mon> t
[c06ab530] c02dbf7c .network_tx_buf_gc+0x11c/0x2f0 (unreliable)
[c06ab610] c02deed4 .netif_int+0x54/0x120
[c06ab6b0] c008b824 .handle_IRQ_event+0x84/0x100
[c06ab750] c008ba70 .__do_IRQ+0x1d0/0x2b0
[c06ab810] c02c777c .evtchn_do_upcall+0x11c/0x170
[c06ab8d0] c00445a0 .xen_get_irq+0x10/0x30
[c06ab950] c000bf10 .do_IRQ+0x70/0x100
[c06ab9d0] c00041ec hardware_interrupt_entry+0xc/0x10
--- Exception: 501 (Hardware Interrupt) at c003bcc0 .plpar_hcall_norets+0x10/0x1c
[link register ] c0045534 .HYPERVISOR_sched_op+0x124/0x150
[c06abcc0] c05a06e0 (unreliable)
[c06abd70] c004609c .xen_power_save+0x7c/0xa0
[c06abdf0] c0012060 .cpu_idle+0xe0/0x150
[c06abe70] c00095dc .rest_init+0x3c/0x60
[c06abef0] c052d958 .start_kernel+0x278/0x2e0
[c06abf90] c00084fc .start_here_common+0x50/0x54
0:mon> X
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=32
Modules linked in:
NIP: C02DBFB0 LR: C02DBF8C CTR: C02DEE80
REGS: c06ab2b0 TRAP: 0300 Not tainted (2.6.17-Xen)
MSR: 80001432 CR: 2882 XER: 6005
DAR: 00C2, DSISR: 4000
TASK = c05748a0[0] 'swapper' THREAD: c06a8000 CPU: 0
GPR00: C06AB530 C06A9718 C0003FFB9678
GPR04: C071B5D8 C06AB510
GPR08: 0003 D8008000 00C2
GPR12: 80009032 C0575100
GPR16: C0003FFB8000
GPR20: C0568100 C0003FFB8670 0004 004C
GPR24: 0048 0020 0002 004B
GPR28: 0004 C0003FFB9680 C05FB020 C0003FFB8500
NIP [C02DBFB0] .network_tx_buf_gc+0x150/0x2f0
LR [C02DBF8C] .network_tx_buf_gc+0x12c/0x2f0
Call Trace:
[C06AB530] [C02DBF7C] .network_tx_buf_gc+0x11c/0x2f0 (unreliable)
[C06AB610] [C02DEED4] .netif_int+0x54/0x120
[C06AB6B0] [C008B824] .handle_IRQ_event+0x84/0x100
[C06AB750] [C008BA70] .__do_IRQ+0x1d0/0x2b0
[C06AB810] [C02C777C] .evtchn_do_upcall+0x11c/0x170
[C06AB8D0] [C00445A0] .xen_get_irq+0x10/0x30
[C06AB950] [C000BF10] .do_IRQ+0x70/0x100
[C06AB9D0] [C00041EC] hardware_interrupt_entry+0xc/0x10
--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c
    LR = .HYPERVISOR_sched_op+0x124/0x150
[C06ABCC0] [C05A06E0] 0xc05a06e0 (unreliable)
[C06ABD70] [C004609C] .xen_power_save+0x7c/0xa0
[C06ABDF0] [C0012060] .cpu_idle+0xe0/0x150
[C06ABE70] [C00095DC] .rest_init+0x3c/0x60
[C06ABEF0] [C052D958] .start_kernel+0x278/0x2e0
[C06ABF90] [C00084FC] .start_here_common+0x50/0x54
Instruction dump:
809d000c 387f1178 4bfec5c9 6000 3800 397a00c0 901d000c 6000
e93f0170 7d35c92a fb9f0170 7c2004ac <7c005828> 3000 7c00592d 40c2fff4
<0>Kernel panic - not syncing: Fatal exception in interrupt
<0>Rebooting in 180 seconds..

Hao Yu

Jimi Xenidis <[EMAIL PROTECTED] .com>
11/01/2006 12:01 PM
To: Hao Yu/Watson/[EMAIL PROTECTED]
cc: xen-ppc-devel@lists.xensource.com
Subject: FYI: Re: [XenPPC] problem of ssh to domU on js21

Thank you Hao for posting the issue to the list!
I'll be posting this highly experimental patch to the list shortly.
-JX

On Nov 1, 2006, at 11:55 AM, Hao Yu wrote:
>
> With Jimi's new patch, the ping works fine, lasting forever (stayed for 1.5
> hours). However, I could not stop it using ^C or ^Z.
>