Re: FYI: Re: [XenPPC] problem of ssh to domU on js21

2006-11-01 Thread Hao Yu
Hi Jimi,

It's great that you can reproduce the problem. I do not recall seeing
those lines. I will check within 30 minutes; right now the environment is
being rebuilt.

Thanks,
Hao



   
From:    Jimi Xenidis <[EMAIL PROTECTED].com>
To:      Hao Yu/Watson/[EMAIL PROTECTED]
cc:      xen-ppc-devel@lists.xensource.com
Date:    11/01/2006 03:19 PM
Subject: Re: FYI: Re: [XenPPC] problem of ssh to domU on js21


Thanks Hao, I'm able to reproduce this.
The only reason we can actually use anything is because networking is
so forgiving.
I think I have the solution to this; it will require some rewriting,
which should get done by the end of the week.

BTW: you should have gotten lines like:
   (XEN) (file=grant_table.c, line=356) Bad handle (2).
   (XEN) (file=grant_table.c, line=356) Bad handle (13).
   (XEN) (file=grant_table.c, line=356) Bad handle (6).
   gnt_unmap: -2
   (XEN) (file=grant_table.c, line=356) Bad handle (16).
   gnt_unmap: -2
   (XEN) (file=grant_table.c, line=356) Bad handle (2).

out of the machine console. Do you see those as well?

Thanks.
-JX
On Nov 1, 2006, at 12:18 PM, Hao Yu wrote:

> Hi, here is the backtrace from the 0:mon> console:
>
> 0:mon> t
> [c06ab530] c02dbf7c .network_tx_buf_gc+0x11c/0x2f0
> (unreliable)
> [c06ab610] c02deed4 .netif_int+0x54/0x120
> [c06ab6b0] c008b824 .handle_IRQ_event+0x84/0x100
> [c06ab750] c008ba70 .__do_IRQ+0x1d0/0x2b0
> [c06ab810] c02c777c .evtchn_do_upcall+0x11c/0x170
> [c06ab8d0] c00445a0 .xen_get_irq+0x10/0x30
> [c06ab950] c000bf10 .do_IRQ+0x70/0x100
> [c06ab9d0] c00041ec hardware_interrupt_entry+0xc/0x10
> --- Exception: 501 (Hardware Interrupt) at c003bcc0
> .plpar_hcall_norets+0x10/0x1c
> [link register   ] c0045534 .HYPERVISOR_sched_op+0x124/0x150
> [c06abcc0] c05a06e0 (unreliable)
> [c06abd70] c004609c .xen_power_save+0x7c/0xa0
> [c06abdf0] c0012060 .cpu_idle+0xe0/0x150
> [c06abe70] c00095dc .rest_init+0x3c/0x60
> [c06abef0] c052d958 .start_kernel+0x278/0x2e0
> [c06abf90] c00084fc .start_here_common+0x50/0x54
> 0:mon> X
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=32
> Modules linked in:
> NIP: C02DBFB0 LR: C02DBF8C CTR: C02DEE80
> REGS: c06ab2b0 TRAP: 0300   Not tainted  (2.6.17-Xen)
> MSR: 80001432   CR: 2882  XER: 6005
> DAR: 00C2, DSISR: 4000
> TASK = c05748a0[0] 'swapper' THREAD: c06a8000 CPU: 0
> GPR00:  C06AB530 C06A9718 C0003FFB9678
> GPR04:  C071B5D8 C06AB510
> GPR08:  0003 D8008000 00C2
> GPR12: 80009032 C0575100
> GPR16:    C0003FFB8000
> GPR20: C0568100 C0003FFB8670 0004 004C
> GPR24: 0048 0020 0002 004B
> GPR28: 0004 C0003FFB9680 C05FB020 C0003FFB8500
> NIP [C02DBFB0] .network_tx_buf_gc+0x150/0x2f0
> LR [C02DBF8C] .network_tx_buf_gc+0x12c/0x2f0
> Call Trace:
> [C06AB530] [C02DBF7C] .network_tx_buf_gc+0x11c/0x2f0
> (unreliable)
> [C06AB610] [C02DEED4] .netif_int+0x54/0x120
> [C06AB6B0] [C008B824] .handle_IRQ_event+0x84/0x100
> [C06AB750] [C008BA70] .__do_IRQ+0x1d0/0x2b0
> [C06AB810] [C02C777C] .evtchn_do_upcall+0x11c/0x170
> [C06AB8D0

Re: FYI: Re: [XenPPC] problem of ssh to domU on js21

2006-11-01 Thread Jimi Xenidis

Thanks Hao, I'm able to reproduce this.
The only reason we can actually use anything is because networking is
so forgiving.
I think I have the solution to this; it will require some rewriting,
which should get done by the end of the week.


BTW: you should have gotten lines like:
  (XEN) (file=grant_table.c, line=356) Bad handle (2).
  (XEN) (file=grant_table.c, line=356) Bad handle (13).
  (XEN) (file=grant_table.c, line=356) Bad handle (6).
  gnt_unmap: -2
  (XEN) (file=grant_table.c, line=356) Bad handle (16).
  gnt_unmap: -2
  (XEN) (file=grant_table.c, line=356) Bad handle (2).

out of the machine console. Do you see those as well?

Thanks.
-JX
On Nov 1, 2006, at 12:18 PM, Hao Yu wrote:


Hi, here is the backtrace from the 0:mon> console:

0:mon> t
[c06ab530] c02dbf7c .network_tx_buf_gc+0x11c/0x2f0
(unreliable)
[c06ab610] c02deed4 .netif_int+0x54/0x120
[c06ab6b0] c008b824 .handle_IRQ_event+0x84/0x100
[c06ab750] c008ba70 .__do_IRQ+0x1d0/0x2b0
[c06ab810] c02c777c .evtchn_do_upcall+0x11c/0x170
[c06ab8d0] c00445a0 .xen_get_irq+0x10/0x30
[c06ab950] c000bf10 .do_IRQ+0x70/0x100
[c06ab9d0] c00041ec hardware_interrupt_entry+0xc/0x10
--- Exception: 501 (Hardware Interrupt) at c003bcc0
.plpar_hcall_norets+0x10/0x1c
[link register   ] c0045534 .HYPERVISOR_sched_op+0x124/0x150
[c06abcc0] c05a06e0 (unreliable)
[c06abd70] c004609c .xen_power_save+0x7c/0xa0
[c06abdf0] c0012060 .cpu_idle+0xe0/0x150
[c06abe70] c00095dc .rest_init+0x3c/0x60
[c06abef0] c052d958 .start_kernel+0x278/0x2e0
[c06abf90] c00084fc .start_here_common+0x50/0x54
0:mon> X
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=32
Modules linked in:
NIP: C02DBFB0 LR: C02DBF8C CTR: C02DEE80
REGS: c06ab2b0 TRAP: 0300   Not tainted  (2.6.17-Xen)
MSR: 80001432   CR: 2882  XER: 6005
DAR: 00C2, DSISR: 4000
TASK = c05748a0[0] 'swapper' THREAD: c06a8000 CPU: 0
GPR00:  C06AB530 C06A9718 C0003FFB9678
GPR04:  C071B5D8 C06AB510
GPR08:  0003 D8008000 00C2
GPR12: 80009032 C0575100
GPR16:    C0003FFB8000
GPR20: C0568100 C0003FFB8670 0004 004C
GPR24: 0048 0020 0002 004B
GPR28: 0004 C0003FFB9680 C05FB020 C0003FFB8500

NIP [C02DBFB0] .network_tx_buf_gc+0x150/0x2f0
LR [C02DBF8C] .network_tx_buf_gc+0x12c/0x2f0
Call Trace:
[C06AB530] [C02DBF7C] .network_tx_buf_gc+0x11c/0x2f0
(unreliable)
[C06AB610] [C02DEED4] .netif_int+0x54/0x120
[C06AB6B0] [C008B824] .handle_IRQ_event+0x84/0x100
[C06AB750] [C008BA70] .__do_IRQ+0x1d0/0x2b0
[C06AB810] [C02C777C] .evtchn_do_upcall+0x11c/0x170
[C06AB8D0] [C00445A0] .xen_get_irq+0x10/0x30
[C06AB950] [C000BF10] .do_IRQ+0x70/0x100
[C06AB9D0] [C00041EC] hardware_interrupt_entry+0xc/0x10

--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c
LR = .HYPERVISOR_sched_op+0x124/0x150
[C06ABCC0] [C05A06E0] 0xc05a06e0 (unreliable)
[C06ABD70] [C004609C] .xen_power_save+0x7c/0xa0
[C06ABDF0] [C0012060] .cpu_idle+0xe0/0x150
[C06ABE70] [C00095DC] .rest_init+0x3c/0x60
[C06ABEF0] [C052D958] .start_kernel+0x278/0x2e0
[C06ABF90] [C00084FC] .start_here_common+0x50/0x54
Instruction dump:
809d000c 387f1178 4bfec5c9 6000 3800 397a00c0 901d000c 6000
e93f0170 7d35c92a fb9f0170 7c2004ac <7c005828> 3000 7c00592d 40c2fff4

 <0>Kernel panic - not syncing: Fatal exception in interrupt
 <0>Rebooting in 180 seconds..

Hao Yu





From:    Jimi Xenidis <[EMAIL PROTECTED].com>
To:      Hao Yu/Watson/[EMAIL PROTECTED]
cc:      xen-ppc-[EMAIL PROTECTED]
Date:    11/01/2006 12:01 PM
Subject: FYI: Re: [XenPPC] problem of ssh to domU on js21










Thank you Hao for posting the issue to the list!
I'll be posting this highly experimental patch to the list shortly.
-JX

On Nov 1, 2006, at 11:55 AM, Hao Yu wrote:



With Jimi's new patch, the ping works fine, lasting forever (stay

Re: FYI: Re: [XenPPC] problem of ssh to domU on js21

2006-11-01 Thread Hao Yu
Hi Jimi,

I just found out that the problem is not limited to ssh: dom0 also breaks
when I try to invoke java in dom0. Here are the console messages and
backtraces. I hope this information helps with the potentially broken Xen
you mentioned when you posted the two patches.

Thanks,
Hao

cso83:/ # cpu 0x0: Vector: 400 (Instruction Access) at [c00031873090]
pc: : .__start+0x4000/0x8
lr: c00b5e10: .get_vma_policy+0x60/0xd0
sp: c00031873310
   msr: 800040009032
  current = 0xc00032626040
  paca= 0xc058c100
pid   = 7287, comm = java
enter ? for help
0:mon>

0:mon> t
[link register   ] c00b5e10 .get_vma_policy+0x60/0xd0
[c00031873310] d02a0fd8 .cxiTraceExit+0x78/0x98 [mmfslinux]
(unreliable)
[c00031873390] c00b73f4 .alloc_page_vma+0x34/0x140
[c00031873430] c00a577c .__handle_mm_fault+0xb5c/0xf90
[c00031873540] c0032868 .do_page_fault+0x558/0x830
[c00031873720] c00048e0 .handle_page_fault+0x20/0x54
--- Exception: 301 (Data Access) at c0037830
.__clear_user+0x14/0x7c
[link register   ] c0118b28 .padzero+0x88/0x130
[c00031873a10] c0118b00 .padzero+0x60/0x130 (unreliable)
[c00031873aa0] c011a40c .load_elf_binary+0x8fc/0x1c70
[c00031873c30] c00dc4c8 .search_binary_handler+0xe8/0x3f0
[c00031873ce0] c0110870 .compat_do_execve+0x1a0/0x2c0
[c00031873d90] c0017724 .compat_sys_execve+0x74/0x100
[c00031873e30] c000861c syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 0fe10c68
SP (ffde07f0) is in userspace


0:mon> X
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=32 NUMA
Modules linked in: mmfs mmfslinux tracedev
NIP:  LR: C00B5E10 CTR: 
REGS: c00031873090 TRAP: 0400   Tainted: PF  (2.6.17-Xen)
MSR: 800040009032   CR: 84224488  XER: 
TASK = c00032626040[7287] 'java' THREAD: c0003187 CPU: 0
GPR00:  C00031873310  C00075CE59E8
GPR04: 10024316 10024316 C00AFFFC C00075CE59E8
GPR08:  D02E5E58  
GPR12: D02B97E8 C058C100 6DB6DB6DB6DB6DB7 
GPR16: C000318734A0 100214E8  C000
GPR20: C00075DB7000 C00075CE59E8 C000756A0998 10024316
GPR24: 10024000 C00031ADE210 0120 10024316
GPR28: 000200D2 C00031AD8120 C05BB350 C00075CE59E8
NIP [] .__start+0x4000/0x8
LR [C00B5E10] .get_vma_policy+0x60/0xd0
Call Trace:
[C00031873310] [D02A0FD8] .cxiTraceExit+0x78/0x98 [mmfslinux]
(unreliable)
[C00031873390] [C00B73F4] .alloc_page_vma+0x34/0x140
[C00031873430] [C00A577C] .__handle_mm_fault+0xb5c/0xf90
[C00031873540] [C0032868] .do_page_fault+0x558/0x830
[C00031873720] [C00048E0] .handle_page_fault+0x20/0x54
--- Exception: 301 at .__clear_user+0x14/0x7c
LR = .padzero+0x88/0x130
[C00031873A10] [C0118B00] .padzero+0x60/0x130 (unreliable)
[C00031873AA0] [C011A40C] .load_elf_binary+0x8fc/0x1c70
[C00031873C30] [C00DC4C8] .search_binary_handler+0xe8/0x3f0
[C00031873CE0] [C0110870] .compat_do_execve+0x1a0/0x2c0
[C00031873D90] [C0017724] .compat_sys_execve+0x74/0x100
[C00031873E30] [C000861C] syscall_exit+0x0/0x40
Instruction dump:
       
       
 <3>Badness in __mutex_unlock_slowpath at
/root/xen-maria-latest/linux/linux-xen-ppc/kernel/mutex.c:209
Call Trace:
[C00031B7F240] [C000F864] .show_stack+0x54/0x1f0 (unreliable)
[C00031B7F2F0] [C00278E8] .program_check_exception+0x508/0x6a0
[C00031B7F3D0] [C00044EC] program_check_common+0xec/0x100
--- Exception: 700 at .__mutex_unlock_slowpath+0x21c/0x230
LR = .gpfs_fill_super+0x768/0x818 [mmfslinux]
[C00031B7F6C0] [C00031B7F760] 0xc00031b7f760 (unreliable)
[C00031B7F760] [D02AD5B4] .gpfs_fill_super+0x768/0x818
[mmfslinux]
[C00031B7F870] [C00D6F68] .get_sb_nodev+0x88/0x150
[C00031B7F910] [D02AD6BC] .gpfs_get_sb+0x58/0xd0 [mmfslinux]
[C00031B7F9A0] [C00D66EC] .vfs_kern_mount+0x7c/0x170
[C00031B7FA40] [C00D683C] .do_kern_mount+0x4c/0x80
[C00031B7FAE0] [C00F795C] .do_mount+0x31c/0x8f0
[C00031B7FD70] [C0110A68] .compat_sys_mount+0xd8/0x2b0
[C00031B7FE30] [C000861C] syscall_exit+0x0/0x40






   

Re: FYI: Re: [XenPPC] problem of ssh to domU on js21

2006-11-01 Thread Hao Yu
Hi, here is the backtrace from the 0:mon> console:

0:mon> t
[c06ab530] c02dbf7c .network_tx_buf_gc+0x11c/0x2f0
(unreliable)
[c06ab610] c02deed4 .netif_int+0x54/0x120
[c06ab6b0] c008b824 .handle_IRQ_event+0x84/0x100
[c06ab750] c008ba70 .__do_IRQ+0x1d0/0x2b0
[c06ab810] c02c777c .evtchn_do_upcall+0x11c/0x170
[c06ab8d0] c00445a0 .xen_get_irq+0x10/0x30
[c06ab950] c000bf10 .do_IRQ+0x70/0x100
[c06ab9d0] c00041ec hardware_interrupt_entry+0xc/0x10
--- Exception: 501 (Hardware Interrupt) at c003bcc0
.plpar_hcall_norets+0x10/0x1c
[link register   ] c0045534 .HYPERVISOR_sched_op+0x124/0x150
[c06abcc0] c05a06e0 (unreliable)
[c06abd70] c004609c .xen_power_save+0x7c/0xa0
[c06abdf0] c0012060 .cpu_idle+0xe0/0x150
[c06abe70] c00095dc .rest_init+0x3c/0x60
[c06abef0] c052d958 .start_kernel+0x278/0x2e0
[c06abf90] c00084fc .start_here_common+0x50/0x54
0:mon> X
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=32
Modules linked in:
NIP: C02DBFB0 LR: C02DBF8C CTR: C02DEE80
REGS: c06ab2b0 TRAP: 0300   Not tainted  (2.6.17-Xen)
MSR: 80001432   CR: 2882  XER: 6005
DAR: 00C2, DSISR: 4000
TASK = c05748a0[0] 'swapper' THREAD: c06a8000 CPU: 0
GPR00:  C06AB530 C06A9718 C0003FFB9678
GPR04:  C071B5D8 C06AB510 
GPR08:  0003 D8008000 00C2
GPR12: 80009032 C0575100  
GPR16:    C0003FFB8000
GPR20: C0568100 C0003FFB8670 0004 004C
GPR24: 0048 0020 0002 004B
GPR28: 0004 C0003FFB9680 C05FB020 C0003FFB8500
NIP [C02DBFB0] .network_tx_buf_gc+0x150/0x2f0
LR [C02DBF8C] .network_tx_buf_gc+0x12c/0x2f0
Call Trace:
[C06AB530] [C02DBF7C] .network_tx_buf_gc+0x11c/0x2f0
(unreliable)
[C06AB610] [C02DEED4] .netif_int+0x54/0x120
[C06AB6B0] [C008B824] .handle_IRQ_event+0x84/0x100
[C06AB750] [C008BA70] .__do_IRQ+0x1d0/0x2b0
[C06AB810] [C02C777C] .evtchn_do_upcall+0x11c/0x170
[C06AB8D0] [C00445A0] .xen_get_irq+0x10/0x30
[C06AB950] [C000BF10] .do_IRQ+0x70/0x100
[C06AB9D0] [C00041EC] hardware_interrupt_entry+0xc/0x10
--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c
LR = .HYPERVISOR_sched_op+0x124/0x150
[C06ABCC0] [C05A06E0] 0xc05a06e0 (unreliable)
[C06ABD70] [C004609C] .xen_power_save+0x7c/0xa0
[C06ABDF0] [C0012060] .cpu_idle+0xe0/0x150
[C06ABE70] [C00095DC] .rest_init+0x3c/0x60
[C06ABEF0] [C052D958] .start_kernel+0x278/0x2e0
[C06ABF90] [C00084FC] .start_here_common+0x50/0x54
Instruction dump:
809d000c 387f1178 4bfec5c9 6000 3800 397a00c0 901d000c 6000
e93f0170 7d35c92a fb9f0170 7c2004ac <7c005828> 3000 7c00592d 40c2fff4
 <0>Kernel panic - not syncing: Fatal exception in interrupt
 <0>Rebooting in 180 seconds..

Hao Yu




   
From:    Jimi Xenidis <[EMAIL PROTECTED].com>
To:      Hao Yu/Watson/[EMAIL PROTECTED]
cc:      xen-ppc-devel@lists.xensource.com
Date:    11/01/2006 12:01 PM
Subject: FYI: Re: [XenPPC] problem of ssh to domU on js21




Thank you Hao for posting the issue to the list!
I'll be posting this highly experimental patch to the list shortly.
-JX

On Nov 1, 2006, at 11:55 AM, Hao Yu wrote:

>
> With Jimi's new patch, ping works fine and keeps running (it ran for 1.5
> hours). However, I could not stop it using ^C or ^Z.
>
>