>-----Original Message-----
>From: Mike Christie [mailto:[EMAIL PROTECTED]
>> So any idea why system locks up with your 14 patches?
>>
>
>Weird with the 14 patches it works almost perfect for me, but now it
>does not work for you guys :)
>

It was not a system lock-up; instead, my ssh session hung after the error
test with your 14 patches. I mostly work in an ssh session, and I thought
the system had hung because no additional log output appeared in my
netconsole window. So I went ahead and power-cycled the system, assuming
it had locked up.

This time I observed that the console was still alive for some time after
the error test, and I could run some commands. However, new ssh requests
failed persistently with this error:

[EMAIL PROTECTED] ~]$ ssh [EMAIL PROTECTED]
Disconnecting: Bad packet length 1349676916.
[EMAIL PROTECTED] ~]$ ssh [EMAIL PROTECTED]
Disconnecting: Bad packet length 1349676916.
[EMAIL PROTECTED] ~]$
[EMAIL PROTECTED] ~]$ ssh [EMAIL PROTECTED]
Disconnecting: Bad packet length 1349676916.
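
One possible reading of the number in that message, offered as a hint rather
than a diagnosis: 1349676916 is 0x50726F74, which is the ASCII text "Prot".
The ssh client treats the first four bytes of a packet as a big-endian
length, so this value would be consistent with the peer sending plain text
where a binary packet was expected. The logs here do not confirm that; the
function name below is made up for illustration.

```python
import struct


def packet_length_as_text(length: int) -> str:
    """Reinterpret a 32-bit big-endian length field as four ASCII characters.

    Hypothetical helper: it only shows which bytes on the wire would
    produce the reported "Bad packet length" value.
    """
    raw = struct.pack(">I", length)  # 4 bytes, network byte order
    return raw.decode("ascii", errors="replace")


if __name__ == "__main__":
    bad_length = 1349676916  # value reported by the ssh client above
    print(hex(bad_length), repr(packet_length_as_text(bad_length)))
```

If the decoded text is printable, as it is here, the remote end was probably not speaking the expected binary protocol at that point in the stream.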

After less than a minute, the system crashed, as you can see from the
attached logs error_log[12].txt.

<snip>

>Do we get past the abort code and get to the fc_eh_host_reset code?
>Could you comment out the setting of the fc_eh_abort  and
>fc_eh_device_reset in the scsi_host_template so we can narrow it down.
>

I tried the same error test after:

1. Applying handle-vasu-comments.patch
2. Applying clear-complete-during-reset.patch
3. Removing fc_eh_abort and fc_eh_device_reset from the scsi_host_template

I got the same problem as before: the ssh session hung, the console stayed
active for some time, and then the system crashed, as you can see this time
from the attached log after-some-fix-rm-reset.txt.

This "WARNING: at net/ipv4/tcp_timer.c:290" commonly shows up before these
crashes. Any idea?
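
A small observation on the tcp_enter_frto oops in the attached logs, as a
sketch rather than a root cause: the marked bytes in the Code: line,
`<f6> 42 25 82`, decode on x86 as a byte test at [EDX + 0x25], and the
register dump shows EDX = 0x20, which gives the reported fault address
0x45. The bytes just before it (`8d 51 20`) compute EDX as ECX + 0x20, and
ECX is 0 in the dump, so the pointer held in ECX appears to have been NULL.
The helper below is hypothetical and handles only this one instruction form.

```python
REGS = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def fault_address(code_bytes, registers):
    """Return the address read by `TEST r/m8, imm8` (opcode f6 /0).

    Illustrative only: supports the [reg + disp8] addressing form and
    nothing else.
    """
    opcode, modrm, disp8 = code_bytes[0], code_bytes[1], code_bytes[2]
    if opcode != 0xF6 or (modrm >> 3) & 0b111 != 0:
        raise ValueError("sketch only handles TEST r/m8, imm8")
    mod, rm = modrm >> 6, modrm & 0b111
    if mod != 0b01 or rm == 0b100:
        raise ValueError("sketch only handles [reg + disp8]")
    if disp8 >= 0x80:  # sign-extend the 8-bit displacement
        disp8 -= 0x100
    return (registers[REGS[rm]] + disp8) & 0xFFFFFFFF


if __name__ == "__main__":
    # Bytes at the faulting EIP and the EDX value from the register dump.
    addr = fault_address([0xF6, 0x42, 0x25, 0x82], {"EDX": 0x00000020})
    print(f"{addr:08x}")
```

The same arithmetic applies to both tcp_enter_frto oopses in the logs, since the Code: bytes and the EDX value are identical in each.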

sd 70:0:0:0: [sdb] Result: hostbyte=0x05 driverbyte=0x00
end_request: I/O error, dev sdb, sector 32
Buffer I/O error on device sdb1, logical block 0
Buffer I/O error on device sdb1, logical block 1
Buffer I/O error on device sdb1, logical block 2
Buffer I/O error on device sdb1, logical block 3
------------[ cut here ]------------
WARNING: at net/ipv4/tcp_timer.c:290 tcp_write_timer+0xa9/0x5fb()
Modules linked in: fcoe libfc netconsole vmxnet(P) [last unloaded: libfc]
Pid: 0, comm: swapper Tainted: P          2.6.27-rc1vasulibfc-27-rc1 #11
 [<c011f28b>] warn_on_slowpath+0x40/0x63
 [<c038f9ef>] __alloc_skb+0x4d/0xfb
 [<c03939c3>] netif_rx+0xd3/0x13c
 [<c03bb71d>] tcp_write_timer+0x0/0x5fb
 [<c03bb731>] tcp_write_timer+0x14/0x5fb
 [<c042b417>] _spin_lock+0x17/0x22
 [<c03bb7c6>] tcp_write_timer+0xa9/0x5fb
 [<c03bb71d>] tcp_write_timer+0x0/0x5fb
 [<c0126573>] run_timer_softirq+0x111/0x16d
 [<c03bb71d>] tcp_write_timer+0x0/0x5fb
 [<c0122dc2>] __do_softirq+0x69/0xde
 [<c0122e6e>] do_softirq+0x37/0x4d
 [<c012312d>] irq_exit+0x41/0x75
 [<c0110cef>] smp_apic_timer_interrupt+0x6e/0x7b
 [<c01038c9>] apic_timer_interrupt+0x2d/0x34
 [<c01077a0>] default_idle+0x2d/0x4c
 [<c01077a2>] default_idle+0x2f/0x4c
 [<c0101ac6>] cpu_idle+0xb5/0xd5
 =======================
---[ end trace f067c41fb2856b90 ]---
BUG: unable to handle kernel NULL pointer dereference at 00000045
IP: [<c03b6ff0>] tcp_enter_frto+0xfc/0x1e9
*pde = 00000000
Oops: 0000 [#1] SMP
Modules linked in: fcoe libfc netconsole vmxnet(P) [last unloaded: libfc]

Pid: 0, comm: swapper Tainted: P        W (2.6.27-rc1vasulibfc-27-rc1 #11)
EIP: 0060:[<c03b6ff0>] EFLAGS: 00010246 CPU: 0
EIP is at tcp_enter_frto+0xfc/0x1e9
EAX: 00000000 EBX: de940c80 ECX: 00000000 EDX: 00000020
ESI: de940c80 EDI: de940c80 EBP: c03bb71d ESP: c05ebf2c
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=c05ea000 task=c058b3c0 task.ti=c05ea000)
Stack: de940c80 0000000f c03bbb4b de940cac 00000046 c0686d00 de940d68 de940c80
       00000100 c0686d00 c03bb71d c0126573 c03bb71d de940f98 00000000 c05ebf68
       c05ebf68 00000021 c05dea84 0000000a 00000000 c0122dc2 00000046 00000000
Call Trace:
 [<c03bbb4b>] tcp_write_timer+0x42e/0x5fb
 [<c03bb71d>] tcp_write_timer+0x0/0x5fb
 [<c0126573>] run_timer_softirq+0x111/0x16d
 [<c03bb71d>] tcp_write_timer+0x0/0x5fb
 [<c0122dc2>] __do_softirq+0x69/0xde
 [<c0122e6e>] do_softirq+0x37/0x4d
 [<c012312d>] irq_exit+0x41/0x75
 [<c0110cef>] smp_apic_timer_interrupt+0x6e/0x7b
 [<c01038c9>] apic_timer_interrupt+0x2d/0x34
 [<c01077a0>] default_idle+0x2d/0x4c
 [<c01077a2>] default_idle+0x2f/0x4c
 [<c0101ac6>] cpu_idle+0xb5/0xd5
 =======================
Code: 00 00 8b 8e e8 00 00 00 c7 86 70 05 00 00 00 00 00 00 89 86 6c 05 00 00 8d 86 e8 00 00 00 39 c1 b8 00 00 00 00 0f 44 c8 8d 51 20 <f6> 42 25 82 74 0a c7 86 6c 05 00 00 00 00 00 00 8a 42 25 a8 02
EIP: [<c03b6ff0>] tcp_enter_frto+0xfc/0x1e9 SS:ESP 0068:c05ebf2c
Kernel panic - not syncing: Fatal exception in interrupt

nc: Write error: Connection refused
netconsole: network logging started
scsi1 : FCoE Driver
device eth3 entered promiscuous mode

Buffer I/O error on device sdb1, logical block 0
Buffer I/O error on device sdb1, logical block 1
Buffer I/O error on device sdb1, logical block 2
Buffer I/O error on device sdb1, logical block 3
------------[ cut here ]------------
kernel BUG at mm/slub.c:2730!
invalid opcode: 0000 [#1] SMP
Modules linked in: fcoe libfc netconsole vmxnet(P)

Pid: 0, comm: swapper Tainted: P          (2.6.27-rc1vasulibfc-27-rc1 #11)
EIP: 0060:[<c016305f>] EFLAGS: 00010246 CPU: 0
EIP is at kfree+0x3f/0xd1
EAX: 00000000 EBX: def90300 ECX: 00000246 EDX: 00000004
ESI: c163d5cc EDI: def90330 EBP: deb7f000 ESP: c05ebec8
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=c05ea000 task=c058b3c0 task.ti=c05ea000)
Stack: def90300 def90330 def90300 def90300 def90330 dfa99c00 c038f2eb def90300
       c03c4057 c05d3a0c def90300 def90330 00000000 c03a7ade deb7f01e def90300
       c03a7fb0 00000062 def90300 c05e85e8 c05d3ac0 df425000 c0393857 df425000
Call Trace:
 [<c038f2eb>] __kfree_skb+0x8/0x61
 [<c03c4057>] icmp_rcv+0x1e9/0x1f0
 [<c03a7ade>] ip_local_deliver+0xcf/0x17b
 [<c03a7fb0>] ip_rcv+0x426/0x472
 [<c0393857>] netif_receive_skb+0x1ae/0x247
 [<c0395cc1>] process_backlog+0x78/0xce
 [<c03956c2>] net_rx_action+0xae/0x198
 [<c0122dc2>] __do_softirq+0x69/0xde
 [<c0122e6e>] do_softirq+0x37/0x4d
 [<c012312d>] irq_exit+0x41/0x75
 [<c0104c85>] do_IRQ+0x6e/0x7d
 [<c01037c4>] common_interrupt+0x28/0x30
 [<c01077a0>] default_idle+0x2d/0x4c
 [<c01077a2>] default_idle+0x2f/0x4c
 [<c0101ac6>] cpu_idle+0xb5/0xd5
 =======================
Code: 00 00 00 40 c1 e8 0c 6b f0 34 03 35 00 69 8e c0 8b 06 25 00 40 00 00 85 c0 74 03 8b 76 0c 80 3e 00 78 19 f7 06 00 60 00 00 75 04 <0f> 0b eb fe 5a 89 f0 59 5b 5e 5f 5d e9 ce ab fe ff 8b 44 24 18
EIP: [<c016305f>] kfree+0x3f/0xd1 SS:ESP 0068:c05ebec8

sd 1:0:0:0: [sdb] Result: hostbyte=0x05 driverbyte=0x00
end_request: I/O error, dev sdb, sector 32
Buffer I/O error on device sdb1, logical block 0
Buffer I/O error on device sdb1, logical block 1
Buffer I/O error on device sdb1, logical block 2
Buffer I/O error on device sdb1, logical block 3
------------[ cut here ]------------
WARNING: at net/ipv4/tcp_timer.c:290 tcp_write_timer+0xa9/0x5fb()
Modules linked in: fcoe libfc netconsole vmxnet(P)
Pid: 0, comm: swapper Tainted: P          2.6.27-rc1vasulibfc-27-rc1 #11
 [<c011f28b>] warn_on_slowpath+0x40/0x63
 [<c03c15f7>] __udp4_lib_rcv+0x70c/0x733
 [<c0106ebe>] read_tsc+0x6/0x22
 [<c03bb71d>] tcp_write_timer+0x0/0x5fb
 [<c03bb731>] tcp_write_timer+0x14/0x5fb
 [<c042b417>] _spin_lock+0x17/0x22
 [<c03bb7c6>] tcp_write_timer+0xa9/0x5fb
 [<c03bb71d>] tcp_write_timer+0x0/0x5fb
 [<c0126573>] run_timer_softirq+0x111/0x16d
 [<c03bb71d>] tcp_write_timer+0x0/0x5fb
 [<c0122dc2>] __do_softirq+0x69/0xde
 [<c0122e6e>] do_softirq+0x37/0x4d
 [<c012312d>] irq_exit+0x41/0x75
 [<c0110cef>] smp_apic_timer_interrupt+0x6e/0x7b
 [<c01038c9>] apic_timer_interrupt+0x2d/0x34
 [<c01077a0>] default_idle+0x2d/0x4c
 [<c01077a2>] default_idle+0x2f/0x4c
 [<c0101ac6>] cpu_idle+0xb5/0xd5
 =======================
---[ end trace 12febf7e3e3e47f4 ]---
BUG: unable to handle kernel NULL pointer dereference at 00000045
IP: [<c03b6ff0>] tcp_enter_frto+0xfc/0x1e9
*pde = 00000000
Oops: 0000 [#1] SMP
Modules linked in: fcoe libfc netconsole vmxnet(P)

Pid: 0, comm: swapper Tainted: P        W (2.6.27-rc1vasulibfc-27-rc1 #11)
EIP: 0060:[<c03b6ff0>] EFLAGS: 00010246 CPU: 0
EIP is at tcp_enter_frto+0xfc/0x1e9
EAX: 00000000 EBX: de970640 ECX: 00000000 EDX: 00000020
ESI: de970640 EDI: de970640 EBP: c03bb71d ESP: c05ebf2c
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=c05ea000 task=c058b3c0 task.ti=c05ea000)
Stack: de970640 0000000f c03bbb4b de97066c 00000046 c0686d00 de970728 de970640
       00000100 c0686d00 c03bb71d c0126573 c03bb71d de970958 00000000 c05ebf68
       c05ebf68 00000001 c05dea84 0000000a 00000000 c0122dc2 00000046 00000000
Call Trace:
 [<c03bbb4b>] tcp_write_timer+0x42e/0x5fb
 [<c03bb71d>] tcp_write_timer+0x0/0x5fb
 [<c0126573>] run_timer_softirq+0x111/0x16d
 [<c03bb71d>] tcp_write_timer+0x0/0x5fb
 [<c0122dc2>] __do_softirq+0x69/0xde
 [<c0122e6e>] do_softirq+0x37/0x4d
 [<c012312d>] irq_exit+0x41/0x75
 [<c0110cef>] smp_apic_timer_interrupt+0x6e/0x7b
 [<c01038c9>] apic_timer_interrupt+0x2d/0x34
 [<c01077a0>] default_idle+0x2d/0x4c
 [<c01077a2>] default_idle+0x2f/0x4c
 [<c0101ac6>] cpu_idle+0xb5/0xd5
 =======================
Code: 00 00 8b 8e e8 00 00 00 c7 86 70 05 00 00 00 00 00 00 89 86 6c 05 00 00 8d 86 e8 00 00 00 39 c1 b8 00 00 00 00 0f 44 c8 8d 51 20 <f6> 42 25 82 74 0a c7 86 6c 05 00 00 00 00 00 00 8a 42 25 a8 02
EIP: [<c03b6ff0>] tcp_enter_frto+0xfc/0x1e9 SS:ESP 0068:c05ebf2c
Kernel panic - not syncing: Fatal exception in interrupt

_______________________________________________
devel mailing list
[email protected]
http://www.open-fcoe.org/mailman/listinfo/devel
