Re: [PATCH] [scsi]: Add offline state checking while dispatch a scsi cmd
On Mon, 12 Mar 2007 10:52:22 +0800 Joe Jin <[EMAIL PROTECTED]> wrote:

> > The 2.6.9 base is very old in mainline terms. Are you sure the bug hasn't
> > been fixed in mainline by other means?
>
> I cannot confirm whether it has been fixed in the latest kernel; the
> server is a production system, so it is hard to debug and to try to
> reproduce it there.

Well. That makes it hard to run tests, but perhaps it can be determined
from code review.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [scsi]: Add offline state checking while dispatch a scsi cmd
> The 2.6.9 base is very old in mainline terms. Are you sure the bug hasn't
> been fixed in mainline by other means?

I cannot confirm whether it has been fixed in the latest kernel; the
server is a production system, so it is hard to debug and to try to
reproduce it there.
Re: [PATCH] [scsi]: Add offline state checking while dispatch a scsi cmd
> > This is a bug actually in the megaraid.

Aha, I'll track it.

> > And this is a direct command submission path: it already passed both
> > online check gates in this path *after* the device was offlined, so
> > adding a third won't fix this.

Yeah, I noticed that. However, according to the logs the device had gone
offline, so why could commands still be sent to it? Isn't the sequence of
printk messages suspicious?

> single disk, so the I/O was definitely bound for sda? Secondly, can you
> reproduce with a modern (2.6.20) kernel. Your trace strongly suggests
> that the device came back online for some reason and then the megaraid
> driver died.

It's hard to update the kernel because the system is a production system,
and we cannot debug on the box :(

I don't know if you have noticed, but the logs come from diskdump. Could
this be caused by diskdump?

Thanks,
Joe
Re: [PATCH] [scsi]: Add offline state checking while dispatch a scsi cmd
On Fri, 2007-03-09 at 09:40 +0800, Joe Jin wrote:
> > What's the error you're trying to fix? scsi_dispatch_cmd() is only
> > called from scsi_request_fn() which already has an equivalent of this
> > check in it just prior to calling dispatch.
>
> Yeah, I saw the check in scsi_request_fn(). Recently we got the following
> crash on RHEL4, kernel 2.6.9-42.0.2.ELsmp:

This kernel is way too old to debug ... However:

> scsi0 (0:0): rejecting I/O to offline device
> ...
> EXT3-fs error (device sda8) in start_transaction: Journal has aborted
>
> Unable to handle kernel NULL pointer dereference at RIP:
> <a0031e66>{:megaraid_mbox:megaraid_queue_command+2634}

This is a bug actually in the megaraid.

> PML4 21a25d067 PGD 2170ac067 PMD 0
> Oops: 0002 [1] SMP
> CPU 0
> Modules linked in: hangcheck_timer mptctl mptbase ipmi_devintf ipmi_si
> ipmi_msghandler dell_rbu netconsole netdump autofs4 i2c_dev i2c_core ocfs2(U)
> debugfs(U) nfs lockd nfs_acl ocfs2_dlmfs(U) ocfs2_dlm(U) ocfs2_nodemanager(U)
> configfs(U) sunrpc ds yenta_socket pcmcia_core ide_dump scsi_dump diskdump
> zlib_deflate dm_mirror dm_multipath dm_mod emcphr(U) emcpmpap(U) emcpmpaa(U)
> emcpmpc(U) emcpmp(U) emcp(U) emcplib(U) button battery ac joydev uhci_hcd
> ehci_hcd hw_random tg3 e1000 bond0(U) floppy sg ext3 jbd lpfc
> scsi_transport_fc megaraid_mbox megaraid_mm sd_mod scsi_mod
> Pid: 13238, comm: emagent Tainted: P 2.6.9-42.0.2.ELsmp
> RIP: 0010:[<a0031e66>]
> <a0031e66>{:megaraid_mbox:megaraid_queue_command+2634}
> RSP: 0018:01019b5a9b48 EFLAGS: 00010002
> RAX: 000220b8e000 RBX: 0102ffd1b048 RCX:
> RDX: RSI: 0001 RDI: 010431124bf0
> RBP: 0001 R08: R09: 010133ce5b80
> R10: 0102ffd3e5a0 R11: 0060 R12: 010133ce5b80
> R13: 0102ffd3e480 R14: 0100bfb4c8b8 R15: 0101ffcf4000
> FS: () GS:804e5180(005b) knlGS:f47ffbb0
> CS: 0010 DS: 002b ES: 002b CR0: 8005003b
> CR2: CR3: 00101000 CR4: 06e0
> Process emagent (pid: 13238, threadinfo 01019b5a8000, task
> 01003e5a8030)
> Stack: 0046 0046 0102ffd3e480
> 0101fff73980 8015cb38 0100bfb4d4aa 0100bfb4d4a2
> 0100bfb4c8b8 01010080
> Call Trace:<8015cb38>{mempool_alloc+129}
> <a0002874>{:scsi_mod:scsi_done+0}
> <8013fc00>{__mod_timer+113}
> <a0002adf>{:scsi_mod:scsi_dispatch_cmd+595}
> <a0007a72>{:scsi_mod:scsi_request_fn+990}
> <8024e385>{generic_unplug_device+24}
> <8017a6d3>{__wait_on_buffer+120}
> <8017a55e>{bh_wake_function+0}
> <8017a55e>{bh_wake_function+0}
> <a00877fe>{:ext3:ext3_bread+96}
> <a008935c>{:ext3:htree_dirblock_to_tree+50}
> <a008952c>{:ext3:ext3_htree_fill_tree+295}
> <8018b232>{filldir64+122} <8018b1b8>{filldir64+0}
> <a0083ace>{:ext3:ext3_readdir+371} <8018f019>{dput+56}
> <8018b1b8>{filldir64+0} <8018599c>{path_release+12}
> <8019e335>{compat_sys_statfs+105}
> <8018b1b8>{filldir64+0}
> <8018aef7>{vfs_readdir+155}
> <8018b2e8>{sys_getdents64+118}
> <80125bbb>{sysenter_do_call+27}

And this is a direct command submission path: it already passed both
online check gates in this path *after* the device was offlined, so
adding a third won't fix this.

Firstly, I'm assuming you have only a single disk, so the I/O was
definitely bound for sda? Secondly, can you reproduce with a modern
(2.6.20) kernel? Your trace strongly suggests that the device came back
online for some reason and then the megaraid driver died.

James
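James's argument about the check gates can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: `mock_dev`, `mock_submit()` and the `driver_resource` pointer are invented for illustration and are not taken from scsi_mod or megaraid_mbox, and the thread does not say which pointer was actually NULL. The model shows that once the device state reads as running, two gates and three gates give the same outcome, and the failure surfaces in the driver step instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device states; not the kernel's real enum. */
enum mock_state { MOCK_RUNNING, MOCK_OFFLINE };

/* Hypothetical device: a state flag plus one driver-owned pointer. */
struct mock_dev {
    enum mock_state state;
    void *driver_resource;   /* assumption: stands in for whatever was NULL */
};

enum mock_result { MOCK_REJECTED, MOCK_QUEUED, MOCK_DRIVER_FAULT };

/*
 * Run `gates` identical online checks, then hand the command to the
 * mock driver. The driver step reports a fault where a real driver
 * would have dereferenced the NULL pointer and oopsed.
 */
static enum mock_result mock_submit(const struct mock_dev *dev, int gates)
{
    assert(dev != NULL);

    for (int i = 0; i < gates; i++) {
        if (dev->state == MOCK_OFFLINE)
            return MOCK_REJECTED;    /* "rejecting I/O to offline device" */
    }

    if (dev->driver_resource == NULL)
        return MOCK_DRIVER_FAULT;    /* no number of gates prevents this */

    return MOCK_QUEUED;
}
```

Calling `mock_submit()` with `gates` set to 2 or 3 on a running device whose `driver_resource` is NULL returns `MOCK_DRIVER_FAULT` both times, which is the sense in which a third check would not help if the device really had come back online.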
Re: [PATCH] [scsi]: Add offline state checking while dispatch a scsi cmd
On Fri, 9 Mar 2007 09:40:40 +0800 Joe Jin <[EMAIL PROTECTED]> wrote:

> > What's the error you're trying to fix? scsi_dispatch_cmd() is only
> > called from scsi_request_fn() which already has an equivalent of this
> > check in it just prior to calling dispatch.
>
> Yeah, I saw the check in scsi_request_fn(). Recently we got the following
> crash on RHEL4, kernel 2.6.9-42.0.2.ELsmp:

The 2.6.9 base is very old in mainline terms. Are you sure the bug hasn't
been fixed in mainline by other means?
Re: [PATCH] [scsi]: Add offline state checking while dispatch a scsi cmd
> What's the error you're trying to fix? scsi_dispatch_cmd() is only
> called from scsi_request_fn() which already has an equivalent of this
> check in it just prior to calling dispatch.

Yeah, I saw the check in scsi_request_fn(). Recently we got the following
crash on RHEL4, kernel 2.6.9-42.0.2.ELsmp:

megaraid: aborting-150766876 cmd=2a <c=2 t=0 l=0>
megaraid abort: 150766876:15[255:128], fw owner
...
megaraid: aborting-150767541 cmd=2a <c=2 t=0 l=0>
megaraid abort: 150767541[255:128], driver owner
megaraid: resetting the host...
megaraid: 150766876:129[65535:65535], reset from pending list
megaraid: 1 outstanding commands. Max wait 180 sec
megaraid mbox: Wait for 1 commands to complete:180
...
megaraid mbox: Wait for 1 commands to complete:0
megaraid mbox: critical hardware error!
megaraid: resetting the host...
megaraid: hw error, cannot reset
megaraid: resetting the host...
megaraid: hw error, cannot reset
scsi: Device offlined - not ready after error recovery: host 0 channel 2 id 0 lun 0
SCSI error : <0 2 0 0> return code = 0x600
end_request: I/O error, dev sda, sector 24117409
Buffer I/O error on device sda5, logical block 327797
...
EXT3-fs error (device sda8) in start_transaction: Journal has aborted
scsi0 (0:0): rejecting I/O to offline device
printk: 85 messages suppressed.
Buffer I/O error on device sda5, logical block 327691
lost page write due to I/O error on sda5
scsi0 (0:0): rejecting I/O to offline device
...
EXT3-fs error (device sda8) in start_transaction: Journal has aborted
Unable to handle kernel NULL pointer dereference at RIP:
<a0031e66>{:megaraid_mbox:megaraid_queue_command+2634}
PML4 21a25d067 PGD 2170ac067 PMD 0
Oops: 0002 [1] SMP
CPU 0
Modules linked in: hangcheck_timer mptctl mptbase ipmi_devintf ipmi_si
ipmi_msghandler dell_rbu netconsole netdump autofs4 i2c_dev i2c_core ocfs2(U)
debugfs(U) nfs lockd nfs_acl ocfs2_dlmfs(U) ocfs2_dlm(U) ocfs2_nodemanager(U)
configfs(U) sunrpc ds yenta_socket pcmcia_core ide_dump scsi_dump diskdump
zlib_deflate dm_mirror dm_multipath dm_mod emcphr(U) emcpmpap(U) emcpmpaa(U)
emcpmpc(U) emcpmp(U) emcp(U) emcplib(U) button battery ac joydev uhci_hcd
ehci_hcd hw_random tg3 e1000 bond0(U) floppy sg ext3 jbd lpfc
scsi_transport_fc megaraid_mbox megaraid_mm sd_mod scsi_mod
Pid: 13238, comm: emagent Tainted: P 2.6.9-42.0.2.ELsmp
RIP: 0010:[<a0031e66>] <a0031e66>{:megaraid_mbox:megaraid_queue_command+2634}
RSP: 0018:01019b5a9b48 EFLAGS: 00010002
RAX: 000220b8e000 RBX: 0102ffd1b048 RCX:
RDX: RSI: 0001 RDI: 010431124bf0
RBP: 0001 R08: R09: 010133ce5b80
R10: 0102ffd3e5a0 R11: 0060 R12: 010133ce5b80
R13: 0102ffd3e480 R14: 0100bfb4c8b8 R15: 0101ffcf4000
FS: () GS:804e5180(005b) knlGS:f47ffbb0
CS: 0010 DS: 002b ES: 002b CR0: 8005003b
CR2: CR3: 00101000 CR4: 06e0
Process emagent (pid: 13238, threadinfo 01019b5a8000, task 01003e5a8030)
Stack: 0046 0046 0102ffd3e480
0101fff73980 8015cb38 0100bfb4d4aa 0100bfb4d4a2
0100bfb4c8b8 01010080
Call Trace:<8015cb38>{mempool_alloc+129} <a0002874>{:scsi_mod:scsi_done+0}
<8013fc00>{__mod_timer+113} <a0002adf>{:scsi_mod:scsi_dispatch_cmd+595}
<a0007a72>{:scsi_mod:scsi_request_fn+990} <8024e385>{generic_unplug_device+24}
<8017a6d3>{__wait_on_buffer+120} <8017a55e>{bh_wake_function+0}
<8017a55e>{bh_wake_function+0} <a00877fe>{:ext3:ext3_bread+96}
<a008935c>{:ext3:htree_dirblock_to_tree+50}
<a008952c>{:ext3:ext3_htree_fill_tree+295}
<8018b232>{filldir64+122} <8018b1b8>{filldir64+0}
<a0083ace>{:ext3:ext3_readdir+371} <8018f019>{dput+56}
<8018b1b8>{filldir64+0} <8018599c>{path_release+12}
<8019e335>{compat_sys_statfs+105} <8018b1b8>{filldir64+0}
<8018aef7>{vfs_readdir+155} <8018b2e8>{sys_getdents64+118}
<80125bbb>{sysenter_do_call+27}

Code: 48 89 04 11 41 8b 44 24 18 49 83 c4 20 49 8b 56 20 89 44 11
RIP <a0031e66>{:megaraid_mbox:megaraid_queue_command+2634} RSP <01019b5a9b48>
CR2:

The full crash info has been uploaded to
http://patch.linux-security.cn/crashinfo/megaraid_crashinfo.log

From the crash info, the device state had been set to OFFLINE before the
kernel panic, yet SCSI commands were still being sent to the device.

Any advice?

-Joe
Re: [PATCH] [scsi]: Add offline state checking while dispatch a scsi cmd
On Thu, 2007-03-08 at 17:22 +0800, Joe Jin wrote:
> When a SCSI device hardware error occurs, the device's state may be set
> to SDEV_OFFLINE. So in scsi_dispatch_cmd() we should check whether the
> device is offline and, if it is, do nothing and just return an error to
> the user directly.

What's the error you're trying to fix? scsi_dispatch_cmd() is only
called from scsi_request_fn() which already has an equivalent of this
check in it just prior to calling dispatch.

James
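The objection above can be illustrated with a small user-space model. It is a sketch only, assuming (as James describes) that the request function checks the device state immediately before calling dispatch. Every `mock_` name is a hypothetical stand-in; none of this is kernel code or the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device states named in the thread. */
enum mock_sdev_state { MOCK_SDEV_RUNNING, MOCK_SDEV_OFFLINE };

struct mock_scsi_device {
    enum mock_sdev_state state;
    int dispatched;          /* commands that reached the dispatch step */
    int rejected;            /* commands stopped by the request function */
    int extra_check_hits;    /* times the proposed extra check fired */
};

static int mock_device_online(const struct mock_scsi_device *sdev)
{
    return sdev->state != MOCK_SDEV_OFFLINE;
}

/* Models the dispatch step with the extra offline check from the proposal. */
static int mock_dispatch_cmd(struct mock_scsi_device *sdev)
{
    if (!mock_device_online(sdev)) {
        sdev->extra_check_hits++;
        return -1;
    }
    sdev->dispatched++;
    return 0;
}

/* Models the request function: check the state, then call dispatch. */
static int mock_request_fn(struct mock_scsi_device *sdev)
{
    assert(sdev != NULL);

    if (!mock_device_online(sdev)) {
        sdev->rejected++;    /* the command never reaches dispatch */
        return -1;
    }
    return mock_dispatch_cmd(sdev);
}
```

In this model an offline device is always stopped in `mock_request_fn()`, so `extra_check_hits` stays at zero as long as the state does not change between the two checks. The model does not cover a state change racing between the checks, which the thread does not establish either way.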