Hi Mike,

Yes, as I said, the target node crashed and then the open-iscsi kernel oops happened.
So far I only hit it once. I'll try to see if I can reproduce it.

Kevin

On Wed, Dec 9, 2009 at 8:00 PM, Mike Christie <[email protected]> wrote:

> Qinghua(Kevin) Ye wrote:
> > Hi All,
> >
> > I encountered another kernel oops in the open-iscsi code. Not sure if it is
> > fixed in the new code, but I would like to have some idea about it. Thanks.
> >
> > My setup:
> > Ubuntu 8.04 with kernel 2.6.24-24-generic.
> > Open-iscsi 2.0-870.3
> >
> > The kernel oops happens after my iscsi target node crashed.
> > Here is the kernel message.
> > Dec 7 10:08:21 qye-serv1 kernel: [1459378.575584] connection3903:0: detected conn error (1011)
> > Dec 7 10:08:21 qye-serv1 kernel: [1459378.826718] sd 18028:0:0:16: timing out command, waited 180s
> > Dec 7 10:08:21 qye-serv1 kernel: [1459378.826827] sd 18028:0:0:16: [sdd] Result: hostbyte=DID_BUS_BUSY driverbyte=DRIVER_OK,SUGGEST_OK
> > Dec 7 10:08:21 qye-serv1 kernel: [1459378.826840] end_request: I/O error, dev sdd, sector 4505344
> > Dec 7 10:08:21 qye-serv1 kernel: [1459378.826897] Buffer I/O error on device sdd, logical block 563168
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.629142] session3903: session recovery timed out after 120 secs
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.629618] BUG: unable to handle kernel paging request at virtual address fcb8d006
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.629815] printing eip: e0a49ff7 *pde = 00000000
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.630090] Oops: 0000 [#1] SMP
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.630290] Modules linked in:
> > iscsi_tcp libiscsi scsi_transport_iscsi iscsi_trgt nls_iso8859_1 nls_cp437
> > vfat fat nfsd auth_rpcgss exportfs crc32c libcrc32c vmmemctl
> > cpufreq_conservative cpufreq_ondemand cpufreq_userspace cpufreq_stats
> > freq_table cpufreq_powersave sbs video output sbshc dock battery nfs lockd
> > nfs_acl sunrpc iptable_filter ip_tables x_tables vmhgfs lp loop ipv6
> > intel_agp i2c_piix4 serio_raw container ac button agpgart i2c_core shpchp
> > pci_hotplug parport_pc parport evdev psmouse pcspkr ext3 jbd mbcache ide_cd
> > cdrom sg sd_mod floppy pcnet32 mptspi mptscsih mptbase mii pata_acpi
> > ata_generic scsi_transport_spi ata_piix libata scsi_mod ide_generic ide_core
> > raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath
> > linear md_mod dm_mirror dm_snapshot dm_mod thermal processor fan fbcon
> > tileblit font bitblit softcursor fuse vmxnet
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.631597]
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.631735] Pid: 32010, comm: iscsi_eh Not tainted (2.6.24-24-generic #1)
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.631837] EIP: 0060:[<e0a49ff7>] EFLAGS: 00010097 CPU: 0
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.632168] EIP is at iscsi_queuecommand+0x47/0x260 [libiscsi]
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.632273] EAX: e09776fa EBX: d8384500 ECX: e09775e0 EDX: e0979560
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.632370] ESI: d8384500 EDI: c38d9400 EBP: c38d9400 ESP: ce921eb8
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.632467] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.632583] Process iscsi_eh (pid: 32010, ti=ce920000 task=c61aa000 task.ti=ce920000)
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.633296] Stack: e0979560 00000000 00000000 d8384500 00000287 c38d9400 00000031 e0979da7
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.633508] dc501600 dc501600 c38d9400 d8926000 d81dc810 e098018a 00000036 0016452a
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.633660] 00099996 d8926028 d8926148 d89260b0 d8384500 d81dc810 d81dc810 00000000
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.633816] Call Trace:
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.634045] [<e0979560>] scsi_done+0x0/0x20 [scsi_mod]
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.708347] [<e0979da7>] scsi_dispatch_cmd+0x147/0x280 [scsi_mod]
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.708502] [<e098018a>] scsi_request_fn+0x1ea/0x380 [scsi_mod]
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.749813] [<e097e760>] device_unblock+0x0/0x10 [scsi_mod]
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.749930] [<c020bda2>] blk_start_queue+0x32/0x90
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.832531] [<c02803ce>] get_device+0xe/0x20
> > Dec 7 10:10:21 qye-serv1 kernel: [1459498.879473] [<e097913e>] scsi_device_get+0x1e/0x50 [scsi_mod]
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.879592] [<e097e735>] scsi_internal_device_unblock+0x35/0x60 [scsi_mod]
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.879714] [<e0979f52>] starget_for_each_device+0x72/0x80 [scsi_mod]
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.879966] [<e097e080>] target_unblock+0x0/0x20 [scsi_mod]
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.880094] [<e097e09b>] target_unblock+0x1b/0x20 [scsi_mod]
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.880201] [<c02805c2>] device_for_each_child+0x22/0x40
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.880290] [<e0969710>] session_recovery_timedout+0x0/0xc0 [scsi_transport_iscsi]
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.880439] [<c013ce6f>] run_workqueue+0xbf/0x160
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.933133] [<c013d910>] worker_thread+0x0/0xe0
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.933218] [<c013d994>] worker_thread+0x84/0xe0
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.933297] [<c0140c20>] autoremove_wake_function+0x0/0x40
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.933407] [<c013d910>] worker_thread+0x0/0xe0
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.933501] [<c0140962>] kthread+0x42/0x70
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.933574] [<c0140920>] kthread+0x0/0x70
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.934156] [<c0105667>] kernel_thread_helper+0x7/0x10
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.934669] =======================
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.934759] Code: 00 00 00 00 89 90 f0 00 00 00 c7 80 20 01 00 00 00 00 00 00 c7 80 f4 00 00 00 00 00 00 00 8b 00 8b 28 b8 01 00 00 00 8b 01 2c 86 <02> 8b 06 8b 80 24 01 00 00 8b 58 74 81 eb a4 00 00 00 8b bb a0
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.935195] EIP: [<e0a49ff7>] iscsi_queuecommand+0x47/0x260 [libiscsi] SS:ESP 0068:ce921eb8
> > Dec 7 10:10:22 qye-serv1 kernel: [1459498.954695] ---[ end trace 56831ec2af4ad03c ]---
> >
>
> Are you logging out or doing any iscsiadm operation while this is going on?
>
> It looks like we detected a connection problem (the target crashing
> probably). There was some IO running at the time, so it got queued up.
> The initiator tried to log into the target for
> replacement/recovery_timeout seconds (120 in your trace), but could not
> so the replacement_timeout/recovery_timeout fired, and the initiator
> started failing IO. Then you got the oops above.
>
> I have not seen this before. Is this one reproducable?
>
> --
>
> You received this message because you are subscribed to the Google Groups "open-iscsi" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to [email protected]<open-iscsi%[email protected]>.
> For more options, visit this group at http://groups.google.com/group/open-iscsi?hl=en.

--
You received this message because you are subscribed to the Google Groups "open-iscsi" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/open-iscsi?hl=en.
