Hello,

with SLES10 SP1 on x86_64 (open-iscsi-2.0.707-0.32) I'm seeing a problem during login using "iscsiadm -m node -L automatic". After a few logins, login suddenly fails:

Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f60.50001fe1500c1f68]
Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f60.50001fe1500c1f69]
Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f60.50001fe1500c1f6c]
Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f60.50001fe1500c1f6d]
Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f20.50001fe1500c1f28]
Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f20.50001fe1500c1f29]
Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f20.50001fe1500c1f2c]
Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f20.50001fe1500c1f2d]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f60.50001fe1500c1f68]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f60.50001fe1500c1f69]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f60.50001fe1500c1f6c]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f60.50001fe1500c1f6d]
Login session [172.20.76.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f60.50001fe1500c1f68]
Login session [172.20.76.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f60.50001fe1500c1f69]
Login session [172.20.76.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f60.50001fe1500c1f6c]
Login session [172.20.76.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f60.50001fe1500c1f6d]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f20.50001fe1500c1f28]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f20.50001fe1500c1f29]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f20.50001fe1500c1f2c]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f20.50001fe1500c1f2d]
Login session [172.20.76.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f20.50001fe1500c1f28]
Login session [172.20.76.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f20.50001fe1500c1f29]
iscsiadm: Could not login session (err 5).
iscsiadm: initiator reported error (5 - encountered iSCSI login failure)
Login session [172.20.76.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f20.50001fe1500c1f2c]

Then the SSH session hangs, but the machine is still alive. Syslog says:

Mar 2 10:36:27 testhost kernel: Unable to handle kernel NULL pointer dereference at 0000000000000232 RIP:
Mar 2 10:36:27 testhost kernel: <ffffffff802ba089>{inet_sendmsg+23}
Mar 2 10:36:27 testhost kernel: PGD 0
Mar 2 10:36:27 testhost kernel: Oops: 0000 [1] SMP
Mar 2 10:36:27 testhost kernel: last sysfs file: /class/iscsi_connection/connection22:0/exp_statsn
Mar 2 10:36:27 testhost kernel: CPU 3
Mar 2 10:36:27 testhost kernel: Modules linked in: crc32c libcrc32c iscsi_tcp libiscsi scsi_transport_iscsi nfs lockd nfs_acl sunrpc ip6t_REJECT xt_pkttype ipt_REJECT ipt_TCPMSS xt_tcpudp ipt_LOG xt_limit xt_state iptable_mangle iptable_nat ip_nat ip6table_mangle ip_conntrack nfnetlink ip6table_filter ip6_tables xt_physdev iptable_filter ip_tables x_tables bridge netbk netloop xenblk blkbk blktap xenbus_be ipmi_devintf ipv6 ipmi_si ipmi_msghandler af_packet button battery ac sr_mod loop usb_storage usbhid hw_random ide_cd cdrom i2c_amd8111 i2c_amd756 i2c_core ohci_hcd mptctl shpchp usbcore pci_hotplug e1000 8250 serial_core reiserfs dm_snapshot dm_mod fan thermal processor sg mptsas mptscsih mptbase scsi_transport_sas amd74xx sd_mod scsi_mod ide_disk ide_core
Mar 2 10:36:27 testhost kernel: Pid: 25485, comm: scsi_wq_25 Not tainted 2.6.16.54-0.2.11-xen #1
Mar 2 10:36:27 testhost kernel: RIP: e030:[<ffffffff802ba089>] <ffffffff802ba089>{inet_sendmsg+23}
Mar 2 10:36:27 testhost kernel: RSP: e02b:ffff880011e0db78 EFLAGS: 00010296
Mar 2 10:36:27 testhost kernel: RAX: ffffffff802f1c40 RBX: 0000000000000000 RCX: 0000000000000200
Mar 2 10:36:27 testhost kernel: RDX: ffff880011e0dd58 RSI: ffff8800080988c0 RDI: ffff880011e0dba8
Mar 2 10:36:27 testhost kernel: RBP: 0000000000000200 R08: 0000000000000200 R09: 0000000000008000
Mar 2 10:36:27 testhost kernel: R10: 00000000dbb545c6 R11: 0000000000000001 R12: ffff880011e0dd58
Mar 2 10:36:27 testhost kernel: R13: ffff880011e0dba8 R14: ffff88000bdc52c0 R15: 0000000000000200
Mar 2 10:36:27 testhost kernel: FS: 00002b77ef71e6d0(0000) GS:ffffffff803a2180(0000) knlGS:0000000000000000
Mar 2 10:36:27 testhost kernel: CS: e033 DS: 0000 ES: 0000
Mar 2 10:36:27 testhost kernel: Process scsi_wq_25 (pid: 25485, threadinfo ffff880011e0c000, task ffff8800147e17c0)
Mar 2 10:36:27 testhost kernel: Stack: 0000000000000030 ffff8800080988c0 0000000000000200 ffff880011e0dd58
Mar 2 10:36:27 testhost kernel: 0000000000000000 ffffffff8026e1da 0000000000000018 ffff880011e0e000
Mar 2 10:36:27 testhost kernel: 0000000000000000 ffffffff00000001
Mar 2 10:36:27 testhost kernel: Call Trace: <ffffffff8026e1da>{sock_sendmsg+249} <ffffffff802d39dd>{__kprobes_text_start+845}
Mar 2 10:36:27 testhost kernel: <ffffffff8014195d>{autoremove_wake_function+0} <ffffffff8015cd68>{__alloc_pages+101}
Mar 2 10:36:27 testhost kernel: <ffffffff8026fac1>{kernel_sendmsg+53} <ffffffff802708cb>{sock_no_sendpage+130}
Mar 2 10:36:27 testhost kernel: <ffffffff8010df38>{monotonic_clock+53} <ffffffff883a4217>{:iscsi_tcp:iscsi_tcp_mtask_xmit+502}
Mar 2 10:36:27 testhost kernel: <ffffffff8839ba18>{:libiscsi:iscsi_xmitworker+0} <ffffffff8839b5e7>{:libiscsi:iscsi_xmit_mtask+84}
Mar 2 10:36:27 testhost kernel: <ffffffff8839bb4e>{:libiscsi:iscsi_xmitworker+310} <ffffffff8013dbc1>{run_workqueue+148}
Mar 2 10:36:27 testhost kernel: <ffffffff8013e34e>{worker_thread+0} <ffffffff80141582>{keventd_create_kthread+0}
Mar 2 10:36:27 testhost kernel: <ffffffff8013e43e>{worker_thread+240} <ffffffff801255ac>{default_wake_function+0}
Mar 2 10:36:27 testhost kernel: <ffffffff80141582>{keventd_create_kthread+0} <ffffffff80141582>{keventd_create_kthread+0}
Mar 2 10:36:27 testhost kernel: <ffffffff80141826>{kthread+212} <ffffffff8010bab6>{child_rip+8}
Mar 2 10:36:27 testhost kernel: <ffffffff80141582>{keventd_create_kthread+0} <ffffffff80141752>{kthread+0}
Mar 2 10:36:27 testhost kernel: <ffffffff8010baae>{child_rip+0}
Mar 2 10:36:27 testhost kernel:
Mar 2 10:36:27 testhost kernel: Code: 66 83 bb 32 02 00 00 00 75 0c 48 89 df e8 ad f5 ff ff 85 c0
Mar 2 10:36:27 testhost kernel: RIP <ffffffff802ba089>{inet_sendmsg+23} RSP <ffff880011e0db78>
Mar 2 10:36:27 testhost kernel: CR2: 0000000000000232
Mar 2 10:39:29 testhost kernel: <3>iscsi: can not unicast skb (-11)
Mar 2 10:39:29 testhost kernel: iscsi: can not broadcast skb (-3)
Mar 2 10:39:29 testhost kernel: connection12:0: iscsi: detected conn error (10 11)
Mar 2 10:39:29 testhost kernel: iscsi: can not unicast skb (-11)
Mar 2 10:39:29 testhost kernel: iscsi: can not broadcast skb (-3)
Mar 2 10:39:29 testhost kernel: connection11:0: iscsi: detected conn error (10 11)
Mar 2 10:39:30 testhost kernel: iscsi: can not unicast skb (-11)
Mar 2 10:39:30 testhost kernel: iscsi: can not broadcast skb (-3)
Mar 2 10:39:30 testhost kernel: connection13:0: iscsi: detected conn error (10 11)
Mar 2 10:39:30 testhost kernel: iscsi: can not unicast skb (-11)
Mar 2 10:39:30 testhost kernel: iscsi: can not broadcast skb (-3)
Mar 2 10:39:30 testhost kernel: connection14:0: iscsi: detected conn error (10 11)
Mar 2 10:39:31 testhost kernel: iscsi: can not unicast skb (-11)
Mar 2 10:39:31 testhost kernel: iscsi: can not broadcast skb (-3)
Mar 2 10:39:31 testhost kernel: connection15:0: iscsi: detected conn error (10 11)

As the same procedure has worked many times before, I suspect a race condition.
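
The register dump can be cross-checked against the "Code:" bytes. The sketch below is only a rough sanity check under one assumption: that the "Code:" line starts at the faulting instruction (this oops format does not mark it explicitly). It hand-decodes just that first instruction, using the byte values, RBX and CR2 copied from the oops; the variable names are made up for illustration.

```python
# Values copied from the oops above.
code = bytes.fromhex("66 83 bb 32 02 00 00 00".replace(" ", ""))
rbx = 0x0000000000000000
cr2 = 0x0000000000000232

# Assumption: code[0] is the first byte of the faulting instruction.
# 0x66 = 16-bit operand-size prefix, 0x83 = group-1 opcode with imm8.
assert code[0] == 0x66 and code[1] == 0x83

# Split the ModR/M byte into its three fields.
modrm = code[2]
mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7

# mod=2: base register plus 32-bit displacement; reg=7: CMP; rm=3: RBX.
assert (mod, reg, rm) == (2, 7, 3)

# The displacement is stored little-endian right after the ModR/M byte.
disp = int.from_bytes(code[3:7], "little")
imm8 = code[7]

fault_addr = rbx + disp
print(f"cmpw ${imm8:#x}, {disp:#x}(%rbx)")
print(f"computed address {fault_addr:#x}, CR2 {cr2:#x}, match: {fault_addr == cr2}")
```

If the assumption holds, the faulting access is a 16-bit read at offset 0x232 from a NULL pointer held in RBX. That would fit a socket being torn down while the xmit worker is still sending on it, but this is a guess and does not by itself prove the race.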
The hung "iscsiadm -m node -L automatic" process is blocked at:

# strace -p 25230
Process 25230 attached - interrupt to quit
recvfrom(5,

# lsof -p 25230
COMMAND    PID USER   FD   TYPE             DEVICE    SIZE   NODE NAME
iscsiadm 25230 root  cwd    DIR             253,10     920    125 /root
iscsiadm 25230 root  rtd    DIR             253,10     512      2 /
iscsiadm 25230 root  txt    REG             253,10  135464  54599 /sbin/iscsiadm
iscsiadm 25230 root  mem    REG                0,0              0 [heap] (stat: No such file or directory)
iscsiadm 25230 root  mem    REG             253,10  133423   9973 /lib64/ld-2.4.so
iscsiadm 25230 root  mem    REG             253,10 1505121   9980 /lib64/libc-2.4.so
iscsiadm 25230 root    0u   CHR              136,0              2 /dev/pts/0
iscsiadm 25230 root    1u   CHR              136,0              2 /dev/pts/0
iscsiadm 25230 root    2u   CHR              136,0              2 /dev/pts/0
iscsiadm 25230 root    3r   DIR             253,10    3120  54718 /etc/iscsi/nodes
iscsiadm 25230 root    4r   DIR             253,10      80  54783 /etc/iscsi/nodes/iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f20.50001fe1500c1f2c
iscsiadm 25230 root    5u  unix 0xffff88000cb02680         763005 socket

The kernel in use is "kernel-xen-2.6.16.54-0.2.11" (not the very latest, but it has been stable for months):

# uptime
10:52am up 100 days 17:30, 2 users, load average: 1.18, 1.23, 0.94

Regards,
Ulrich