Hello,

With SLES10 SP1 on x86_64 (open-iscsi-2.0.707-0.32), I'm seeing a problem when
logging in with "iscsiadm -m node -L automatic". After a number of successful
logins, a login suddenly fails:
Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f60.50001fe1500c1f68]
Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f60.50001fe1500c1f69]
Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f60.50001fe1500c1f6c]
Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f60.50001fe1500c1f6d]
Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f20.50001fe1500c1f28]
Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f20.50001fe1500c1f29]
Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f20.50001fe1500c1f2c]
Login session [172.20.77.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.1.50001fe1500c1f20.50001fe1500c1f2d]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f60.50001fe1500c1f68]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f60.50001fe1500c1f69]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f60.50001fe1500c1f6c]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f60.50001fe1500c1f6d]
Login session [172.20.76.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f60.50001fe1500c1f68]
Login session [172.20.76.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f60.50001fe1500c1f69]
Login session [172.20.76.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f60.50001fe1500c1f6c]
Login session [172.20.76.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f60.50001fe1500c1f6d]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f20.50001fe1500c1f28]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f20.50001fe1500c1f29]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f20.50001fe1500c1f2c]
Login session [172.20.77.1:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis1.1.50001fe1500c1f20.50001fe1500c1f2d]
Login session [172.20.76.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f20.50001fe1500c1f28]
Login session [172.20.76.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f20.50001fe1500c1f29]
iscsiadm: Could not login session (err 5).

iscsiadm: initiator reported error (5 - encountered iSCSI login failure)
Login session [172.20.76.2:3260 iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f20.50001fe1500c1f2c]

Then the SSH session hangs, but the machine is still alive. Syslog says:
Mar  2 10:36:27 testhost kernel: Unable to handle kernel NULL pointer dereference at 0000000000000232 RIP:
Mar  2 10:36:27 testhost kernel: <ffffffff802ba089>{inet_sendmsg+23}
Mar  2 10:36:27 testhost kernel: PGD 0
Mar  2 10:36:27 testhost kernel: Oops: 0000 [1] SMP
Mar  2 10:36:27 testhost kernel: last sysfs file: /class/iscsi_connection/connection22:0/exp_statsn
Mar  2 10:36:27 testhost kernel: CPU 3
Mar  2 10:36:27 testhost kernel: Modules linked in: crc32c libcrc32c iscsi_tcp libiscsi scsi_transport_iscsi nfs lockd nfs_acl sunrpc ip6t_REJECT xt_pkttype ipt_REJECT ipt_TCPMSS xt_tcpudp ipt_LOG xt_limit xt_state iptable_mangle iptable_nat ip_nat ip6table_mangle ip_conntrack nfnetlink ip6table_filter ip6_tables xt_physdev iptable_filter ip_tables x_tables bridge netbk netloop xenblk blkbk blktap xenbus_be ipmi_devintf ipv6 ipmi_si ipmi_msghandler af_packet button battery ac sr_mod loop usb_storage usbhid hw_random ide_cd cdrom i2c_amd8111 i2c_amd756 i2c_core ohci_hcd mptctl shpchp usbcore pci_hotplug e1000 8250 serial_core reiserfs dm_snapshot dm_mod fan thermal processor sg mptsas mptscsih mptbase scsi_transport_sas amd74xx sd_mod scsi_mod ide_disk ide_core
Mar  2 10:36:27 testhost kernel: Pid: 25485, comm: scsi_wq_25 Not tainted 2.6.16.54-0.2.11-xen #1
Mar  2 10:36:27 testhost kernel: RIP: e030:[<ffffffff802ba089>] <ffffffff802ba089>{inet_sendmsg+23}
Mar  2 10:36:27 testhost kernel: RSP: e02b:ffff880011e0db78  EFLAGS: 00010296
Mar  2 10:36:27 testhost kernel: RAX: ffffffff802f1c40 RBX: 0000000000000000 RCX: 0000000000000200
Mar  2 10:36:27 testhost kernel: RDX: ffff880011e0dd58 RSI: ffff8800080988c0 RDI: ffff880011e0dba8
Mar  2 10:36:27 testhost kernel: RBP: 0000000000000200 R08: 0000000000000200 R09: 0000000000008000
Mar  2 10:36:27 testhost kernel: R10: 00000000dbb545c6 R11: 0000000000000001 R12: ffff880011e0dd58
Mar  2 10:36:27 testhost kernel: R13: ffff880011e0dba8 R14: ffff88000bdc52c0 R15: 0000000000000200
Mar  2 10:36:27 testhost kernel: FS:  00002b77ef71e6d0(0000) GS:ffffffff803a2180(0000) knlGS:0000000000000000
Mar  2 10:36:27 testhost kernel: CS:  e033 DS: 0000 ES: 0000
Mar  2 10:36:27 testhost kernel: Process scsi_wq_25 (pid: 25485, threadinfo ffff880011e0c000, task ffff8800147e17c0)
Mar  2 10:36:27 testhost kernel: Stack: 0000000000000030 ffff8800080988c0 0000000000000200 ffff880011e0dd58
Mar  2 10:36:27 testhost kernel:        0000000000000000 ffffffff8026e1da 0000000000000018 ffff880011e0e000
Mar  2 10:36:27 testhost kernel:        0000000000000000 ffffffff00000001
Mar  2 10:36:27 testhost kernel: Call Trace: <ffffffff8026e1da>{sock_sendmsg+249} <ffffffff802d39dd>{__kprobes_text_start+845}
Mar  2 10:36:27 testhost kernel:        <ffffffff8014195d>{autoremove_wake_function+0} <ffffffff8015cd68>{__alloc_pages+101}
Mar  2 10:36:27 testhost kernel:        <ffffffff8026fac1>{kernel_sendmsg+53} <ffffffff802708cb>{sock_no_sendpage+130}
Mar  2 10:36:27 testhost kernel:        <ffffffff8010df38>{monotonic_clock+53} <ffffffff883a4217>{:iscsi_tcp:iscsi_tcp_mtask_xmit+502}
Mar  2 10:36:27 testhost kernel:        <ffffffff8839ba18>{:libiscsi:iscsi_xmitworker+0} <ffffffff8839b5e7>{:libiscsi:iscsi_xmit_mtask+84}
Mar  2 10:36:27 testhost kernel:        <ffffffff8839bb4e>{:libiscsi:iscsi_xmitworker+310} <ffffffff8013dbc1>{run_workqueue+148}
Mar  2 10:36:27 testhost kernel:        <ffffffff8013e34e>{worker_thread+0} <ffffffff80141582>{keventd_create_kthread+0}
Mar  2 10:36:27 testhost kernel:        <ffffffff8013e43e>{worker_thread+240} <ffffffff801255ac>{default_wake_function+0}
Mar  2 10:36:27 testhost kernel:        <ffffffff80141582>{keventd_create_kthread+0} <ffffffff80141582>{keventd_create_kthread+0}
Mar  2 10:36:27 testhost kernel:        <ffffffff80141826>{kthread+212} <ffffffff8010bab6>{child_rip+8}
Mar  2 10:36:27 testhost kernel:        <ffffffff80141582>{keventd_create_kthread+0} <ffffffff80141752>{kthread+0}
Mar  2 10:36:27 testhost kernel:        <ffffffff8010baae>{child_rip+0}
Mar  2 10:36:27 testhost kernel:
Mar  2 10:36:27 testhost kernel: Code: 66 83 bb 32 02 00 00 00 75 0c 48 89 df e8 ad f5 ff ff 85 c0
Mar  2 10:36:27 testhost kernel: RIP <ffffffff802ba089>{inet_sendmsg+23} RSP <ffff880011e0db78>
Mar  2 10:36:27 testhost kernel: CR2: 0000000000000232
Mar  2 10:39:29 testhost kernel:  <3>iscsi: can not unicast skb (-11)
Mar  2 10:39:29 testhost kernel: iscsi: can not broadcast skb (-3)
Mar  2 10:39:29 testhost kernel:  connection12:0: iscsi: detected conn error (1011)
Mar  2 10:39:29 testhost kernel: iscsi: can not unicast skb (-11)
Mar  2 10:39:29 testhost kernel: iscsi: can not broadcast skb (-3)
Mar  2 10:39:29 testhost kernel:  connection11:0: iscsi: detected conn error (1011)
Mar  2 10:39:30 testhost kernel: iscsi: can not unicast skb (-11)
Mar  2 10:39:30 testhost kernel: iscsi: can not broadcast skb (-3)
Mar  2 10:39:30 testhost kernel:  connection13:0: iscsi: detected conn error (1011)
Mar  2 10:39:30 testhost kernel: iscsi: can not unicast skb (-11)
Mar  2 10:39:30 testhost kernel: iscsi: can not broadcast skb (-3)
Mar  2 10:39:30 testhost kernel:  connection14:0: iscsi: detected conn error (1011)
Mar  2 10:39:31 testhost kernel: iscsi: can not unicast skb (-11)
Mar  2 10:39:31 testhost kernel: iscsi: can not broadcast skb (-3)
Mar  2 10:39:31 testhost kernel:  connection15:0: iscsi: detected conn error (1011)
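
The negative numbers in the "can not unicast/broadcast skb" messages look like
negated Linux errno values. A minimal Python sketch of that reading follows; it
is an assumption, not something confirmed against the open-iscsi source, and
LINUX_ERRNO, SAMPLE and decode_skb_errors are names made up for this
illustration.

```python
import re

# Linux errno numbers for the two codes seen in the log. They are hardcoded
# (assumption: standard Linux numbering) so the result does not depend on the
# platform this sketch runs on.
LINUX_ERRNO = {3: "ESRCH", 11: "EAGAIN"}

# The two message types copied from the syslog excerpt above.
SAMPLE = """\
iscsi: can not unicast skb (-11)
iscsi: can not broadcast skb (-3)
"""


def decode_skb_errors(log_text):
    """Return (send_kind, return_code, errno_name) for each matching line."""
    pattern = re.compile(r"can not (unicast|broadcast) skb \((-\d+)\)")
    decoded = []
    for match in pattern.finditer(log_text):
        code = int(match.group(2))
        decoded.append((match.group(1), code, LINUX_ERRNO.get(-code, "unknown")))
    return decoded


for kind, code, name in decode_skb_errors(SAMPLE):
    print(f"{kind}: {code} = -{name}")
```

Under that assumption, -11 would be -EAGAIN and -3 would be -ESRCH. The value
1011 in the "detected conn error" lines is not decoded here.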

As the same procedure has worked many times before, I suspect a race condition.
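
To narrow down what was dereferenced, the first instruction in the "Code:" line
of the oops can be decoded by hand. The Python sketch below does this for the
one instruction form that appears here, assuming the "Code:" bytes start at the
faulting RIP. It is not a general x86 decoder, and decode_cmp16_disp32 is a
hypothetical helper name.

```python
# Values copied from the oops above.
OOPS_CODE = "66 83 bb 32 02 00 00 00 75 0c 48 89 df e8 ad f5 ff ff 85 c0"
RBX = 0x0000000000000000
CR2 = 0x0000000000000232


def decode_cmp16_disp32(code_hex):
    """Decode '66 83 /7 disp32 imm8', i.e. cmpw $imm8, disp32(%reg).

    Returns (base_register_number, displacement, immediate).
    Only this single encoding is handled; anything else raises ValueError.
    """
    code = bytes.fromhex(code_hex.replace(" ", ""))
    if code[0] != 0x66 or code[1] != 0x83:
        raise ValueError("expected operand-size prefix 0x66 and opcode 0x83")
    modrm = code[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # mod=2 means a 32-bit displacement follows, reg=7 selects CMP,
    # rm=4 would require a SIB byte, which this sketch does not handle.
    if mod != 2 or reg != 7 or rm == 4:
        raise ValueError("only cmp with disp32 and no SIB byte is handled")
    disp = int.from_bytes(code[3:7], "little", signed=True)
    imm = code[7]
    return rm, disp, imm


rm, disp, imm = decode_cmp16_disp32(OOPS_CODE)
fault_addr = RBX + disp  # without a REX prefix, register number 3 is RBX
print(f"cmpw ${imm:#x}, {disp:#x}(reg {rm}) -> address {fault_addr:#x}")
```

The decoded displacement is 0x232 with RBX as the base register. RBX is 0 in
the register dump, so the computed address matches CR2. This would be
consistent with inet_sendmsg() reading a 16-bit field through a NULL pointer,
for example a socket whose sk pointer has already been cleared, but I have not
checked that against the source of this kernel.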

The "iscsiadm -m node -L automatic" process is blocked in recvfrom():
# strace -p 25230
Process 25230 attached - interrupt to quit
recvfrom(5,

# lsof -p 25230
COMMAND    PID USER   FD   TYPE             DEVICE    SIZE   NODE NAME
iscsiadm 25230 root  cwd    DIR             253,10     920    125 /root
iscsiadm 25230 root  rtd    DIR             253,10     512      2 /
iscsiadm 25230 root  txt    REG             253,10  135464  54599 /sbin/iscsiadm
iscsiadm 25230 root  mem    REG                0,0              0 [heap] (stat: No such file or directory)
iscsiadm 25230 root  mem    REG             253,10  133423   9973 /lib64/ld-2.4.so
iscsiadm 25230 root  mem    REG             253,10 1505121   9980 /lib64/libc-2.4.so
iscsiadm 25230 root    0u   CHR              136,0              2 /dev/pts/0
iscsiadm 25230 root    1u   CHR              136,0              2 /dev/pts/0
iscsiadm 25230 root    2u   CHR              136,0              2 /dev/pts/0
iscsiadm 25230 root    3r   DIR             253,10    3120  54718 /etc/iscsi/nodes
iscsiadm 25230 root    4r   DIR             253,10      80  54783 /etc/iscsi/nodes/iqn.1986-03.com.hp:fcgw.mpx100:rkdvmis2.0.50001fe1500c1f20.50001fe1500c1f2c
iscsiadm 25230 root    5u  unix 0xffff88000cb02680         763005 socket

The kernel in use is "kernel-xen-2.6.16.54-0.2.11" (not the very latest, but it
has been stable for months).

# uptime
 10:52am  up 100 days 17:30,  2 users,  load average: 1.18, 1.23, 0.94

Regards,
Ulrich

