Mike,

Thanks for the speedy reply!

> The hung task warnings are saying that some IO has taken longer than the
> hung task timeout value, which looks like it is 2 minutes for you.
>
> Are you doing any type of port down/up type of test?
Nope, just the following:
- Power on blade
- be2iscsi BIOS logs in to target(s)
- Grub loads linux (+initramfs)
- initramfs runs iscsistart -b

> Is there any line before this?
Yes, but it's just the usual bootup bits. I'll include the whole
output at the bottom of this message.

What I already understood from the console output is that the iSCSI
layer is failing paths up to multipath. One path always fails first,
followed by the other, even though network conditions are good and the
HBAs and the target all still respond to ICMP ping. Once both paths are
dead, multipath passes the failure on upwards, which eventually results
in the kernel dying due to losing its filesystems.

> So with the default setting of the replacement_timeout (120 secs) you
> should be seeing a message:
> session recovery timed out after X secs
> before you see hung task message below.

Yep, I do see those. I figured they were from the iscsi layer. Is it
worth validating the configuration first *without* using multipath?
It's fairly trivial to disable. The only reason I've not tried this
yet is that the same problem happened when using a non-multipath
target (single target on a linux box using ietd). I'm starting to
think that the be2iscsi driver or the actual ServerEngines HBA is
somehow unhappy. I've updated them to the latest firmware. These
problems don't happen with the iscsi_tcp module. I'd really like to
stick with be2iscsi though, as the offload cards have the *huge*
advantage of decoupling the iscsi and networking stacks.
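
For reference, the "session recovery timed out" messages are governed by
the per-session recovery timeout, which the kernel exposes in sysfs as
recovery_tmo. The following is only a rough sketch for listing it per
session; the function name and the optional sysfs root argument are
made up for illustration and are not part of open-iscsi.

```shell
# Print "<session> <recovery_tmo in seconds>" for every iSCSI session.
# show_recovery_tmo and its optional root argument are hypothetical
# helpers for illustration; the root defaults to /sys.
show_recovery_tmo() {
    root="${1:-/sys}"
    for f in "$root"/class/iscsi_session/session*/recovery_tmo; do
        [ -r "$f" ] || continue        # glob matched nothing readable
        s="${f%/recovery_tmo}"         # drop the attribute name
        printf '%s %s\n' "${s##*/}" "$(cat "$f")"
    done
}
```

Calling show_recovery_tmo with no arguments reads the live values from
/sys.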

> Is this easy to replicate? There is just too much going wrong here. If
> it happens again, can you do

It happens every time the machine is booted, after about 5-20 minutes.

>
> cat /sys/block/sdX/device/state

# cat /sys/block/sd*/device/state
running
running
running
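
The plain cat above does not say which state belongs to which disk. A
small sketch that labels each line with the device name might look like
this; the function name and the optional sysfs root argument are
assumptions added for illustration only.

```shell
# Print "<device> <state>" for every sd* disk found under a sysfs root.
# show_states and its optional root argument are hypothetical helpers;
# the root defaults to /sys.
show_states() {
    root="${1:-/sys}"
    for f in "$root"/block/sd*/device/state; do
        [ -r "$f" ] || continue        # glob matched nothing readable
        d="${f%/device/state}"         # drop the trailing attribute path
        printf '%s %s\n' "${d##*/}" "$(cat "$f")"
    done
}
```

Running show_states with no arguments reads from /sys, so it can be
re-run after the paths start failing to see which disk changes state
first.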

At the moment (booted about 5 minutes ago)

# iscsiadm -m session -P 3
iSCSI Transport Class version 2.0-870
version 2.0-872
Target: iqn.2003-10.com.lefthandnetworks:thm-san:25:thm-vmutil01-root
        Current Portal: 10.20.128.100:3260,1
        Persistent Portal: 10.20.128.100:3260,1
                **********
                Interface:
                **********
                Iface Name: be2iscsi.d4:85:64:56:90:c9
                Iface Transport: be2iscsi
                Iface Initiatorname: iqn.2011-05.com.travelfusion.dc.thm-vmutil01
                Iface IPaddress: <empty>
                Iface HWaddress: d4:85:64:56:90:c9
                Iface Netdev: <empty>
                SID: 1
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 65536
                MaxXmitDataSegmentLength: 65536
                FirstBurstLength: 8192
                MaxBurstLength: 262144
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 0  State: running
                scsi0 Channel 00 Id 0 Lun: 0
                        Attached scsi disk sda          State: running

                **********
                Interface:
                **********
                Iface Name: be2iscsi.d4:85:64:56:90:cd
                Iface Transport: be2iscsi
                Iface Initiatorname: iqn.2011-05.com.travelfusion.dc.thm-vmutil01
                Iface IPaddress: <empty>
                Iface HWaddress: d4:85:64:56:90:cd
                Iface Netdev: <empty>
                SID: 2
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 65536
                MaxXmitDataSegmentLength: 65536
                FirstBurstLength: 8192
                MaxBurstLength: 262144
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 1  State: running
                scsi1 Channel 00 Id 0 Lun: 0
                        Attached scsi disk sdb          State: running
Target: iqn.2003-10.com.lefthandnetworks:thm-san:27:thm-vmutil01-nfs
        Current Portal: 10.20.128.100:3260,1
        Persistent Portal: 10.20.128.100:3260,1
                **********
                Interface:
                **********
                Iface Name: be2iscsi.d4:85:64:56:90:c9
                Iface Transport: be2iscsi
                Iface Initiatorname: iqn.2011-05.com.travelfusion.dc.thm-vmutil01
                Iface IPaddress: <empty>
                Iface HWaddress: d4:85:64:56:90:c9
                Iface Netdev: <empty>
                SID: 3
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 65536
                MaxXmitDataSegmentLength: 65536
                FirstBurstLength: 8192
                MaxBurstLength: 262144
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 0  State: running
                scsi0 Channel 00 Id 1 Lun: 0
                        Attached scsi disk sdc          State: running
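
Since the -P 3 output is long, a one-line-per-disk summary may be easier
to compare before and after a failure. This is only a sketch that
assumes the output layout shown above; summarise_sessions is a made-up
name, and the field positions are taken from this particular iscsiadm
version.

```shell
# Read `iscsiadm -m session -P 3` output on stdin and print one line per
# attached disk with its SID, iface and connection/session states.
# summarise_sessions is a hypothetical helper; the patterns assume the
# output layout shown above.
summarise_sessions() {
    awk '
        /Iface Name:/             { iface = $3 }
        /^[ \t]*SID:/             { sid = $2 }
        /iSCSI Connection State:/ { sub(/.*State:[ \t]*/, ""); conn = $0 }
        /iSCSI Session State:/    { sub(/.*State:[ \t]*/, ""); sess = $0 }
        /Attached scsi disk/      {
            printf "SID %s: %s, connection %s, session %s, disk %s\n",
                   sid, iface, conn, sess, $4
        }
    '
}
```

Usage would be: iscsiadm -m session -P 3 | summarise_sessions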

> Would you also be able to run a patch that will add some extra debugging
> to the driver and iscsi layer?
Yes please!

> I will try to contact HP and get access to a box like this. Jay is
> leaving on vacation so I do not think he will be able to help for a
> couple days.
