Re: I/O hang on SLES10 SP2 with Device-Mapper

2008-11-20 Thread Anthony C



On Nov 20, 2:59 pm, Anthony C <[EMAIL PROTECTED]> wrote:
> On Nov 20, 11:20 am, Mike Christie <[EMAIL PROTECTED]> wrote:
>
> > Anthony C wrote:
> > > Once the hanging occurred, I can leave it for days and it will still
> > > be hanging.  And I haven't seen above iSCSI States or Lun State change
> > > meanwhile.
>
> > > I thought with dm-multipath, I/O error handling would be limited in
> > > scsi layer.  And even with these error handling being done, it
> > > shouldn't get into "infinite hanging"?  Something is not right.
>
> > It shouldn't with the timeouts you have set.
>
> > What was in /sys/class/scsi_host/host4/host_busy when this happened?
> > There was a bug in some upstream rc kernels where this was zero and the
> > scsi layer would spin waiting for commands when there were none. Maybe
> > SUSE ported that code but did not get the fix.
>
> I just reproduced the problem and /sys/class/scsi_host/host4/host_busy
> has 6
>
>
>
> > I am not sure what is in SLES. Are you using the open-iscsi package with
> > SLES or an open-iscsi.org tarball?
>
> Yes, the open-iscsi-2.0.707-0.44 package that comes with SLES.
>
> > Do you have a
>
> > /sys/class/iscsi_session/session$XYZ/lu_reset_tmo
>
> > and
>
> > /sys/class/iscsi_session/session$XYZ/abort_tmo
>
> > file?
>
> All sessions' lu_reset_tmo is 20 and abort_tmo is 15
>
>
>
> > Or do you have the SLES kernel source? If so send
> > drivers/scsi/libiscsi.c and drivers/scsi/iscsi_tcp.c.
>

> Yes, I have these files.  I may have to send them to you privately, since
> there is no attachment option in the discussion thread.

I just uploaded them to the Files section.
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~--~~~~--~~--~--~---



Re: I/O hang on SLES10 SP2 with Device-Mapper

2008-11-20 Thread Anthony C



On Nov 20, 11:20 am, Mike Christie <[EMAIL PROTECTED]> wrote:
> Anthony C wrote:
> > Once the hanging occurred, I can leave it for days and it will still
> > be hanging.  And I haven't seen above iSCSI States or Lun State change
> > meanwhile.
>
> > I thought with dm-multipath, I/O error handling would be limited in
> > scsi layer.  And even with these error handling being done, it
> > shouldn't get into "infinite hanging"?  Something is not right.
>
> It shouldn't with the timeouts you have set.
>

> What was in /sys/class/scsi_host/host4/host_busy when this happened?
> There was a bug in some upstream rc kernels where this was zero and the
> scsi layer would spin waiting for commands when there were none. Maybe
> SUSE ported that code but did not get the fix.

I just reproduced the problem and /sys/class/scsi_host/host4/host_busy
has 6

>
> I am not sure what is in SLES. Are you using the open-iscsi package with
> SLES or an open-iscsi.org tarball?
>

Yes, the open-iscsi-2.0.707-0.44 package that comes with SLES.

> Do you have a
>
> /sys/class/iscsi_session/session$XYZ/lu_reset_tmo
>
> and
>
> /sys/class/iscsi_session/session$XYZ/abort_tmo
>
> file?

All sessions' lu_reset_tmo is 20 and abort_tmo is 15

>
> Or do you have the SLES kernel source? If so send
> drivers/scsi/libiscsi.c and drivers/scsi/iscsi_tcp.c.

Yes, I have these files.  I may have to send them to you privately, since
there is no attachment option in the discussion thread.



Re: I/O hang on SLES10 SP2 with Device-Mapper

2008-11-20 Thread Mike Christie

Anthony C wrote:
> Once the hanging occurred, I can leave it for days and it will still
> be hanging.  And I haven't seen above iSCSI States or Lun State change
> meanwhile.
> 
> I thought with dm-multipath, I/O error handling would be limited in
> scsi layer.  And even with these error handling being done, it
> shouldn't get into "infinite hanging"?  Something is not right.
> 

It shouldn't with the timeouts you have set.

What was in /sys/class/scsi_host/host4/host_busy when this happened?
There was a bug in some upstream rc kernels where this was zero and the 
scsi layer would spin waiting for commands when there were none. Maybe 
SUSE ported that code but did not get the fix.

I am not sure what is in SLES. Are you using the open-iscsi package with 
SLES or an open-iscsi.org tarball?

Do you have a

/sys/class/iscsi_session/session$XYZ/lu_reset_tmo

and

/sys/class/iscsi_session/session$XYZ/abort_tmo

file?

Or do you have the SLES kernel source? If so send 
drivers/scsi/libiscsi.c and drivers/scsi/iscsi_tcp.c.
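
For anyone checking the same thing, the two timeout files asked about above can be collected for every session in one pass. This is only a minimal sketch: the function name `session_timeouts` and its `root` parameter are made up for illustration, and a missing file is reported as None, since whether the file exists at all depends on what the distro kernel carries.

```python
from pathlib import Path


def session_timeouts(root="/sys/class/iscsi_session",
                     names=("abort_tmo", "lu_reset_tmo")):
    """Return {session_name: {attribute: seconds or None}} for each session.

    None means the attribute file is missing or unreadable, which is itself
    useful information when checking what a kernel supports.
    """
    result = {}
    for session_dir in sorted(Path(root).glob("session*")):
        values = {}
        for name in names:
            try:
                values[name] = int((session_dir / name).read_text().strip())
            except (OSError, ValueError):
                values[name] = None  # file absent or not a plain integer
        result[session_dir.name] = values
    return result


if __name__ == "__main__":
    for session, values in session_timeouts().items():
        print(session, values)
```

The `root` argument only exists so the function can be pointed at a copy of the sysfs tree; on a live system the default path is the one named in this message.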




Re: I/O hang on SLES10 SP2 with Device-Mapper

2008-11-20 Thread Anthony C



On Nov 19, 2:24 pm, Mike Christie <[EMAIL PROTECTED]> wrote:
> Anthony C wrote:
> > I am running a heavy load test on an array connected to a SLES10 SP2
> > system (2.6.16.60-0.21) using open-iscsi and device-mapper multipath
> > that comes with the distro.
>
> > The test will also initiate a lun reset from time to time.  For
> > whatever reason, after a couple of hours I/O would just hang on a path.
> > And iscsiadm -m session -P3 would show that a path is in the "recovery"
> > state (see below) while the other paths are in the "running" state.
>
> >         Current Portal: 192.0.1.144:3260,1
> >         Persistent Portal: 192.0.1.144:3260,1
> >                 **
> >                 Interface:
> >                 **
> >                 Iface Name: default
> >                 Iface Transport: tcp
> >                 Iface Initiatorname: iqn.1996-04.de.suse:01:7284eb499690
> >                 Iface IPaddress: 192.0.1.97
> >                 Iface HWaddress: default
> >                 Iface Netdev: default
> >                 SID: 1
> >                 iSCSI Connection State: LOGGED IN
> >                 iSCSI Session State: Unknown
> >                 Internal iscsid Session State: NO CHANGE
> >                 
> >                 Negotiated iSCSI params:
> >                 
> >                 HeaderDigest: None
> >                 DataDigest: None
> >                 MaxRecvDataSegmentLength: 131072
> >                 MaxXmitDataSegmentLength: 524288
> >                 FirstBurstLength: 262144
> >                 MaxBurstLength: 2097152
> >                 ImmediateData: No
> >                 InitialR2T: Yes
> >                 MaxOutstandingR2T: 1
> >                 
> >                 Attached SCSI devices:
> >                 
> >                 Host Number: 4  State: recovery
>

> What is the lun state that is output for each device/path right after this?

The LUN states are "running".

>
>
>
> > So while the other paths are deemed good by iscsi, and multipathd agrees
> > according to the output in /var/log/messages, neither test I/O nor multipathd
> > check I/O is going out on the wire on the so-called "bad" path.  It
> > seems they are being held back and never complete.
>
> When you see the Host state as recovery it means that a scsi command has
> timed out at the scsi layer, and that the scsi layer has started its
> error handler. If you do
>
> cat /sys/class/scsi_host/host4/host_busy
>
> you can see how many commands are stuck in recovery.
>
> At this time the scsi layer will have the driver try to abort and retry
> each command that is outstanding. If that fails the scsi layer will have
> the driver do a lun reset. And if that fails the scsi layer will have us
> do a host reset, in which the driver drops the session and then tries to
> relogin. If we try to drop and relogin, then these values
>
>  >                 iSCSI Connection State: LOGGED IN
>  >                 iSCSI Session State: Unknown
>  >                 Internal iscsid Session State: NO CHANGE
>
> would indicate that we are trying to log in and the session is not logged in.
> The device states (the output you did not include) would be blocked.

Once the hang occurs, I can leave it for days and it will still
be hung.  And I haven't seen the iSCSI states above or the LUN state change
in the meantime.

I thought that with dm-multipath, I/O error handling would be limited to
the scsi layer.  And even with this error handling being done, it
shouldn't hang indefinitely, should it?  Something is not right.

>
> If the recovery/replacement timeout eventually fires while we are trying
> to log back in, this will signal to the driver to give up and in this
> case the Host state will be online, but the devices will show offline,
> and the iscsi conn/session states above will indicate that we are in a
> failed state (the internal iscsid state will actually show that it is
> still trying to log back in, because it keeps retrying in case the
> connection does come back).
>

Btw, I have replacement_timeout=5 and the default scsi device timeout of 60s.

>
>
> > On the trace, the only I/O on the "bad" path is iSCSI nop.  So who is
> > holding back all the I/O?  scsi mid-layer or iscsi or both?
>
> So it is both. The scsi layer initially blocks up io, but if the aborts and
> lun resets fail, then the iscsi layer will block things up until the
> replacement/recovery timeout has fired.



Re: I/O hang on SLES10 SP2 with Device-Mapper

2008-11-19 Thread Mike Christie

Anthony C wrote:
> I am running a heavy load test on an array connected to a SLES10 SP2
> system (2.6.16.60-0.21) using open-iscsi and device-mapper multipath
> that comes with the distro.
> 
> The test will also initiate a lun reset from time to time.  For
> whatever reason, after a couple of hours I/O would just hang on a path.
> And iscsiadm -m session -P3 would show that a path is in the "recovery"
> state (see below) while the other paths are in the "running" state.
> 
> Current Portal: 192.0.1.144:3260,1
> Persistent Portal: 192.0.1.144:3260,1
> **
> Interface:
> **
> Iface Name: default
> Iface Transport: tcp
> Iface Initiatorname: iqn.1996-04.de.suse:01:7284eb499690
> Iface IPaddress: 192.0.1.97
> Iface HWaddress: default
> Iface Netdev: default
> SID: 1
> iSCSI Connection State: LOGGED IN
> iSCSI Session State: Unknown
> Internal iscsid Session State: NO CHANGE
> 
> Negotiated iSCSI params:
> 
> HeaderDigest: None
> DataDigest: None
> MaxRecvDataSegmentLength: 131072
> MaxXmitDataSegmentLength: 524288
> FirstBurstLength: 262144
> MaxBurstLength: 2097152
> ImmediateData: No
> InitialR2T: Yes
> MaxOutstandingR2T: 1
> 
> Attached SCSI devices:
> 
> Host Number: 4  State: recovery

What is the lun state that is output for each device/path right after this?

> 
> So while the other paths are deemed good by iscsi, and multipathd agrees
> according to the output in /var/log/messages, neither test I/O nor multipathd
> check I/O is going out on the wire on the so-called "bad" path.  It
> seems they are being held back and never complete.

When you see the Host state as recovery it means that a scsi command has
timed out at the scsi layer, and that the scsi layer has started its
error handler. If you do

cat /sys/class/scsi_host/host4/host_busy

you can see how many commands are stuck in recovery.
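
To run the same check across every host at once, something like the sketch below could be used. It assumes each host directory also carries a `state` attribute next to `host_busy`; the function name `scsi_host_status` and the `root` parameter are illustrative only, and hosts missing either file are skipped.

```python
from pathlib import Path


def scsi_host_status(root="/sys/class/scsi_host"):
    """Return {host_name: (state, host_busy)} for every host under root."""
    status = {}
    for host_dir in sorted(Path(root).glob("host*")):
        try:
            state = (host_dir / "state").read_text().strip()
            busy = int((host_dir / "host_busy").read_text().strip())
        except (OSError, ValueError):
            continue  # attribute missing or unreadable: skip this host
        status[host_dir.name] = (state, busy)
    return status


if __name__ == "__main__":
    for name, (state, busy) in scsi_host_status().items():
        # A host in recovery with a non-zero count has commands held back.
        note = "  <-- commands held in recovery" if state == "recovery" and busy else ""
        print(f"{name}: state={state} host_busy={busy}{note}")
```

A host that reports recovery with a non-zero count is the case described above; a host stuck in recovery with a count of zero would be a different situation and worth reporting as such.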

At this time the scsi layer will have the driver try to abort and retry 
each command that is outstanding. If that fails the scsi layer will have 
the driver do a lun reset. And if that fails the scsi layer will have us
do a host reset, in which the driver drops the session and then tries to
relogin. If we try to drop and relogin, then these values

 > iSCSI Connection State: LOGGED IN
 > iSCSI Session State: Unknown
 > Internal iscsid Session State: NO CHANGE

would indicate that we are trying to log in and the session is not logged in.
The device states (the output you did not include) would be blocked.

If the recovery/replacement timeout eventually fires while we are trying 
to log back in, this will signal to the driver to give up and in this 
case the Host state will be online, but the devices will show offline, 
and the iscsi conn/session states above will indicate that we are in a 
failed state (the internal iscsid state will actually show that it is
still trying to log back in, because it keeps retrying in case the
connection does come back).
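
The stages described above can be written down as a rough lookup from the observed states to where recovery probably stands. This is a sketch under assumptions, not how the driver decides anything: the function name, the stage labels, and the exact state strings it compares against ("recovery", "blocked", "offline", "logged in") are all illustrative.

```python
SCSI_EH = "scsi error handler (abort / lun reset)"
HOST_RESET = "host reset (session drop and relogin)"
GAVE_UP = "gave up after replacement timeout"
HEALTHY = "no recovery in progress"


def recovery_stage(host_state, device_states, connection_state):
    """Guess the recovery stage from the host, device and connection states."""
    host = host_state.strip().lower()
    conn = connection_state.strip().lower()
    devices = {state.strip().lower() for state in device_states}

    if host == "recovery":
        # Blocked devices or a connection that is no longer logged in
        # suggest the driver has dropped the session and is logging back in.
        if "blocked" in devices or conn != "logged in":
            return HOST_RESET
        return SCSI_EH
    if devices and devices == {"offline"}:
        # Host no longer in recovery but every device offline: the
        # replacement timeout fired and the driver gave up.
        return GAVE_UP
    return HEALTHY


if __name__ == "__main__":
    print(recovery_stage("recovery", ["running", "running"], "LOGGED IN"))
```

With a host in recovery, devices running and the connection still LOGGED IN, the sketch returns the scsi error handler stage, meaning recovery has not yet reached the session drop.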


> 
> On the trace, the only I/O on the "bad" path is iSCSI nop.  So who is
> holding back all the I/O?  scsi mid-layer or iscsi or both?
>

So it is both. The scsi layer initially blocks up io, but if the aborts and
lun resets fail, then the iscsi layer will block things up until the
replacement/recovery timeout has fired.

