Re: Scientific Linux 4.6 (RHEL Deriv) and "Failing command cdb 0x8a..."

2008-07-11 Thread Mike Christie

John Reddy wrote:
> Obviously looking for some help, if any is there to be had.
> 
> Behavior I'm seeing is a log entry stating "ping timeout of 5 secs
> expired" then session dropped.  Roughly 2 seconds of "Failing
> command..." then the session is re-established.  (log excerpt below).
> This has caused corruption in the ext3 journal, making the
> filesystem go read-only until it's fsck'd.
> 
> I'm seeing a few ways this could be debugged, but I'm still new to
> iSCSI, so direction would be appreciated.  The "ping timeout of 5 sec"
> followed by 2 seconds, followed by re-establishing.  Is there a
> tunable parameter that I missed which would make the system less
> vulnerable to these inexplicable lags?

In /etc/iscsi.conf there is:

ActiveTimeout
PingTimeout
IdleTimeout

If you set these to zero, the ping check is turned off. You can also set 
them higher, to 30 secs or whatever is safest for your setup.

You also might want to consider using dm-multipath over iscsi, and then 
using dm-multipath's no_path_retry option.

> 
> I have yet to investigate the source of these lags.  The system does
> push a lot of traffic, ~ 200Mbit/sec sustained over the Gig eth.  I've
> also looked at updating my ethernet drivers.  The Broadcom drivers
> have been updated since the version released with SL 4.6.

The initiator sends the target an iSCSI ping (nop) every ActiveTimeout 
seconds while IO is running (if no IO is running, it sends one every 
IdleTimeout seconds).  If the initiator does not get a response to the 
nop within PingTimeout seconds, you get the ping timeout failover. If 
that happens 5 times to the same command, the command is failed to the 
FS and you get the FS errors.
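
To make the timing concrete, here is a rough Python sketch of that check, 
fed with the three timestamps from the "ping timeout of 5 secs expired" 
log line in the original report. The function names are made up for 
illustration and this is not the driver's code. It also assumes the 
timestamps are jiffies at 1000 per second, which I have not verified for 
this kernel.

```python
# Illustrative model of the nop (ping) timeout check; not the driver's code.
HZ = 1000  # assumption: log timestamps are jiffies, 1000 per second


def seconds_between(earlier, later, hz=HZ):
    """Convert a difference between two jiffies values to seconds."""
    return (later - earlier) / hz


def ping_timed_out(last_rx, last_ping, now, ping_timeout, hz=HZ):
    """True if a ping is outstanding (sent after the last received data)
    and at least ping_timeout seconds have passed since it was sent."""
    ping_outstanding = last_ping > last_rx
    return ping_outstanding and seconds_between(last_ping, now, hz) >= ping_timeout


# Values copied from the reported log line.
last_rx, last_ping, now = 4318639488, 4318644488, 4318649488

print(seconds_between(last_rx, last_ping))  # quiet time before the ping was sent
print(seconds_between(last_ping, now))      # time spent waiting for the reply
print(ping_timed_out(last_rx, last_ping, now, ping_timeout=5))
```

Under that assumption both gaps come out to 5 seconds: nothing was 
received for 5 seconds, a ping went out, and no reply arrived in the 
next 5 seconds.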




Scientific Linux 4.6 (RHEL Deriv) and "Failing command cdb 0x8a..."

2008-07-11 Thread John Reddy

Obviously looking for some help, if any is there to be had.

Behavior I'm seeing is a log entry stating "ping timeout of 5 secs
expired" then session dropped.  Roughly 2 seconds of "Failing
command..." then the session is re-established.  (log excerpt below).
This has caused corruption in the ext3 journal, making the
filesystem go read-only until it's fsck'd.

I'm seeing a few ways this could be debugged, but I'm still new to
iSCSI, so direction would be appreciated.  The "ping timeout of 5 sec"
followed by 2 seconds, followed by re-establishing.  Is there a
tunable parameter that I missed which would make the system less
vulnerable to these inexplicable lags?

I have yet to investigate the source of these lags.  The system does
push a lot of traffic, ~ 200Mbit/sec sustained over the Gig eth.  I've
also looked at updating my ethernet drivers.  The Broadcom drivers
have been updated since the version released with SL 4.6.

The system is Scientific Linux 4.6 (SL is a RHEL derivative like
CentOS) running on a quad - dual AMD, Tyan S4882 motherboard with
built-in Broadcom gig-ethernet.  The storage array is from Rorke
Data, which is basically a re-branded Infortrend A16E-G2130-4.  The
iscsi target and the array are on the same subnet.

My kernel and iscsi-initiator-utils are as up-to-date as I can get
them:

# cat /etc/redhat-release
Scientific Linux SL release 4.6 (Beryllium)
# uname -r
2.6.9-67.0.15.ELsmp
# rpm -qi iscsi-initiator-utils
Name        : iscsi-initiator-utils        Relocations: (not relocatable)
Version     : 4.0.3.0                           Vendor: Scientific Linux
Release     : 6                             Build Date: Tue Nov 20 17:49:23 2007
Install Date: Tue Jun 24 12:59:15 2008      Build Host: yort.fnal.gov
Group       : System Environment/Daemons    Source RPM: iscsi-initiator-utils-4.0.3.0-6.src.rpm
Size        : 227927                           License: GPL
Signature   : DSA/SHA1, Thu Feb  7 11:39:33 2008, Key ID 25dbef78a7048f8d
URL         : http://linux-iscsi.sourceforge.net/
Summary : iSCSI daemon and utility programs
Description :
The iscsi package provides the server daemon for the iSCSI protocol,
as well as the utility programs used to manage it. iSCSI is a protocol
for distributed disk access using SCSI commands sent over Internet
Protocol networks.


Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: ping timeout of 5 secs expired, last rx 4318639488, last ping 4318644488, now 4318649488
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Session dropped
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x8a task 636019 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x8a task 636033 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x88 task 636034 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x28 task 636035 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x28 task 636036 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x28 task 636037 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x28 task 636038 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x28 task 636039 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x28 task 636040 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x8a task 636041 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x88 task 636042 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x2a task 636043 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x2a task 636044 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x2a task 636045 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x2a task 636046 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x2a task 636047 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x2a task 636048 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x2a task 636049 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x2a task 636050 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x2a task 4294967295 with return code = 0x2
Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command cdb 0x28 task 4294967295 with return code = 0x2
Jul  9 18:08:48
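
For anyone reading the excerpt: the cdb values in the "Failing command" 
lines are SCSI opcodes (0x28 READ(10), 0x2a WRITE(10), 0x88 READ(16), 
0x8a WRITE(16)), so both reads and writes were in flight when the session 
dropped. Below is a small Python sketch that tallies them from a log. 
The function name and the two-entry sample are illustrative only, and 
the pattern just assumes the message format shown above.

```python
import re
from collections import Counter

# Standard SCSI opcodes for the cdb values that appear in the excerpt.
SCSI_OPCODES = {
    0x28: "READ(10)",
    0x2A: "WRITE(10)",
    0x88: "READ(16)",
    0x8A: "WRITE(16)",
}

# \s+ lets the pattern match whether or not an entry is wrapped across lines.
FAILING = re.compile(r"Failing command\s+cdb (0x[0-9a-fA-F]+) task (\d+)")


def count_failed_opcodes(log_text):
    """Count failed commands per SCSI opcode name (hypothetical helper)."""
    counts = Counter()
    for opcode_hex, _task in FAILING.findall(log_text):
        opcode = int(opcode_hex, 16)
        counts[SCSI_OPCODES.get(opcode, opcode_hex)] += 1
    return counts


# Two entries copied from the excerpt, one wrapped and one on a single line.
sample = (
    "Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command\n"
    "cdb 0x8a task 636019 with return code = 0x2\n"
    "Jul  9 18:08:48 HOSTNAME kernel: iscsi-sfnet:host7: Failing command "
    "cdb 0x28 task 636035 with return code = 0x2\n"
)

for name, count in count_failed_opcodes(sample).items():
    print(name, count)
```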