On 27.08.2013 06:49, Mike Christie wrote:
The scsi layer sets a timeout for each command. I think the default is 30 or 60 secs in SLES 11. If a command does not complete within that timeout, the scsi error handler runs. The scsi eh basically calls the iscsi eh callouts to try and abort commands and then restart them. If it cannot abort them, it tries lun and target resets, and if those fail we end up dropping the session and relogging in. So that is what is happening here.

You are probably sending too many commands to the device. Either the storage cannot handle them, or the connection is too slow, or some combination of both. Since you have 10 Gig Ethernet, it is probably that the device is too slow. You would want to check your target's logs and see if there are any errors during this time. If not, then lower the queue depth on the initiator side (see the iscsi node.session.queue_depth and node.session.cmds_max params) or increase the scsi command timeout via udev or sysfs (however SUSE recommends).
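For reference, both suggestions can be applied roughly like this; the target IQN, portal address, and the chosen values below are placeholders, not taken from the original thread:

```shell
# Lower the initiator-side queue depth on an existing node record.
# "iqn.example:target" and "192.168.1.10:3260" are placeholders;
# updated node settings take effect on the next session login.
iscsiadm -m node -T iqn.example:target -p 192.168.1.10:3260 \
    -o update -n node.session.cmds_max -v 16
iscsiadm -m node -T iqn.example:target -p 192.168.1.10:3260 \
    -o update -n node.session.queue_depth -v 4

# Raise the SCSI command timeout for one device via sysfs
# (takes effect immediately, but does not survive a reboot):
echo 180 > /sys/block/sda/device/timeout
```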
Hello Mike, thank you for your reply. I've decreased node.session.cmds_max and node.session.queue_depth by a factor of 8 from their defaults (128 and 32) down to node.session.cmds_max = 16 and node.session.queue_depth = 4. And I increased the timeout of the block device from 60 to 180 by issuing the following command, after checking for the right block device of course:
echo 180 > /sys/block/sda/device/timeout

The error still appears. Meanwhile we have been doing a lot more testing. We also tried newer firmware and driver versions which are marked beta, but only to get an idea of where the root cause lies. Beta versions are a no-go for production here.
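Since writing to sysfs is not persistent, the timeout could also be set via udev as Mike mentioned. A sketch of such a rule follows; the file name, the 180-second value, and matching on every SCSI disk are assumptions for illustration:

```shell
# /etc/udev/rules.d/99-scsi-timeout.rules (file name is arbitrary)
# Set a 180 s command timeout on every SCSI disk as it is added,
# so the setting is reapplied after reboots and device rediscovery.
ACTION=="add", SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", \
    SUBSYSTEMS=="scsi", ATTR{device/timeout}="180"
```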
We also tried different Linux distributions: Red Hat 6.4 and Arch Linux. Red Hat with the latest stable firmware and Red Hat stock drivers -> no error. Arch Linux doesn't show the error either. We also tried different file systems on SLES: xfs, ext3 and btrfs. All show the same error. The nobarrier mount option with xfs: same error.
We noticed that the ISCSI_ERR_SCSI_EH_SESSION_RST error only appears with fio's random read test, and even then only in the phase where the program lays out the files from which it will later read for its test, not in the read phase itself. So it is actually writing at that moment! In contrast, fio's random write test doesn't produce the error. I can hammer on the target with 96 jobs each writing 1 GB and I get no error. This is very curious in my eyes.
I also reduced the number of jobs that the fio benchmark runs to only one job, with the file size staying at 8 GB. The error still occurs.
I reduced the file size to 4 GB -> error. Then again to 2 GB, and behold, the error didn't appear! I raised it to 3 GB and got the error again. Then back to 2 GB and got the error again, too. So there seems to be no direct connection between the file size and the error. It feels like some buffers are getting filled, and when they are full, it happens. This is puzzling. :(
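For anyone wanting to reproduce this, a fio invocation along the lines described above (one job, random read, 8 GB file) would look roughly like this; the original command line isn't quoted in the thread, so the job name, target directory, block size, and direct-I/O flag are assumptions:

```shell
# Hypothetical reconstruction of the failing test. The error was seen
# during the layout phase, i.e. while fio writes out the 8 GB test file
# before starting the random reads.
fio --name=randread-test --rw=randread --size=8g \
    --directory=/mnt/iscsi-lun --bs=4k --numjobs=1 --direct=1
```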
Sometimes I think fio is the culprit, but our database import (which we will need regularly in production) triggers the error as well. So we should be glad that fio triggers it too. But we aren't, because we don't know where it comes from.
We have no access to the iscsi target's logs yet, so we cannot take a look at them. :(
Do you have any other ideas?

Kind regards,
Timo
