Re: Infortrend + "iSCSI: detected conn error (1011)" + "TCP Dup ACK"
On 06/11/09 14:10, mdaitc wrote:

Hi mdaitc,

> I’m seeing similar TCP “weirdness” as the other posts mention as well
> as the below errors.
(..)
> Nov 2 08:15:14 backup kernel: connection33:0: detected conn error
> The performance isn’t what I’d expect:
(..)

What happens if you disable the TCP window scaling option on the RHEL servers?

# echo 0 > /proc/sys/net/ipv4/tcp_window_scaling

In our case, the iSCSI "conn errors" stopped after disabling it, but we still see a lot of TCP “weirdness” in the network, mainly duplicate ACK packets.

Regards,

--
Santi Saez
http://woop.es

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups "open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~--~~~~--~~--~--~---
Re: Infortrend + "iSCSI: detected conn error (1011)" + "TCP Dup ACK"
On 03/11/09 0:52, Mike Christie wrote:

Dear Mike,

> You can turn off ping/nops by setting
>
> node.conn[0].timeo.noop_out_interval = 0
> node.conn[0].timeo.noop_out_timeout = 0
>
> (set that in iscsid.conf then rediscovery the target or run "iscsiadm -m
> node -T your_target -o update -n name_of_param_above -v 0"

Thanks!! As I said to James in the previous email, disabling TCP window scaling *partially solves* this problem; we have kept the nop pings enabled in the configuration. But we still have too many "TCP Dup ACKs" in the network :-S

> This might just work around. What might happen is that you will not see
> the nop/ping and conn errors and instead would just see a slow down in
> the workloads being run.

I have sent your contact details to the Infortrend developers; an engineer will contact you, thanks!

Regards,

--
Santi Saez
http://woop.es
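The command in Mike's quote abbreviates the parameter name. As a minimal sketch, the helper below prints the two full "iscsiadm" update commands for one target instead of running them, so they can be reviewed first. The function name "nop_disable_cmds" is hypothetical; the parameter names and the "-o update -n ... -v 0" form are taken from the quoted advice.

```shell
#!/bin/sh
# Print (do not run) the iscsiadm commands that set both nop-out timers to 0
# for a single target. The target IQN is passed as the first argument.
# nop_disable_cmds is a hypothetical helper name, not part of open-iscsi.
nop_disable_cmds() {
    target=$1
    # The parameter names are quoted so the shell does not treat "[0]" as a glob.
    for param in 'node.conn[0].timeo.noop_out_interval' \
                 'node.conn[0].timeo.noop_out_timeout'; do
        printf "iscsiadm -m node -T %s -o update -n '%s' -v 0\n" "$target" "$param"
    done
}
```

For example, "nop_disable_cmds iqn.2002-10.com.infortrend:raid.sn7457155.30" prints two lines that can be run as root once checked. As the quote warns, this may only hide the errors and leave a slowdown in place.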
Re: Infortrend + "iSCSI: detected conn error (1011)" + "TCP Dup ACK"
On 02/11/09 19:43, James A. T. Rice wrote:

Dear James,

> That looks vaguely familiar, although I think mine was nop-out timeout
> (might be reported in another log file). Does it mostly happen when you do
> long sequential reads from the Infortrend unit? In my case it turned out
> to be a very low level of packet drops being caused by a cisco 2960G when
> 'mls qos' was enabled (which due to an IOS bug, didn't increment the drop
> counter). I'm not sure if the loss when 'mls qos' is enabled is by design
> as part of WRED, or a function of the port buffers being divided up into
> things smaller than optimal.
>
> Having TCP window scaling enabled made the problem an order of magnitude
> worse, try disabling it and seeing if you have the same problem still?
> (suggest something like dd if=/dev/sdc of=/dev/null bs=1048576 count=10 to
> see if that triggers it, assuming it was the same problem I was
> suffering).
>
> Every other iSCSI target I've tried recovered pretty gracefully from this,
> but not the Infortrend, I suspect their TCP retransmit algorithm needs a
> lot of love. I suspect it's pathologically broken when window scaling is
> enabled.

Disabling TCP window scaling [1] on Linux solves the nop-out problem: we no longer get "iscsi: detected conn error" and performance improves :)

It's very strange: we have 3 Cisco 2960G switches in the SAN and this behavior only occurs on two of them. We're looking into this problem in depth.

The nop-out issue has been solved, but we still have a lot of "duplicate ACKs" on all machines. I will update this post with more info.

James, thanks a lot for the help.

Regards,

[1] # echo 0 > /proc/sys/net/ipv4/tcp_window_scaling

--
Santi Saez
http://woop.es
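Before and after applying the change in [1], it helps to confirm what the kernel is actually using. The sketch below reads the same procfs entry and reports the state; the function name "window_scaling_state" and its optional path argument (there only so the function can be pointed at another file) are assumptions for illustration.

```shell
#!/bin/sh
# Report whether TCP window scaling is enabled on this host.
# The optional first argument overrides the file to read; by default the
# procfs entry used in the message above is read.
window_scaling_state() {
    file=${1:-/proc/sys/net/ipv4/tcp_window_scaling}
    value=$(cat "$file") || return 1
    case $value in
        0) echo disabled ;;
        1) echo enabled ;;
        *) echo "unknown ($value)" ;;
    esac
}

# The "echo 0 > ..." setting is lost at reboot. The usual way to keep it is a
# line "net.ipv4.tcp_window_scaling = 0" in /etc/sysctl.conf, loaded with
# "sysctl -p" (run as root).
```

Running "window_scaling_state" with no argument on each initiator is a quick way to check that all hosts in the SAN are configured the same before comparing packet captures.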
Infortrend + "iSCSI: detected conn error (1011)" + "TCP Dup ACK"
Hi,

We randomly get Open-iSCSI "conn errors" when connecting to an Infortrend A16E-G2130-4 storage array. We discussed this earlier on the list, see:

http://tr.im/DVQm
http://tr.im/DVQp

Open-iSCSI logs this:

===
Nov 2 18:34:02 vz-17 kernel: ping timeout of 5 secs expired, last rx 408250499, last ping 408249467, now 408254467
Nov 2 18:34:02 vz-17 kernel: connection1:0: iscsi: detected conn error (1011)
Nov 2 18:34:03 vz-17 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Nov 2 18:34:07 vz-17 iscsid: connection1:0 is operational after recovery (1 attempts)
Nov 2 18:34:52 vz-17 kernel: ping timeout of 5 secs expired, last rx 408294833, last ping 408299833, now 408304833
Nov 2 18:34:52 vz-17 kernel: connection1:0: iscsi: detected conn error (1011)
Nov 2 18:34:53 vz-17 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Nov 2 18:34:57 vz-17 iscsid: connection1:0 is operational after recovery (1 attempts)
===

We're running CentOS 5.4 with "iscsi-initiator-utils-6.2.0.871-0.10.el5". I don't think it's an Open-iSCSI bug, as Mike suggested at:

http://groups.google.com/group/open-iscsi/msg/fe37156096b2955f

I only get this error when connecting to the Infortrend storage, and not with NetApp, Nexsan, etc. *connected in the same SAN*.

Using Wireshark I see a lot of "TCP Dup ACK", "TCP ACKed lost segment", etc., and the iSCSI session finally ends in a timeout. See a screenshot here:

http://tinyurl.com/ykpvckn

Using Wireshark IO graphs I get this strange report about TCP/IP errors:

http://tinyurl.com/ybm4m8x

And this is another report in the same SAN, connecting to a NetApp:

http://tinyurl.com/ycgc8ul

Those TCP/IP errors only occur when connecting to the Infortrend storage, and not with other targets in the same SAN (using the same switch infrastructure). Is there any way to deal with this using Open-iSCSI? From what I see on the Internet, a lot of Infortrend users are suffering from this behavior.

Thanks!

P.S.: Speed and duplex configuration is correct at every point, and there are no CRC errors on the switch.

--
Santi Saez
http://woop.es
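The three counters in the "ping timeout" lines can be compared directly. The sketch below extracts them and prints how long ago the last receive and the last ping happened. It assumes the counters are kernel jiffies and that this kernel ticks 1000 times per second, which is consistent with the 5000-tick gap matching the reported "5 secs"; neither assumption is stated in the log itself. The function name "ping_timeout_ages" is hypothetical.

```shell
#!/bin/sh
# Given one "ping timeout" kernel log line, print the age of the last receive
# and of the last ping, in the same units as the logged counters
# (assumed here to be jiffies).
ping_timeout_ages() {
    printf '%s\n' "$1" | sed -n \
        's/.*last rx \([0-9]*\), last ping \([0-9]*\), now \([0-9]*\).*/\1 \2 \3/p' |
    while read -r rx ping now; do
        echo "since_rx=$((now - rx)) since_ping=$((now - ping))"
    done
}
```

For the first log line above this gives since_rx=3968 and since_ping=5000: data was received after the ping was sent, but the ping itself went unanswered for the full timeout. For the second line, nothing at all was received during the 5000 ticks before the ping.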
Help with some iSCSI connect random errors
Hi,

I randomly get these iSCSI errors on a Linux box with CentOS 5.3, running the default kernel (2.6.18) and using Open-iSCSI (6.2.0.868-0.18.el5_3.1):

ping timeout of 5 secs expired, last rx (..)
connection1:0: iscsi: detected conn error (1011)
Kernel reported iSCSI connection 1:0 error (1011) state (3)
session1: iscsi: session recovery timed out after 120 secs
iscsi: cmd 0x28 is not queued (8)
sd 1:0:0:0: SCSI error: return code = 0x0001
end_request: I/O error, dev sdb, sector 226732039
sd 1:0:0:0: SCSI error: return code = 0x0001
end_request: I/O error, dev sdb, sector 187040175

The full log is available at: http://pastebin.com/f40472f99

After that, we need to reboot the server to recover read-write access to the ext3 filesystem.

We use the default Open-iSCSI config: http://pastebin.com/f9f15d82

More info about this device:

# cat /sys/block/sdb/device/timeout
60

# cat /sys/class/iscsi_session/session1/recovery_tmo
120

There are more initiators connected to the same target and switch, and they are not affected by this situation, so we think that changing some Open-iSCSI configuration parameter might solve this. Any ideas? Thanks!!

Regards,

--
Santi Saez
http://woop.es
Re: Very strange problem with an Infortrend A16E iSCSI storage array
Hi Mike,

On 3/2/09 20:19, Mike Christie wrote:

>> *Randomly*, one of these channels resets, making the 4 servers connected
>> to the channel timeout. The other 3 channels are not affected at all.
(..)
> The initiatior sends a iscsi ping every X seconds. If we do not get a
> response in Y seconds we drop the session (drop connection and relogin).

Yes, we were aware of this bug. In fact, you helped us with it not too long ago:

http://tinyurl.com/cywy3j

> There was a bug in the initiator where we would spit out this timeout
> error by accident. What kernel are you using? Are you using the iscsi
> modules in the kernel or modules from a open-iscsi.org release and what
> release of open-iscsi.org?

# iscsiadm -m session -P 3
iSCSI Transport Class version 2.0-724
iscsiadm version 2.0-868
Target: iqn.2002-10.com.infortrend:raid.sn7457155.30
Current Portal: 10.15.17.133:3260,1
Persistent Portal: 10.15.17.133:3260,1
** Interface: **
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.2001-05.net.example:vz11
Iface IPaddress: 10.15.17.137
Iface HWaddress: default
Iface Netdev: default
SID: 2
iSCSI Connection State: LOGGED IN
iSCSI Session State: Unknown
Internal iscsid Session State: NO CHANGE
Negotiated iSCSI params:
HeaderDigest: None
DataDigest: None
MaxRecvDataSegmentLength: 131072
MaxXmitDataSegmentLength: 65536
FirstBurstLength: 65536
MaxBurstLength: 262144
ImmediateData: Yes
InitialR2T: No
MaxOutstandingR2T: 1
Attached SCSI devices:
Host Number: 2 State: running
scsi2 Channel 00 Id 0 Lun: 0
Attached scsi disk sdb State: running

We're using CentOS 5.2 with the default "iscsi-initiator-utils" package:

# rpm -qa iscsi-initiator-utils
iscsi-initiator-utils-6.2.0.868-0.7.el5

We're also using the default iSCSI modules.

>> connection4:0: iscsi: detected conn error (1011)
>> session4: iscsi: session recovery timed out after 120 secs
>
> I do not think it is the bug, because you would normally log right back in.
>
> The recovery timed out error means that the initiator tried to log back
> in for 120 seconds and during that time we could not reconnect/relogin.
>
> I think this makes sense when looking at the switch messages below. If
> something causes the link to go down, the iscsi ping would fail/timeout.
>
> I am not sure if the iscsi layer dropping the session would cause the
> link to go down/up.

The link that goes down/up isn't the link between the switch and the host; the affected link is the one between the *switch and the array*, which is very strange. It appears that some iSCSI client is causing "something" that makes the iSCSI interface in the array reset.

I think it's not a problem with Open-iSCSI but an Infortrend array bug; still, perhaps someone can shed some light on this problem. As I said, when this occurs it affects all servers connected to this iSCSI interface/channel, including Windows hosts, etc.

Regards,

--
Santi Saez
http://woop.es
Very strange problem with an Infortrend A16E iSCSI storage array
Hi,

We have a very strange problem with an Infortrend A16E iSCSI storage array [1]. I don't think it's an Open-iSCSI-related problem, but someone here may be able to shed some light :-)

This array has 4 iSCSI interfaces to distribute/balance Ethernet traffic. There are 16 hosts connected to this array via iSCSI, with 4 hosts per channel/interface.

*Randomly*, one of these channels resets, making the 4 servers connected to the channel time out. The other 3 channels are not affected at all. Open-iSCSI logs this:

ping timeout of 5 secs expired, last rx 502453156, last ping 502446907, now 502463156
connection4:0: iscsi: detected conn error (1011)
session4: iscsi: session recovery timed out after 120 secs
iscsi: cmd 0x28 is not queued (8)
iscsi: cmd 0x28 is not queued (8)
iscsi: cmd 0x28 is not queued (8)
sd 4:0:0:0: SCSI error: return code = 0x0001
end_request: I/O error, dev sdc, sector 338694423
(..)

The switch port where it is connected shows:

%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/5, changed state to down
%LINK-3-UPDOWN: Interface GigabitEthernet0/5, changed state to down
%LINK-3-UPDOWN: Interface GigabitEthernet0/5, changed state to up
%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/5, changed state to up

It looks as if the iSCSI channel *resets* and the port goes down and up again. We have changed the cable and the switch, and we still get the same error. The Infortrend array logs nothing, and the official support people have no idea about this issue :-/

We believe that the source of the problem is a single server. When we move this server to a different iSCSI channel we get the same error there, and the channel where it previously was starts working as expected, with no interface resets.

One could say that something in that faulty server is making the interface reset, but we've checked it several times and we really believe that the server is configured like the other 16 we have attached to the array.

The switch connecting the servers and the array is a Cisco Catalyst 2960G.

Has anyone ever experienced anything similar?

Regards,

[1] http://www.infortrend.com/main/2_product/es_a16e-g2130-4.asp

--
Santi Saez
http://woop.es
Re: Open-iSCSI error on CentOs -> ping timeout of 5 secs expired
On Wed, 17 Dec 2008 11:43:42 -0600, Mike Christie wrote:

> bugzilla.redhat.com/show_bug.cgi?id=460158

You are not authorized to access bug #460158.

Thanks in any case, I will contact the Virtuozzo dev team.

Regards,

--
Santi Saez
http://woop.es
Re: Open-iSCSI error on CentOs -> ping timeout of 5 secs expired
On Wed, 17 Dec 2008 11:12:46 -0600, Mike Christie wrote:

> It is an error in that we tried to send a ping to the target and did not
> get a response.
>
> Are you using the kernel from CentOS 5.2? If so it has a bug in that
> code patch that you might be hitting. The bug is that the code thought
> the ping timedout when it had not, so the driver would fire off the conn
> error and start recovery when we should not have.

Thanks!

Oops.. but I have a problem: it's a Virtuozzo-based system, so I don't have access to the source code to patch this bug. Virtuozzo is a virtualization system based on a modified Linux kernel, and it's not open source :(

But it's very strange: we only have this problem *on one server*. The other servers don't have this problem in the same scenario (there are more than 10 Linux servers with the same config).

Do you have the CentOS/Red Hat bug ID, so I can double-check with the Virtuozzo development team? Thanks!

Regards,

--
Santi Saez
http://woop.es
Open-iSCSI error on CentOs -> ping timeout of 5 secs expired
Hi,

I'm getting this error on a CentOS 5.2 (i686) box connected via iSCSI to an Infortrend A16E storage array:

ping timeout of 5 secs expired, last rx 1249707505, last ping 1249712505, now 1249717505
connection2:0: iscsi: detected conn error (1011)
ping timeout of 5 secs expired, last rx 1252596366, last ping 1252595336, now 1252606366
connection2:0: iscsi: detected conn error (1011)

As Mike said in this mail [1]:

  This happens when we cannot reach the target for the noop timeout and
  interval seconds, which can happen if a cable is unplugged or the
  network is not reachable or is dropping packets.

We have more than 10 servers connected to the same array, but we only get this "warning" on one server (using the same hardware, switch, etc.). It's very strange: there are no network problems on the switch or on the Linux box, everything appears OK, and a simple "ping" between the Linux box and the array doesn't lose any packets; 100% of the packets are transmitted without problems.

This is the target configuration in Open-iSCSI:

==
# iscsiadm -m node -p 10.15.17.133:3260 | grep -i time
node.session.timeo.replacement_timeout = 120
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 30
node.session.err_timeo.host_reset_timeout = 60
node.session.iscsi.DefaultTime2Retain = 0
node.session.iscsi.DefaultTime2Wait = 2
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.auth_timeout = 45
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
==

timeo.noop_out_interval and timeo.noop_out_timeout have their default values; they have not been changed.

What does "iscsi: detected conn error" mean? Is it really a problem, or only a warning?

I have changed the cable to Cat6, but we still get the same errors. What can I do to solve this, and what is the cause? Could it really be a network problem between initiator and target? Thanks!

Regards,

[1] http://lkml.org/lkml/2008/6/25/299

--
Santi Saez
http://woop.es
Re: How to install open-iscsi only by moving files
Vincent Guo wrote:

> I found the code in the script open-iscsi:
>
> # Source function library.
> . /etc/init.d/functions
>
> What does these codes do ?
> What will happen if I delete the code.

The "functions" file contains general-purpose functions for start/stop init scripts, such as status(), success(), etc. It's used by a lot of scripts in /etc/init.d/*.

What are you trying to do, and what distro are you using? If you want to use the Open-iSCSI start/stop script, I think the best option is to *copy* the "functions" file from another Red Hat based distro; removing this line is not a good idea.

Regards,

--
Santi Saez
http://woop.es
Best way to take snapshots of iSCSI devices using Open-iSCSI
Hi,

I want to take snapshots of iSCSI devices from a target that has no snapshot/cloning capabilities (it's an Infortrend A16E storage array).

What method are you using to make snapshots/clones of iSCSI targets with Open-iSCSI? What about using Open-iSCSI + LVM snapshots? For example:

- Take an LVM snapshot on the initiator with "lvcreate".
- Give the backup server read-only access to the same LUN/volume.
- On the backup server, mount this snapshot in read-only mode.
- Take a backup of this snapshot, using dd/tar/rsync for example.
- Unmount the snapshot on the backup server.
- Remove this snapshot from the host with "lvremove".

Is there any software that does this? Thanks!

NOTE: It is a hard requirement that the device snapshots are stored on another device, not on the same target.

Regards,

--
Santi Saez
http://woop.es
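The LVM part of the steps above can be sketched as follows. This is a simplified, single-host version under stated assumptions: it mounts the snapshot on the same machine that created it, so it does not cover exporting the LUN read-only to a separate backup server, and it does not by itself satisfy the requirement that the snapshot lives on another device. The helper names ("run", "snapshot_backup") and the "<lv>-snap" naming are hypothetical. With DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# snapshot_backup VG LV SNAP_SIZE MOUNTPOINT ARCHIVE
# Create an LVM snapshot of VG/LV, archive its contents with tar from a
# read-only mount, then unmount and remove the snapshot.
snapshot_backup() {
    vg=$1 lv=$2 size=$3 mnt=$4 archive=$5
    snap="${lv}-snap"
    run lvcreate --snapshot --size "$size" --name "$snap" "/dev/$vg/$lv" || return 1
    # Only archive if the read-only mount succeeded.
    run mount -o ro "/dev/$vg/$snap" "$mnt" &&
        run tar -czf "$archive" -C "$mnt" .
    status=$?
    # Always try to clean up, whatever happened above.
    run umount "$mnt"
    run lvremove -f "/dev/$vg/$snap"
    return $status
}
```

For example, "DRY_RUN=1 snapshot_backup vg0 data 5G /mnt/snap /backup/data.tar.gz" prints the five commands in order. The snapshot size has to be large enough to hold the writes made to the origin volume while the backup runs.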
Re: How to install open-iscsi only by moving files
On Thu, 4 Dec 2008 04:01:37 -0800 (PST), [EMAIL PROTECTED] wrote:

> but when I input the command
> ./open-iscsi start
>
> it reports errors:
>
> ;line 11: can't open /etc/init.d/functions

What Linux flavour are you using? Is it perhaps a Linux From Scratch system? If it's a Red Hat based distro, you must install the "initscripts" package:

# rpm -q --whatprovides /etc/init.d/functions
initscripts-8.45.19.EL-1.el5.centos.1

The "functions" file contains general-purpose functions for start/stop init scripts.

Regards,

--
Santi Saez
http://woop.es
Re: Correct way to change I/O scheduler in a iSCSI dev
On Tue, 25 Nov 2008 10:51:33 -0600, Mike Christie <[EMAIL PROTECTED]> wrote:

Hi Mike,

> If you want to just config every iscsi device, then you could run
> iscsiadm from a udev rule to check if a device is a iscsi device.
> iscsiadm -m session -P 3 will disaplay the /dev/sdX and LUN for the
> devices so if you parsed that and matched it, then you could set the
> values.
>
> Maybe we should just add a common iscsi udev rule that users can edit.
> If you someone has one that they want included I will add it, or we can
> try to get it included with the udev package.

In the end, I think using udev to tune the device config is the best and simplest way.

$ cat /etc/udev/rules.d/99-san.rules
# $Id: 99-san.rules.udev 13 2008-11-28 10:20:32Z santi $
# Set "noop" as I/O scheduler for iSCSI and Fibre Channel devices
ACTION=="add", ENV{ID_FS_USAGE}!="filesystem", ENV{ID_PATH}=="*-iscsi-*", RUN+="/bin/sh -c 'echo noop > /sys$DEVPATH/queue/scheduler'"
ACTION=="add", ENV{ID_FS_USAGE}!="filesystem", ENV{ID_PATH}=="*-fc-*", RUN+="/bin/sh -c 'echo noop > /sys$DEVPATH/queue/scheduler'"

(To prevent line wrapping, it's also available at http://pastebin.com/f5ce875a1)

When a new iSCSI or FC device is added, udevd executes the RUN command. I set the !="filesystem" condition to keep the command from running for each partition; I want it to run only for whole block devices.

It would be great to have this example in the udev package, thanks!!

Regards,

--
Santi Saez
http://woop.es
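To confirm that a rule like this took effect once a device has appeared, the active scheduler can be read back from sysfs, where the kernel marks the selected one with square brackets (for example "noop anticipatory deadline [cfq]"). A minimal sketch follows; the function name "active_scheduler" is hypothetical, and it takes the path of the scheduler file as its argument.

```shell
#!/bin/sh
# Print the currently selected I/O scheduler from a sysfs "scheduler" file,
# i.e. the name the kernel shows between square brackets.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}
```

For example, "active_scheduler /sys/block/sdb/queue/scheduler" should print "noop" for an iSCSI disk after the rule has run, whatever name the disk received on this boot.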
Re: Correct way to change I/O scheduler in a iSCSI dev
On Tue, 25 Nov 2008 15:19:24 +0100, "Bart Van Assche" <[EMAIL PROTECTED]> wrote:

> Please have a look at the hierarchy created by udevd in /dev. You can
> find there soft links that have a name that does not change over
> sessions and that point to the devices created by the iSCSI initiator
> (/dev/sdb etc).

Bart, thanks for the tip, but I want to distribute this config via Cfengine or another configuration management system (I need to replicate it on more than 20 servers). So it would be great for it to be completely independent of the iSCSI device names.

It would be fine to tune each device's configuration via /sys when it is attached to the system, and udev appears to be the best way to do this. But perhaps there's a "standard" method for this using Open-iSCSI, such as running a command after logging into a target with iscsiadm, or something like this:

ENV{ID_PATH}=="*iscsi*", RUN+="echo noop > /sys/$env{DEVNAME}/queue/scheduler"

P.S.: I don't know if $env{DEVNAME} is the correct variable.. but something like this! ;)

Regards,

--
Santi Saez
http://woop.es
Correct way to change I/O scheduler in a iSCSI dev
Hi,

What's the correct way to change configuration parameters for an iSCSI device? For example the I/O scheduler, max_sectors_kb, etc.

I could add commands to the S99local script:

echo noop > /sys/block/sdb/queue/scheduler
echo 64 > /sys/block/sdb/queue/max_sectors_kb

Unfortunately, iSCSI device names might change from sdb to, say, sdc (server reboot, iSCSI target reconnection). If this happens, the customizations would be lost or applied to a different device :-/

Is there any workaround for this? sysctl, udev, anything else? What's the "standard method" for this task?

Thanks!

--
Santi Saez
http://woop.es
Recommend I/O scheduler for Open-iSCSI
Dear Sirs,

We're experiencing poor performance with Open-iSCSI on Linux 2.6.18 when using CFQ as the I/O scheduler. It happens on all servers running Virtuozzo with about 10 virtual machines per node. vmstat reports more than 30-40% I/O wait, and the load average is very high.

After changing the I/O scheduler of the iSCSI device from "cfq" to "noop", performance is OK, with the expected I/O wait of less than 5%.

Curiously, after changing from "cfq" to "noop" there are fewer reads than with "cfq"!! I have measured this using dstat, vmstat, etc., and the result is always the same :-/

What's the recommended I/O scheduler for iSCSI devices?

And which tool is recommended for benchmarking iSCSI devices: fio, bonnie++, etc.? Thanks!

Regards,

--
Santi Saez
http://woop.es
Re: Problems with Open-iSCSI and Infortrend A16E-G2130-4
On 29/04/2008, at 18:23, Mike Christie wrote:

>>> Apr 29 10:24:40 vz-10 kernel: scsi1 : iSCSI Initiator over TCP/IP
>>> Apr 29 10:24:41 vz-10 kernel: Vendor: IFT Model: A16E-G2130-4 Rev: 361F
>>> Apr 29 10:24:41 vz-10 kernel: Type: Direct-Access ANSI SCSI revision: 04
>>> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
>>> Apr 29 10:24:41 vz-10 kernel: sdb: Write Protect is off
>>> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: drive cache: write back
>>> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
>>> Apr 29 10:24:41 vz-10 kernel: sdb: Write Protect is off
>>> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: drive cache: write back
>>> Apr 29 10:24:41 vz-10 iscsid: connection1:0 is operational now
>>> Apr 29 10:24:44 vz-10 udevd-event[23432]: wait_for_sysfs: waiting for '/sys/devices/platform/host1/session1/target1:0:0/1:0:0:0/ioerr_cnt' failed
>>> Apr 29 10:25:06 vz-10 iscsid: Nop-out timedout after 15 seconds on

Dear Sirs,

The problem has been solved by disabling Jumbo Frames on the Infortrend target. Linux has Jumbo Frames enabled with a 9000-byte MTU, and the switch has this feature enabled too.. very curious :-/

Regards,

--
Santi Saez
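One way to test whether jumbo frames really pass end to end is to ping the target with fragmentation prohibited and a payload sized to fill the MTU exactly. The sketch below only computes the payload size and prints the command; it assumes Linux iputils "ping" (where "-M do" prohibits fragmentation) and an IPv4 header without options. The function names are hypothetical.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes of IPv4 header (no options) and 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# mtu_probe_cmd HOST MTU
# Print (do not run) a ping command that sends MTU-sized packets to HOST
# with fragmentation prohibited.
mtu_probe_cmd() {
    echo "ping -M do -c 3 -s $(icmp_payload_for_mtu "$2") $1"
}
```

For example, "mtu_probe_cmd 10.15.17.133 9000" prints a ping with an 8972-byte payload. If that probe fails while the 1500-byte equivalent (payload 1472) succeeds, some device on the path is not handling 9000-byte frames.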
Re: Problems with Open-iSCSI and Infortrend A16E-G2130-4
On 29/04/2008, at 20:15, Mike Christie wrote:

> Santi Saez wrote:
>>
>> On 29/04/2008, at 19:42, Mike Christie wrote:
>>
>> Dear Mike!!
>>
>>> Are you doing iscsi boot? Or did you start the iscsi service, try to
>>> stop it then try to restart it and one of the steps had errors?
>>
>> No, the server isn't booting from SAN. I'm starting iscsi service
>> manually, restarting the service I get the same error :-/
>
> doh, oh yeah, I forgot you are using centos. That is expected for
> service restarts.

Dear Mike,

I have run some tests with the latest Open-iSCSI version, "open-iscsi-2.0-869", and now I get a new error:

Apr 30 11:47:35 vz-09 kernel: iscsi: registered transport (tcp)
Apr 30 11:47:39 vz-09 kernel: scsi1 : iSCSI Initiator over TCP/IP
Apr 30 11:47:39 vz-09 kernel: Vendor: IFT Model: A16E-G2130-4 Rev: 361F
Apr 30 11:47:39 vz-09 kernel: Type: Direct-Access ANSI SCSI revision: 04
Apr 30 11:47:39 vz-09 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
Apr 30 11:47:39 vz-09 kernel: sdb: Write Protect is off
Apr 30 11:47:39 vz-09 kernel: SCSI device sdb: drive cache: write back
Apr 30 11:47:39 vz-09 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
Apr 30 11:47:39 vz-09 kernel: sdb: Write Protect is off
Apr 30 11:47:39 vz-09 kernel: SCSI device sdb: drive cache: write back
Apr 30 11:47:40 vz-09 iscsid: received iferror -38
Apr 30 11:47:40 vz-09 last message repeated 4 times
Apr 30 11:47:40 vz-09 iscsid: connection1:0 is operational now
Apr 30 11:47:43 vz-09 udevd-event[2871]: wait_for_sysfs: waiting for '/sys/devices/platform/host1/session1/target1:0:0/1:0:0:0/ioerr_cnt' failed
Apr 30 11:47:51 vz-09 iscsid: Nop-out timedout after 5 seconds on connection 1:0 state (3). Dropping session.
Apr 30 11:47:54 vz-09 iscsid: received iferror -38
Apr 30 11:47:54 vz-09 last message repeated 4 times
Apr 30 11:47:54 vz-09 iscsid: connection1:0 is operational after recovery (1 attempts)
Apr 30 11:48:04 vz-09 iscsid: Nop-out timedout after 5 seconds on connection 1:0 state (3). Dropping session.
Apr 30 11:48:07 vz-09 iscsid: received iferror -38
Apr 30 11:48:07 vz-09 last message repeated 4 times
Apr 30 11:48:07 vz-09 iscsid: connection1:0 is operational after recovery (1 attempts)

What does "received iferror -38" mean?

This is still running the same kernel, 2.6.18-53.1.14.el5PAE, on a CentOS 5.1 i686 box.

Regards,

--
Santi Saez
Hostalia Internet S.L.U.
http://www.hostalia.com
Re: Problems with Open-iSCSI and Infortrend A16E-G2130-4
On 29/04/2008, at 19:42, Mike Christie wrote:

Dear Mike!!

> Are you doing iscsi boot? Or did you start the iscsi service, try to
> stop it then try to restart it and one of the steps had errors?

No, the server isn't booting from SAN. I'm starting the iscsi service manually, and when I restart the service I get the same error :-/

> For some reason all IO after the inquiry/report_luns does not seem to be
> getting to the target, or the target is not processing them.
>
> Does the Infortrend box have multiple cards/hbas/ports? Does it require
> any ACL type of setup? Do you have to tell it to allow certain
> initiators?
>
> Are there any errors in the target logs?

The target is an Infortrend A16E-G2130-4 box with 4 iSCSI interfaces. I have tested with CHAP authentication both enabled and disabled, and I get the same error either way.

I have one partition, and it is LUN-mapped to the first Ethernet interface; I connect from the Open-iSCSI box to this interface. I have also tried all the other interfaces, changing the LUN, SCSI ID numbers, etc.

Maybe the problem is in the Infortrend target, because I connected with the same config, on CentOS 5.1, some days ago.

Regards,

--
Santi Saez
Re: Problems with Open-iSCSI and Infortrend A16E-G2130-4
On 29/04/2008, at 18:23, Mike Christie wrote:

>> The problem appears to be related to udevd-event? The system is
>> running CentOS 5.1, with kernel 2.6.18-53.1.14.el5PAE, and
>> "iscsi-initiator-utils-6.2.0.865-0.8.el5".
>
> The target does not like our nops. If you set
>
> node.conn[0].timeo.noop_out_interval = 0
> node.conn[0].timeo.noop_out_timeout = 0
>
> It should fix that problem, but if you could check the target logs for
> something about a bad PDU or iSCSI protocol error or anything, we can
> see why this is causing problems.

Dear Mike,

I get the same error after changing those values, both in the /etc/iscsi/iscsi.conf file and with "iscsiadm":

> # tail -f -n0 -q /var/log/* 2> /dev/null
> Apr 29 19:28:12 vz-09 iscsid: iSCSI logger with pid=3450 started!
> Apr 29 19:28:12 vz-09 kernel: scsi2 : iSCSI Initiator over TCP/IP
> Apr 29 19:28:12 vz-09 kernel: Vendor: IFT Model: A16E-G2130-4 Rev: 361F
> Apr 29 19:28:12 vz-09 kernel: Type: Direct-Access ANSI SCSI revision: 04
> Apr 29 19:28:12 vz-09 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
> Apr 29 19:28:12 vz-09 kernel: sdb: Write Protect is off
> Apr 29 19:28:12 vz-09 kernel: SCSI device sdb: drive cache: write back
> Apr 29 19:28:12 vz-09 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
> Apr 29 19:28:12 vz-09 kernel: sdb: Write Protect is off
> Apr 29 19:28:12 vz-09 kernel: SCSI device sdb: drive cache: write back
> Apr 29 19:28:13 vz-09 iscsid: transport class version 2.0-724. iscsid version 2.0-865
> Apr 29 19:28:13 vz-09 iscsid: iSCSI daemon with pid=3451 started!
> Apr 29 19:28:13 vz-09 iscsid: Could not read data from db. Using default and currently negotiated values
> Apr 29 19:28:13 vz-09 iscsid: connection2:0 is operational now
> Apr 29 19:28:16 vz-09 udevd-event[3470]: wait_for_sysfs: waiting for '/sys/devices/platform/host2/session2/target2:0:0/2:0:0:0/ioerr_cnt' failed
> Apr 29 19:28:16 vz-09 iscsid: connection1:0 is operational after recovery (2 attempts)
> Apr 29 19:29:02 vz-09 kernel: sdb:<6> connection2:0: iscsi: detected conn error (1011)
> Apr 29 19:29:03 vz-09 iscsid: Kernel reported iSCSI connection 2:0 error (1011) state (3)
> Apr 29 19:29:06 vz-09 kernel: iscsi: host reset succeeded
> Apr 29 19:29:06 vz-09 iscsid: connection2:0 is operational after recovery (2 attempts)

(..)

> What are the settings for the PingTimeout, ActiveTimeout and IdleTimeout
> in /etc/iscsi.conf in the 4.6 installation?

PingTimeout = default
ActiveTimeout = default
IdleTimeout = default

These have not been changed; we're using the default values. This is the output of the iSCSI config:

# iscsiadm -m node --targetname iqn.2002-10.com.infortrend:raid.sn7511631.00
node.name = iqn.2002-10.com.infortrend:raid.sn7511631.00
node.tpgt = 1
node.startup = automatic
iface.hwaddress = default
iface.iscsi_ifacename = default
iface.net_ifacename = default
iface.transport_name = tcp
node.discovery_address = 10.15.17.130
node.discovery_port = 3260
node.discovery_type = send_targets
node.session.initial_cmdsn = 0
node.session.initial_login_retry_max = 4
node.session.cmds_max = 128
node.session.queue_depth = 32
node.session.auth.authmethod = None
node.session.auth.username =
node.session.auth.password =
node.session.auth.username_in =
node.session.auth.password_in =
node.session.timeo.replacement_timeout = 120
node.session.err_timeo.abort_timeout = 10
node.session.err_timeo.reset_timeout = 30
node.session.iscsi.FastAbort = No
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.session.iscsi.DefaultTime2Retain = 0
node.session.iscsi.DefaultTime2Wait = 2
node.session.iscsi.MaxConnections = 1
node.session.iscsi.MaxOutstandingR2T = 1
node.session.iscsi.ERL = 0
node.conn[0].address = 10.15.17.130
node.conn[0].port = 3260
node.conn[0].startup = manual
node.conn[0].tcp.window_size = 524288
node.conn[0].tcp.type_of_service = 0
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.auth_timeout = 45
node.conn[0].timeo.active_timeout = 5
node.conn[0].timeo.idle_timeout = 60
node.conn[0].timeo.ping_timeout = 5
node.conn[0].timeo.noop_out_interval = 0
node.conn[0].timeo.noop_out_timeout = 0
node.conn[0].iscsi.MaxRecvDataSegmentLength = 131072
node.conn[0].iscsi.HeaderDigest = None,CRC32C
node.conn[0].iscsi.IFMarker = No
node.conn[0].iscsi.OFMarker = No

Regards!!

--
Santi Saez
Hostalia Internet S.L.U.
http://www.hostalia.com
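The node record in the message above lists both noop_out values as 0. As a minimal sketch of checking that without reading the dump by eye, the Python below parses the "key = value" lines that "iscsiadm -m node" prints and tests the two settings Mike named. It assumes the output format shown in the message; `parse_node_record` and `noops_disabled` are hypothetical helper names, not part of open-iscsi, and SAMPLE is a short excerpt of the record quoted above.

```python
from typing import Dict

# The two settings named in the thread for turning off nop-out pings.
NOOP_KEYS = (
    "node.conn[0].timeo.noop_out_interval",
    "node.conn[0].timeo.noop_out_timeout",
)

# Excerpt of the node record quoted in the message above.
SAMPLE = """\
node.name = iqn.2002-10.com.infortrend:raid.sn7511631.00
node.session.auth.username =
node.conn[0].timeo.ping_timeout = 5
node.conn[0].timeo.noop_out_interval = 0
node.conn[0].timeo.noop_out_timeout = 0
"""


def parse_node_record(text: str) -> Dict[str, str]:
    """Turn 'key = value' lines into a dict; lines without '=' are skipped."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def noops_disabled(settings: Dict[str, str]) -> bool:
    """True only if both noop_out settings are present and set to 0."""
    return all(settings.get(key) == "0" for key in NOOP_KEYS)


if __name__ == "__main__":
    print(noops_disabled(parse_node_record(SAMPLE)))
```

This only checks what is stored in the node record. It does not show whether a session that was already logged in picked up the new values.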
Problems with Open-iSCSI and Infortrend A16E-G2130-4
Dear Sirs,

I'm getting this error when trying to connect to an Infortrend A16E-G2130-4 storage array via iSCSI.

> Apr 29 10:24:40 vz-10 kernel: scsi1 : iSCSI Initiator over TCP/IP
> Apr 29 10:24:41 vz-10 kernel: Vendor: IFT Model: A16E-G2130-4 Rev: 361F
> Apr 29 10:24:41 vz-10 kernel: Type: Direct-Access ANSI SCSI revision: 04
> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
> Apr 29 10:24:41 vz-10 kernel: sdb: Write Protect is off
> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: drive cache: write back
> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
> Apr 29 10:24:41 vz-10 kernel: sdb: Write Protect is off
> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: drive cache: write back
> Apr 29 10:24:41 vz-10 iscsid: connection1:0 is operational now
> Apr 29 10:24:44 vz-10 udevd-event[23432]: wait_for_sysfs: waiting for '/sys/devices/platform/host1/session1/target1:0:0/1:0:0:0/ioerr_cnt' failed
> Apr 29 10:25:06 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
> Apr 29 10:25:10 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
> Apr 29 10:25:36 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
> Apr 29 10:25:39 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
> Apr 29 10:26:05 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
> Apr 29 10:26:09 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
> Apr 29 10:26:34 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
> Apr 29 10:26:38 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
> Apr 29 10:27:03 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
> Apr 29 10:27:07 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
> Apr 29 10:27:32 vz-10 kernel: sdb:<6>sd 1:0:0:0: SCSI error: return code = 0x0002
> Apr 29 10:27:32 vz-10 kernel: end_request: I/O error, dev sdb, sector 0
> Apr 29 10:27:32 vz-10 kernel: Buffer I/O error on device sdb, logical block 0
> Apr 29 10:27:32 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
> Apr 29 10:27:36 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
> Apr 29 10:28:02 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
> Apr 29 10:28:05 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)

Could the problem be related to udevd-event? The system is running CentOS 5.1, with kernel 2.6.18-53.1.14.el5PAE and "iscsi-initiator-utils-6.2.0.865-0.8.el5".

The "iscsiadm" command hangs at this point:

[EMAIL PROTECTED]:~ # iscsiadm -m node --targetname iqn.2002-10.com.infortrend:raid.sn7511631.10 -p 10.15.17.131 -l
Login session [iface: default, target: iqn.2002-10.com.infortrend:raid.sn7511631.10, portal: 10.15.17.131,3260]
(..)

It's a strange problem: I get no errors with CentOS 4.6. What could the problem be?

Thanks!!

Regards,

--
Santi Saez
Hostalia Internet S.L.U.
http://www.hostalia.com
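The log in the message above repeats the same "Nop-out timedout ... Dropping session" cycle many times. To make the cadence easier to see, here is a minimal Python sketch that pulls the timestamps of those lines out of a syslog excerpt and returns the gaps between consecutive drops. The function name `nop_timeout_gaps` is hypothetical, the year 2008 is taken from the date of the mail because syslog lines carry no year, and SAMPLE_LOG holds a few of the lines quoted above with the mail quoting removed.

```python
import re
from datetime import datetime
from typing import List

# Matches syslog lines such as:
# "Apr 29 10:25:06 vz-10 iscsid: Nop-out timedout after 15 seconds ..."
NOP_TIMEOUT = re.compile(
    r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ iscsid: Nop-out timedout"
)

SAMPLE_LOG = """\
Apr 29 10:25:06 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
Apr 29 10:25:10 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
Apr 29 10:25:36 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
Apr 29 10:25:39 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
Apr 29 10:26:05 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
Apr 29 10:26:09 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
Apr 29 10:26:34 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
"""


def nop_timeout_gaps(log_text: str, year: int = 2008) -> List[float]:
    """Return seconds between consecutive Nop-out timeout lines."""
    stamps = []
    for line in log_text.splitlines():
        match = NOP_TIMEOUT.match(line.strip())
        if match:
            # Syslog omits the year, so one is supplied explicitly.
            stamps.append(
                datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
            )
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    print(nop_timeout_gaps(SAMPLE_LOG))
```

For the excerpt above the gaps come out as 30, 29 and 29 seconds, so the session is being dropped on a steady cycle, not at random moments.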