Re: Infortrend + "iSCSI: detected conn error (1011)" + "TCP Dup ACK"

2009-11-09 Thread Santi Saez

On 06/11/09 14:10, mdaitc wrote:

Hi mdaitc,

> I’m seeing similar TCP “weirdness” as the other posts mention as  well
> as the below errors.

(..)

> Nov  2 08:15:14 backup kernel:  connection33:0: detected conn error
> The performance isn’t what I’d expect:

(..)

What happens if you disable the TCP window scaling option on the RHEL servers?

# echo 0 > /proc/sys/net/ipv4/tcp_window_scaling

In our case, the iSCSI "conn errors" stopped after disabling it, but we 
still see a lot of TCP “weirdness” on the network, mainly duplicate ACK 
packets.
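
A minimal sketch of how the setting could be checked and kept across reboots, assuming a stock sysctl setup. The helper name persist_sysctl and its file argument are hypothetical; "sysctl -w" and /etc/sysctl.conf are the usual mechanisms. Window scaling is negotiated when a TCP connection is opened, so already-established iSCSI sessions would presumably need a re-login before the change has any effect.

```shell
#!/bin/sh
# persist_sysctl KEY VALUE FILE
# Hypothetical helper: make sure FILE ends up with exactly one
# "KEY = VALUE" line for KEY, leaving all other lines untouched.
persist_sysctl() {
    key=$1 value=$2 file=$3
    touch "$file"
    # Drop any existing assignment for this key, then append the new one.
    grep -v "^[[:space:]]*${key}[[:space:]]*=" "$file" > "$file.tmp" || true
    echo "${key} = ${value}" >> "$file.tmp"
    mv "$file.tmp" "$file"
}

# Example usage (as root), commented out so the sketch has no side effects:
# sysctl net.ipv4.tcp_window_scaling          # show the current value
# sysctl -w net.ipv4.tcp_window_scaling=0     # same effect as the echo above
# persist_sysctl net.ipv4.tcp_window_scaling 0 /etc/sysctl.conf
```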

Regards,

-- 
Santi Saez
http://woop.es

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to 
open-iscsi+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~--~~~~--~~--~--~---



Re: Infortrend + "iSCSI: detected conn error (1011)" + "TCP Dup ACK"

2009-11-04 Thread Santi Saez

On 03/11/09 0:52, Mike Christie wrote:

Dear Mike,

> You can turn off ping/nops by setting
>
> node.conn[0].timeo.noop_out_interval = 0
> node.conn[0].timeo.noop_out_timeout = 0
>
> (set that in iscsid.conf then rediscovery the target or run "iscsiadm -m
> node -T your_target -o update -n name_of_param_above -v 0"

Thanks! As I said to James in the previous email, disabling TCP window 
scaling *partially solves* this problem; we have kept the nop pings 
enabled in the configuration. But we still have too many "TCP Dup ACKs" 
on the network :-S


> This might just work around. What might happen is that you will not see
> the nop/ping and conn errors and instead would just see a slow down in
> the workloads being run.

I have sent your contact details to the Infortrend developers; an 
engineer will contact you, thanks!

Regards,

-- 
Santi Saez
http://woop.es




Re: Infortrend + "iSCSI: detected conn error (1011)" + "TCP Dup ACK"

2009-11-04 Thread Santi Saez

On 02/11/09 19:43, James A. T. Rice wrote:

Dear James,

> That looks vaguely familiar, although I think mine was nop-out timeout
> (might be reported in another log file). Does it mostly happen when you do
> long sequential reads from the Infortrend unit? In my case it turned out
> to be a very low level of packet drops being caused by a cisco 2960G when
> 'mls qos' was enabled (which due to an IOS bug, didn't increment the drop
> counter). I'm not sure if the loss when 'mls qos' is enabled is by design
> as part of WRED, or a function of the port buffers being divided up into
> things smaller than optimal.
>
> Having TCP window scaling enabled made the problem an order of magnitude
> worse, try disabling it and seeing if you have the same problem still?
> (suggest something like dd if=/dev/sdc of=/dev/null bs=1048576 count=10 to
> see if that triggers it, assuming it was the same problem I was
> suffering).
>
> Every other iSCSI target I've tried recovered pretty gracefully from this,
> but not the Infortrend, I suspect their TCP retransmit algorithm needs a
> lot of love. I suspect it's pathologically broken when window scaling is
> enabled.

Disabling TCP window scaling [1] on Linux solves the nop-out problem: we 
no longer get "iscsi: detected conn error" and performance improves :)

It's very strange: we have 3 Cisco 2960G switches in the SAN and this 
behavior only occurs on two of them. We're looking into this problem in 
depth.

The nop-out problem has been solved, but we still see a lot of 
"duplicate ACKs" on all machines. I will update this post with more 
info. James, thanks a lot for the help.

Regards,

[1] # echo 0 > /proc/sys/net/ipv4/tcp_window_scaling

-- 
Santi Saez
http://woop.es




Infortrend + "iSCSI: detected conn error (1011)" + "TCP Dup ACK"

2009-11-02 Thread Santi Saez


Hi,

Randomly we get Open-iSCSI "conn errors" when connecting to an
Infortrend A16E-G2130-4 storage array. We discussed this earlier on
the list, see:

  http://tr.im/DVQm
  http://tr.im/DVQp

Open-iSCSI logs this:

===
Nov  2 18:34:02 vz-17 kernel: ping timeout of 5 secs expired, last rx 408250499, last ping 408249467, now 408254467
Nov  2 18:34:02 vz-17 kernel:  connection1:0: iscsi: detected conn error (1011)
Nov  2 18:34:03 vz-17 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Nov  2 18:34:07 vz-17 iscsid: connection1:0 is operational after recovery (1 attempts)
Nov  2 18:34:52 vz-17 kernel: ping timeout of 5 secs expired, last rx 408294833, last ping 408299833, now 408304833
Nov  2 18:34:52 vz-17 kernel:  connection1:0: iscsi: detected conn error (1011)
Nov  2 18:34:53 vz-17 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Nov  2 18:34:57 vz-17 iscsid: connection1:0 is operational after recovery (1 attempts)
===
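
The three numbers in each "ping timeout" line look like kernel jiffies timestamps. Assuming that, and assuming HZ=1000 (neither is stated in this thread), the gaps can be worked out with a small sketch. The function name ping_gaps is made up for illustration.

```shell
#!/bin/sh
# ping_gaps LAST_RX LAST_PING NOW [HZ]
# Prints how long ago (in milliseconds) the last receive and the last ping
# happened, treating the inputs as jiffies. HZ defaults to 1000 (assumption).
ping_gaps() {
    last_rx=$1 last_ping=$2 now=$3 hz=${4:-1000}
    rx_ms=$(( (now - last_rx) * 1000 / hz ))
    ping_ms=$(( (now - last_ping) * 1000 / hz ))
    echo "since_rx=${rx_ms}ms since_ping=${ping_ms}ms"
}

# Values from the first log entry above:
ping_gaps 408250499 408249467 408254467
# Values from the second log entry:
ping_gaps 408294833 408299833 408304833
```

Under those assumptions, in both entries the ping had been outstanding for exactly 5 seconds, which matches the 5-second timeout in the message; in the second entry nothing had been received for 10 seconds.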

Running on CentOS 5.4 with "iscsi-initiator-utils-6.2.0.871-0.10.el5";
I think it's not an Open-iSCSI bug, as Mike suggested at:

http://groups.google.com/group/open-iscsi/msg/fe37156096b2955f

I only get this error when connecting to the Infortrend storage, and not
with NetApp, Nexsan, etc. *connected to the same SAN*.

Using Wireshark I see a lot of "TCP Dup ACK", "TCP ACKed lost
segment", etc., and the iSCSI session finally ends in a timeout; see a
screenshot here:

http://tinyurl.com/ykpvckn

Using Wireshark IO graphs I get this strange report about TCP/IP errors:

http://tinyurl.com/ybm4m8x

And this is another report in the same SAN connecting to a NetApp:

http://tinyurl.com/ycgc8ul

These TCP/IP errors only occur when connecting to the Infortrend
storage, and not with other targets in the same SAN (using the same
switch infrastructure). Is there any way to deal with this using
Open-iSCSI? From what I see on the Internet, a lot of Infortrend users
are suffering from this behavior.

Thanks!

P.S.: speed and duplex configuration is correct at every point, and
there are no CRC errors on the switch.

-- 
Santi Saez
http://woop.es




Help with some iSCSI connect random errors

2009-06-25 Thread Santi Saez


Hi,

Randomly I get these iSCSI errors on a Linux box with CentOS 5.3, 
running the default kernel (2.6.18) and using Open-iSCSI 
(6.2.0.868-0.18.el5_3.1):

ping timeout of 5 secs expired, last rx (..)
connection1:0: iscsi: detected conn error (1011)
Kernel reported iSCSI connection 1:0 error (1011) state (3)
session1: iscsi: session recovery timed out after 120 secs
iscsi: cmd 0x28 is not queued (8)
sd 1:0:0:0: SCSI error: return code = 0x0001
end_request: I/O error, dev sdb, sector 226732039
sd 1:0:0:0: SCSI error: return code = 0x0001
end_request: I/O error, dev sdb, sector 187040175

Full log is available at: http://pastebin.com/f40472f99

After that, we need to reboot the server to regain read-write access to
the ext3 filesystem.

We use the default Open-iSCSI config:

http://pastebin.com/f9f15d82

More info about this device:

# cat /sys/block/sdb/device/timeout
60

# cat /sys/class/iscsi_session/session1/recovery_tmo
120

There are more initiators connected to the same target and switch, and 
they are not affected by this situation, so we think that maybe we can 
solve this by changing some Open-iSCSI configuration parameter. Any 
ideas? Thanks!

Regards,

-- 
Santi Saez
http://woop.es




Re: Very strange problem with an Infortrend A16E iSCSI storage array

2009-02-04 Thread Santi Saez


Hi Mike,

On 3/2/09 20:19, Mike Christie wrote:

>> *Randomly*, one of these channels resets, making the 4 servers connected
>> to the channel timeout. The other 3 channels are not affected at all.

(..)

> The initiatior sends a iscsi ping every X seconds. If we do not get a
> response in Y seconds we drop the session (drop connection and relogin).

Yes, we were aware of this bug. In fact, you helped us with it not too 
long ago:

http://tinyurl.com/cywy3j


> There was a bug in the initiator where we would spit out this timeout
> error by accident. What kernel are you using? Are you using the iscsi
> modules in the kernel or modules from a open-iscsi.org release and what
> release of open-iscsi.org?

# iscsiadm -m session -P 3
iSCSI Transport Class version 2.0-724
iscsiadm version 2.0-868
Target: iqn.2002-10.com.infortrend:raid.sn7457155.30
 Current Portal: 10.15.17.133:3260,1
 Persistent Portal: 10.15.17.133:3260,1
 **
 Interface:
 **
 Iface Name: default
 Iface Transport: tcp
 Iface Initiatorname: iqn.2001-05.net.example:vz11
 Iface IPaddress: 10.15.17.137
 Iface HWaddress: default
 Iface Netdev: default
 SID: 2
 iSCSI Connection State: LOGGED IN
 iSCSI Session State: Unknown
 Internal iscsid Session State: NO CHANGE
 
 Negotiated iSCSI params:
 
 HeaderDigest: None
 DataDigest: None
 MaxRecvDataSegmentLength: 131072
 MaxXmitDataSegmentLength: 65536
 FirstBurstLength: 65536
 MaxBurstLength: 262144
 ImmediateData: Yes
 InitialR2T: No
 MaxOutstandingR2T: 1
 
 Attached SCSI devices:
 
 Host Number: 2  State: running
 scsi2 Channel 00 Id 0 Lun: 0
 Attached scsi disk sdb  State: running


We're using CentOS 5.2 with the default "iscsi-initiator-utils" package:

# rpm -qa iscsi-initiator-utils
iscsi-initiator-utils-6.2.0.868-0.7.el5

We are also using the default iSCSI modules.


>> connection4:0: iscsi: detected conn error (1011)
>> session4: iscsi: session recovery timed out after 120 secs
>
> I do not think it is the bug, because you would normally log right back in.
>
> The recovery timed out error means that the initiator tried to log back
> in for 120 seconds and during that time we could not reconnect/relogin.
>
> I think this makes sense when looking at the switch messages below. If
> something causes the link to go down, the iscsi ping would fail/timeout.
>
> I am not sure if the iscsi layer dropping the session would cause the
> link to go down/up.

The link that goes down/up isn't the link between the switch and the 
host; the affected link is between the *switch and the array*, which is 
very strange. It appears that some iSCSI client is causing "something" 
that makes the iSCSI interface on the array reset.

I think it's not a problem with Open-iSCSI but an Infortrend array bug, 
but perhaps someone can shed some light on this problem.

As I said, when this occurs it affects all servers connected to this 
iSCSI interface/channel, including Windows hosts, etc.

Regards,

-- 
Santi Saez
http://woop.es




Very strange problem with an Infortrend A16E iSCSI storage array

2009-02-03 Thread Santi Saez


Hi,

We have a very strange problem with an Infortrend A16E iSCSI storage 
array [1]. I think it's not an Open-iSCSI-related problem, but someone 
here may be able to shed some light :-)

This array has 4 iSCSI interfaces to distribute/balance ethernet 
traffic. There are 16 hosts connected to this array via iSCSI, with 4 
hosts per channel/interface.

*Randomly*, one of these channels resets, making the 4 servers connected 
to the channel timeout. The other 3 channels are not affected at all.

Open-iSCSI logs this:

ping timeout of 5 secs expired, last rx 502453156, last ping 502446907, 
now 502463156
connection4:0: iscsi: detected conn error (1011)
session4: iscsi: session recovery timed out after 120 secs
iscsi: cmd 0x28 is not queued (8)
iscsi: cmd 0x28 is not queued (8)
iscsi: cmd 0x28 is not queued (8)
sd 4:0:0:0: SCSI error: return code = 0x0001
end_request: I/O error, dev sdc, sector 338694423
(..)


The switch port where it is connected shows:

%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/5, 
changed state to down
%LINK-3-UPDOWN: Interface GigabitEthernet0/5, changed state to down
%LINK-3-UPDOWN: Interface GigabitEthernet0/5, changed state to up
%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/5, 
changed state to up


It appears that the iSCSI channel *resets* and starts a port down+up 
process. We have changed the cable and the switch, and we still get the 
same error.

The Infortrend array is logging nothing and the official support people 
have no idea about this issue :-/

We believe that the source of the problem is a single server. When we 
move this server to a different iSCSI channel we get the same error 
there, and the channel where it previously was starts working as 
expected, with no interface resets.

Anyone could say that something in that faulty server is making the 
interface reset; but we've checked it several times and we really 
believe that the server is configured as the other 16 we have attached 
to the array.

The switch connecting the servers and the array is a Cisco Catalyst 2960G.

Has anyone ever experienced anything similar?

Regards,

[1] http://www.infortrend.com/main/2_product/es_a16e-g2130-4.asp

-- 
Santi Saez
http://woop.es




Re: Open-iSCSI error on CentOs -> ping timeout of 5 secs expired

2008-12-17 Thread Santi Saez



On Wed, 17 Dec 2008 11:43:42 -0600, Mike Christie 
wrote:

> bugzilla.redhat.com/show_bug.cgi?id=460158

You are not authorized to access bug #460158.

Thanks in any case; I will contact the Virtuozzo dev team.

Regards,

-- 
Santi Saez
http://woop.es





Re: Open-iSCSI error on CentOs -> ping timeout of 5 secs expired

2008-12-17 Thread Santi Saez



On Wed, 17 Dec 2008 11:12:46 -0600, Mike Christie 
wrote:

> It is an error in that we tried to send a ping to the target and did not
> get a response.
> 
> Are you using the kernel from CentOS 5.2? If so it has a bug in that
> code patch that you might be hitting. The bug is that the code thought
> the ping timedout when it had not, so the driver would fire off the conn
> error and start recovery when we should not have.

Thanks!

Oops, but I have a problem: it's a Virtuozzo-based system, so I don't
have access to the source code to patch this bug. Virtuozzo is a
virtualization system based on a modified Linux kernel, and it's not
open source :(

But it's very strange: we only have this problem *on one server*. Other
servers in the same scenario don't have it (there are more than 10 Linux
servers with the same config).

Do you have the CentOS/Red Hat bug ID, so I can double-check with the
Virtuozzo development team? Thanks!

Regards,

-- 
Santi Saez
http://woop.es





Open-iSCSI error on CentOs -> ping timeout of 5 secs expired

2008-12-17 Thread Santi Saez


Hi,

I'm getting this error on a CentOS 5.2 (i686) box connected via iSCSI to an
Infortrend A16E storage array:


ping timeout of 5 secs expired, last rx 1249707505, last ping 1249712505, now 1249717505
 connection2:0: iscsi: detected conn error (1011)
ping timeout of 5 secs expired, last rx 1252596366, last ping 1252595336, now 1252606366
 connection2:0: iscsi: detected conn error (1011)


As Mike said in this mail [1]:


This happens when we cannot reach the target for the noop timeout and
interval seconds, which can happen if a cable is unplugged or the network
is not reachable or is dropping packets.


We have more than 10 servers connected to the same array, but we only get
this "warning" in one server (using the same hardware, switch, etc..).

It's very strange: there are no network problems on the switch or on the
Linux box, and everything appears OK. A simple "ping" between the Linux
box and the array doesn't lose any packets; 100% of the packets are
transmitted without problems.

This is the target configuration in Open-iSCSI:

==
# iscsiadm -m node -p 10.15.17.133:3260 | grep -i time
node.session.timeo.replacement_timeout = 120
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 30
node.session.err_timeo.host_reset_timeout = 60
node.session.iscsi.DefaultTime2Retain = 0
node.session.iscsi.DefaultTime2Wait = 2
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.auth_timeout = 45
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
==

timeo.noop_out_interval and noop_out_timeout have their default values;
they have not been changed.

What does "iscsi: detected conn error" mean? Is it really a problem, or
only a warning?

I have changed the cable to Cat6, but we still get the same errors. What
can I do to solve this, and what is the cause? Could it really be a
network problem between initiator and target? Thanks!

Regards,

[1] http://lkml.org/lkml/2008/6/25/299

-- 
Santi Saez
http://woop.es





Re: How to install open-iscsi only by moving files

2008-12-08 Thread Santi Saez

Vincent Guo wrote:
> I found the code in the script open-iscsi:
>
> # Source function library.
> . /etc/init.d/functions
>
> What does these codes do ?
> What will happen if I delete the code.
The "functions" file contains general-purpose functions for start/stop 
init scripts, like status(), success(), etc. It's used by a lot of 
scripts in /etc/init.d/*.

What are you trying to do, and what distro are you using? If you want to 
use the Open-iSCSI start/stop script, I think the best option is to 
*copy* the functions file from a Red Hat based distro; removing this 
line is not a good idea.

Regards,

--
Santi Saez
http://woop.es




Best way to take snapshots of iSCSI devices using Open-iSCSI

2008-12-05 Thread Santi Saez


Hi,

I want to take snapshots of iSCSI devices from a target that has no
snapshot/cloning capabilities (it's an Infortrend A16E storage array).

What method are you using to make snapshots/clones of iSCSI targets using
Open-iSCSI? What about using Open-iSCSI + LVM snapshots? For example:

 - Take an LVM snapshot on the initiator with "lvcreate".
 - Give the backup server read-only access to the same LUN/volume.
 - On the backup server, mount this snapshot in read-only mode.
 - Take a backup of this snapshot, using dd/tar/rsync for example.
 - Unmount the snapshot on the backup server.
 - Remove this snapshot from the host with "lvremove".

Is there any software that does this? Thanks!

NOTE: It is a requirement that the device snapshots be stored on another
device, not on the same target.

Regards,

-- 
Santi Saez
http://woop.es





Re: How to install open-iscsi only by moving files

2008-12-05 Thread Santi Saez


On Thu, 4 Dec 2008 04:01:37 -0800 (PST), [EMAIL PROTECTED] wrote:

> but when I input the command
> ./open-iscsi start
> 
> it reports errors:
> 
> ;line 11: can't open /etc/init.d/functions

What Linux flavour are you using? Perhaps it's a Linux From Scratch system?

If it's a Red Hat based distro, you must install the "initscripts" package:

# rpm -q --whatprovides /etc/init.d/functions
initscripts-8.45.19.EL-1.el5.centos.1

The "functions" file contains general-purpose functions for start/stop
init scripts.

Regards,

-- 
Santi Saez
http://woop.es





Re: Correct way to change I/O scheduler in a iSCSI dev

2008-11-28 Thread Santi Saez


On Tue, 25 Nov 2008 10:51:33 -0600, Mike Christie <[EMAIL PROTECTED]>
wrote:

Hi Mike,

> If you want to just config every iscsi device, then you could run
> iscsiadm from a udev rule to check if a device is a iscsi device.
> iscsiadm -m session -P 3 will disaplay the /dev/sdX and LUN for the
> devices so if you parsed that and matched it, then you could set the
> values.
> 
> Maybe we should just add a common iscsi udev rule that users can edit.
> If you someone has one that they want included I will add it, or we can
> try to get it included with the udev package.

In the end, I think using udev to tune the device config is the best and
simplest way.

$ cat /etc/udev/rules.d/99-san.rules

# $Id: 99-san.rules.udev 13 2008-11-28 10:20:32Z santi $
# Set "noop" as I/O scheduler for iSCSI and Fiber Channel devices
ACTION=="add", ENV{ID_FS_USAGE}!="filesystem", ENV{ID_PATH}=="*-iscsi-*", RUN+="/bin/sh -c 'echo noop > /sys$DEVPATH/queue/scheduler'"
ACTION=="add", ENV{ID_FS_USAGE}!="filesystem", ENV{ID_PATH}=="*-fc-*",    RUN+="/bin/sh -c 'echo noop > /sys$DEVPATH/queue/scheduler'"

(To prevent line wrapping, it's also available at
http://pastebin.com/f5ce875a1)

When a new iSCSI or FC device is added, udevd will execute the RUN
command. I set the !="filesystem" condition to avoid running the command
for each partition; I want it to run only for whole block devices.

It would be great to have this example in the udev package, thanks!

Regards,

-- 
Santi Saez
http://woop.es





Re: Correct way to change I/O scheduler in a iSCSI dev

2008-11-25 Thread Santi Saez


On Tue, 25 Nov 2008 15:19:24 +0100, "Bart Van Assche"
<[EMAIL PROTECTED]> wrote:

> Please have a look at the hierarchy created by udevd in /dev. You can
> find there soft links that have a name that does not change over
> sessions and that point to the devices created by the iSCSI initiator
> (/dev/sdb etc).

Bart, thanks for the tip, but I want to distribute this config via Cfengine
or another configuration management system (I need to replicate it on more
than 20 servers). So it would be great to be completely independent of the
iSCSI device name.

It would be fine to tune each device's configuration via /sys when it is
attached to the system; udev appears to be the best way to do this. But
perhaps there's a "standard" method for this using Open-iSCSI, such as
running a command after logging into a target with iscsiadm, or something
like this:

   ENV{ID_PATH}=="*iscsi*", RUN+="echo noop > /sys/$env{DEVNAME}/queue/scheduler"

P.S: I don't know if $env{DEVNAME} is the correct var.. but, something like
this! ;)

Regards,

-- 
Santi Saez
http://woop.es





Correct way to change I/O scheduler in a iSCSI dev

2008-11-25 Thread Santi Saez

Hi,

What's the correct way to change configuration parameters for an iSCSI device? 
For example I/O scheduler, max_sectors_kb, etc...

I could add commands to the S99local script:

  echo noop > /sys/block/sdb/queue/scheduler
  echo 64 > /sys/block/sdb/queue/max_sectors_kb

Unfortunately, iSCSI device names might change from sdb to, say, sdc (server 
reboot, iSCSI target reconnection). If this happens, customizations would be 
lost or applied to a different device :-/

Any workaround for this? sysctl, udev, anything else? What's the "standard 
method" for this task?

Thanks!

--
Santi Saez
http://woop.es





Recommend I/O scheduler for Open-iSCSI

2008-11-17 Thread Santi Saez


Dear Sirs,

We're experiencing low performance with Open-iSCSI on Linux 2.6.18 when
using CFQ as the I/O scheduler. It happens on all servers running
Virtuozzo, with about 10 virtual machines per node. vmstat reports more
than 30-40% I/O wait, and the load average is very high.

After changing the I/O scheduler of the iSCSI device from "cfq" to "noop",
performance is OK, with I/O wait below 5% as expected.
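
A small sketch for checking which scheduler is actually active before and after such a change. It relies on the sysfs convention that the active entry in /sys/block/<dev>/queue/scheduler is shown in square brackets; the function name active_scheduler is made up for illustration.

```shell
#!/bin/sh
# active_scheduler LINE
# Extracts the bracketed (active) entry from the contents of
# /sys/block/<dev>/queue/scheduler, e.g. "noop anticipatory deadline [cfq]".
active_scheduler() {
    echo "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Report the active scheduler of every sd* disk present on this machine.
for f in /sys/block/sd*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}
    dev=${dev%%/*}
    echo "$dev: $(active_scheduler "$(cat "$f")")"
done
```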

Curiously, after changing from "cfq" to "noop" there are fewer reads than
when using cfq! I have measured this using dstat, vmstat, etc., and the
result is always the same :-/

What's the recommended I/O scheduler for iSCSI devices?

And which tool is recommended to benchmark iSCSI devices: fio, bonnie++,
etc.? Thanks!

Regards,

-- 
Santi Saez
http://woop.es





Re: Problems with Open-iSCSI and Infortrend A16E-G2130-4

2008-04-30 Thread Santi Saez


On 29/04/2008, at 18:23, Mike Christie wrote:

>>> Apr 29 10:24:40 vz-10 kernel: scsi1 : iSCSI Initiator over TCP/IP
>>> Apr 29 10:24:41 vz-10 kernel:   Vendor: IFT   Model: A16E-G2130-4  Rev: 361F
>>> Apr 29 10:24:41 vz-10 kernel:   Type:   Direct-Access  ANSI SCSI revision: 04
>>> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
>>> Apr 29 10:24:41 vz-10 kernel: sdb: Write Protect is off
>>> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: drive cache: write back
>>> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
>>> Apr 29 10:24:41 vz-10 kernel: sdb: Write Protect is off
>>> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: drive cache: write back
>>> Apr 29 10:24:41 vz-10 iscsid: connection1:0 is operational now
>>> Apr 29 10:24:44 vz-10 udevd-event[23432]: wait_for_sysfs: waiting for '/sys/devices/platform/host1/session1/target1:0:0/1:0:0:0/ioerr_cnt' failed
>>> Apr 29 10:25:06 vz-10 iscsid: Nop-out timedout after 15 seconds on


Dear Sirs,

The problem was solved by disabling Jumbo Frames on the Infortrend
target.

Linux has Jumbo Frames enabled with a 9000-byte MTU, and the switch
also has this feature enabled. Very curious :-/
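
One way to check whether 9000-byte frames really pass end to end is a ping with fragmentation prohibited. The sketch below only does the size arithmetic; the function name icmp_payload_for_mtu is made up, and the address in the commented usage line is simply the portal address that appears elsewhere in this archive, used as a placeholder.

```shell
#!/bin/sh
# icmp_payload_for_mtu MTU
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes of IPv4 header (no options) minus 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Example usage with Linux iputils ping ("-M do" prohibits fragmentation),
# commented out so the sketch sends no traffic:
# ping -M do -c 3 -s "$(icmp_payload_for_mtu 9000)" 10.15.17.133
```

If such a ping fails while a default-size ping succeeds, some hop on the path (host NIC, switch port, or target port) is probably not passing jumbo frames.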

Regards,

--
Santi Saez





Re: Problems with Open-iSCSI and Infortrend A16E-G2130-4

2008-04-30 Thread Santi Saez


On 29/04/2008, at 20:15, Mike Christie wrote:

>
> Santi Saez wrote:
>>
>> On 29/04/2008, at 19:42, Mike Christie wrote:
>>
>> Dear Mike!!
>>
>>> Are you doing iscsi boot? Or did you start the iscsi service, try to
>>> stop it then try to restart it and one of the steps had errors?
>>
>> No, the server isn't booting from SAN. I'm starting iscsi service
>> manually, restarting the service I get the same error :-/
>
> doh, oh yeah, I forgot you are using centos. That is expected for
> service restarts.

Dear Mike,

I have run some tests with the latest Open-iSCSI version,
"open-iscsi-2.0-869", and now I get a new error:


Apr 30 11:47:35 vz-09 kernel: iscsi: registered transport (tcp)
Apr 30 11:47:39 vz-09 kernel: scsi1 : iSCSI Initiator over TCP/IP
Apr 30 11:47:39 vz-09 kernel:   Vendor: IFT   Model: A16E-G2130-4  Rev: 361F
Apr 30 11:47:39 vz-09 kernel:   Type:   Direct-Access  ANSI SCSI revision: 04
Apr 30 11:47:39 vz-09 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
Apr 30 11:47:39 vz-09 kernel: sdb: Write Protect is off
Apr 30 11:47:39 vz-09 kernel: SCSI device sdb: drive cache: write back
Apr 30 11:47:39 vz-09 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
Apr 30 11:47:39 vz-09 kernel: sdb: Write Protect is off
Apr 30 11:47:39 vz-09 kernel: SCSI device sdb: drive cache: write back
Apr 30 11:47:40 vz-09 iscsid: received iferror -38
Apr 30 11:47:40 vz-09 last message repeated 4 times
Apr 30 11:47:40 vz-09 iscsid: connection1:0 is operational now
Apr 30 11:47:43 vz-09 udevd-event[2871]: wait_for_sysfs: waiting for '/sys/devices/platform/host1/session1/target1:0:0/1:0:0:0/ioerr_cnt' failed
Apr 30 11:47:51 vz-09 iscsid: Nop-out timedout after 5 seconds on connection 1:0 state (3). Dropping session.
Apr 30 11:47:54 vz-09 iscsid: received iferror -38
Apr 30 11:47:54 vz-09 last message repeated 4 times
Apr 30 11:47:54 vz-09 iscsid: connection1:0 is operational after recovery (1 attempts)
Apr 30 11:48:04 vz-09 iscsid: Nop-out timedout after 5 seconds on connection 1:0 state (3). Dropping session.
Apr 30 11:48:07 vz-09 iscsid: received iferror -38
Apr 30 11:48:07 vz-09 last message repeated 4 times
Apr 30 11:48:07 vz-09 iscsid: connection1:0 is operational after recovery (1 attempts)


What does "received iferror -38" mean?

Running the same kernel 2.6.18-53.1.14.el5PAE on a CentOS 5.1 i686 box.
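
One possible reading of that number, assuming iscsid is passing through a
negated Linux errno value (an assumption; nothing in this thread confirms
it): on Linux, errno 38 is ENOSYS, "Function not implemented". A minimal
Python sketch for decoding such a value (the helper name decode_iferror is
made up for illustration):

```python
import errno
import os


def decode_iferror(value):
    """Interpret a negative kernel-style return code as (errno name, message).

    Assumption: the logged number is a negated errno from the local platform.
    """
    code = abs(value)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# The value from the log above; the name it maps to depends on the platform's
# errno table (on Linux, 38 is ENOSYS).
print(*decode_iferror(-38))
```

If that reading is right, the message would mean the kernel side does not
implement something the userspace tools asked for.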

Regards,

--
Santi Saez
Hostalia Internet S.L.U.
http://www.hostalia.com





Re: Problems with Open-iSCSI and Infortrend A16E-G2130-4

2008-04-29 Thread Santi Saez


On 29/04/2008, at 19:42, Mike Christie wrote:

Dear Mike!!

> Are you doing iscsi boot? Or did you start the iscsi service, try to
> stop it then try to restart it and one of the steps had errors?

No, the server isn't booting from SAN. I'm starting the iscsi service
manually, and I get the same error when I restart the service :-/

> For some reason all IO after the inquiry/report_luns does not seem to be
> getting to the target, or the target is not processing them.
>
> Does the Infortrend box have multiple cards/hbas/ports? Does it require
> any ACL type of setup? Do you have to tell it to allow certain
> initiators?
>
> Are there any errors in the target logs?

The target is an Infortrend A16E-G2130-4 box with 4 iSCSI interfaces.
I have tested with CHAP authentication both enabled and disabled, and I
get the same error.

I have one partition, and its LUN is mapped to the first Ethernet
interface; I connect from the Open-iSCSI box to this interface. I have
tried all the interfaces, changing the LUN, the SCSI ID numbers, etc.

Maybe the problem is in the Infortrend target, because I connected
with the same configuration, on CentOS 5.1, a few days ago.

Regards,

--
Santi Saez





Re: Problems with Open-iSCSI and Infortrend A16E-G2130-4

2008-04-29 Thread Santi Saez


On 29/04/2008, at 18:23, Mike Christie wrote:

>> The problem appears to be related to udevd-event? The system is
>> running CentOS 5.1, with kernel 2.6.18-53.1.14.el5PAE, and
>> "iscsi-initiator-utils-6.2.0.865-0.8.el5".
>
> The target does not like our nops. If you set
>
> node.conn[0].timeo.noop_out_interval = 0
> node.conn[0].timeo.noop_out_timeout = 0
>
> It should fix that problem, but if you could check the target logs for
> something about a bad PDU or iSCSI protocol error or anything, we can
> see why this is causing problems.
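
As a sketch of the per-node route (rather than editing the config file),
the Python below only builds and prints the "iscsiadm -m node -T <target>
-o update -n <name> -v 0" command lines for the two nop settings; it does
not run them. The function name is made up for illustration, and the target
name is the one from the node record further down in this message.

```python
import shlex

# The two nop parameters named in the quoted advice.
NOOP_PARAMS = (
    "node.conn[0].timeo.noop_out_interval",
    "node.conn[0].timeo.noop_out_timeout",
)


def noop_disable_commands(target):
    """Return one iscsiadm argv list per nop parameter, setting it to 0."""
    return [
        ["iscsiadm", "-m", "node", "-T", target,
         "-o", "update", "-n", name, "-v", "0"]
        for name in NOOP_PARAMS
    ]


# Print shell-quoted command lines for review before running them by hand.
for argv in noop_disable_commands("iqn.2002-10.com.infortrend:raid.sn7511631.00"):
    print(shlex.join(argv))
```

Building argv lists instead of one string keeps the brackets in the
parameter names away from the shell until they are explicitly quoted.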

Dear Mike,

I get the same error after changing those values, both in the
/etc/iscsi/iscsid.conf file and with "iscsiadm":

> # tail -f -n0 -q /var/log/* 2> /dev/null
> Apr 29 19:28:12 vz-09 iscsid: iSCSI logger with pid=3450 started!
> Apr 29 19:28:12 vz-09 kernel: scsi2 : iSCSI Initiator over TCP/IP
> Apr 29 19:28:12 vz-09 kernel:   Vendor: IFT   Model: A16E-G2130-4  Rev: 361F
> Apr 29 19:28:12 vz-09 kernel:   Type:   Direct-Access  ANSI SCSI revision: 04
> Apr 29 19:28:12 vz-09 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
> Apr 29 19:28:12 vz-09 kernel: sdb: Write Protect is off
> Apr 29 19:28:12 vz-09 kernel: SCSI device sdb: drive cache: write back
> Apr 29 19:28:12 vz-09 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
> Apr 29 19:28:12 vz-09 kernel: sdb: Write Protect is off
> Apr 29 19:28:12 vz-09 kernel: SCSI device sdb: drive cache: write back
> Apr 29 19:28:13 vz-09 iscsid: transport class version 2.0-724. iscsid version 2.0-865
> Apr 29 19:28:13 vz-09 iscsid: iSCSI daemon with pid=3451 started!
> Apr 29 19:28:13 vz-09 iscsid: Could not read data from db. Using default and currently negotiated values
> Apr 29 19:28:13 vz-09 iscsid: connection2:0 is operational now
> Apr 29 19:28:16 vz-09 udevd-event[3470]: wait_for_sysfs: waiting for '/sys/devices/platform/host2/session2/target2:0:0/2:0:0:0/ioerr_cnt' failed
> Apr 29 19:28:16 vz-09 iscsid: connection1:0 is operational after recovery (2 attempts)
> Apr 29 19:29:02 vz-09 kernel:  sdb:<6> connection2:0: iscsi: detected conn error (1011)
> Apr 29 19:29:03 vz-09 iscsid: Kernel reported iSCSI connection 2:0 error (1011) state (3)
> Apr 29 19:29:06 vz-09 kernel: iscsi: host reset succeeded
> Apr 29 19:29:06 vz-09 iscsid: connection2:0 is operational after recovery (2 attempts)


(..)

>>
>
> What are the settings for the PingTimeout, ActiveTimeout and
> IdleTimeout in /etc/iscsi.conf in the 4.6 installation?


PingTimeout = default
ActiveTimeout = default
IdleTimeout = default

Not changed; we're using the default values.

This is the output of the iSCSI config:

# iscsiadm -m node --targetname iqn.2002-10.com.infortrend:raid.sn7511631.00
node.name = iqn.2002-10.com.infortrend:raid.sn7511631.00
node.tpgt = 1
node.startup = automatic
iface.hwaddress = default
iface.iscsi_ifacename = default
iface.net_ifacename = default
iface.transport_name = tcp
node.discovery_address = 10.15.17.130
node.discovery_port = 3260
node.discovery_type = send_targets
node.session.initial_cmdsn = 0
node.session.initial_login_retry_max = 4
node.session.cmds_max = 128
node.session.queue_depth = 32
node.session.auth.authmethod = None
node.session.auth.username = 
node.session.auth.password = 
node.session.auth.username_in = 
node.session.auth.password_in = 
node.session.timeo.replacement_timeout = 120
node.session.err_timeo.abort_timeout = 10
node.session.err_timeo.reset_timeout = 30
node.session.iscsi.FastAbort = No
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.session.iscsi.DefaultTime2Retain = 0
node.session.iscsi.DefaultTime2Wait = 2
node.session.iscsi.MaxConnections = 1
node.session.iscsi.MaxOutstandingR2T = 1
node.session.iscsi.ERL = 0
node.conn[0].address = 10.15.17.130
node.conn[0].port = 3260
node.conn[0].startup = manual
node.conn[0].tcp.window_size = 524288
node.conn[0].tcp.type_of_service = 0
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.auth_timeout = 45
node.conn[0].timeo.active_timeout = 5
node.conn[0].timeo.idle_timeout = 60
node.conn[0].timeo.ping_timeout = 5
node.conn[0].timeo.noop_out_interval = 0
node.conn[0].timeo.noop_out_timeout = 0
node.conn[0].iscsi.MaxRecvDataSegmentLength = 131072
node.conn[0].iscsi.HeaderDigest = None,CRC32C
node.conn[0].iscsi.IFMarker = No
node.conn[0].iscsi.OFMarker = No
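
To double-check a record like the one above without reading it by eye, a
small Python sketch can parse the "key = value" lines and confirm that both
nop settings are 0. The function names are made up for illustration, and
the sample holds only a few lines copied from the output above.

```python
NOOP_KEYS = (
    "node.conn[0].timeo.noop_out_interval",
    "node.conn[0].timeo.noop_out_timeout",
)


def parse_node_record(text):
    """Parse 'key = value' lines into a dict; missing values become ''."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that have no '=' at all
            record[key.strip()] = value.strip()
    return record


def nops_disabled(record):
    """True only if both nop settings are present and set to 0."""
    return all(record.get(key) == "0" for key in NOOP_KEYS)


SAMPLE = """\
node.session.auth.username =
node.conn[0].timeo.ping_timeout = 5
node.conn[0].timeo.noop_out_interval = 0
node.conn[0].timeo.noop_out_timeout = 0
"""

print(nops_disabled(parse_node_record(SAMPLE)))
```

This checks only what the node database holds, not what a session that is
already logged in is actually using.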

Regards!!

--
Santi Saez
Hostalia Internet S.L.U.
http://www.hostalia.com





Problems with Open-iSCSI and Infortrend A16E-G2130-4

2008-04-29 Thread Santi Saez


Dear Sirs,

I'm getting this error when trying to connect to an Infortrend
A16E-G2130-4 storage via iSCSI.

> Apr 29 10:24:40 vz-10 kernel: scsi1 : iSCSI Initiator over TCP/IP
> Apr 29 10:24:41 vz-10 kernel:   Vendor: IFT   Model: A16E-G2130-4  Rev: 361F
> Apr 29 10:24:41 vz-10 kernel:   Type:   Direct-Access  ANSI SCSI revision: 04
> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
> Apr 29 10:24:41 vz-10 kernel: sdb: Write Protect is off
> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: drive cache: write back
> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: 629145600 512-byte hdwr sectors (322123 MB)
> Apr 29 10:24:41 vz-10 kernel: sdb: Write Protect is off
> Apr 29 10:24:41 vz-10 kernel: SCSI device sdb: drive cache: write back
> Apr 29 10:24:41 vz-10 iscsid: connection1:0 is operational now
> Apr 29 10:24:44 vz-10 udevd-event[23432]: wait_for_sysfs: waiting for '/sys/devices/platform/host1/session1/target1:0:0/1:0:0:0/ioerr_cnt' failed
> Apr 29 10:25:06 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
> Apr 29 10:25:10 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
> Apr 29 10:25:36 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
> Apr 29 10:25:39 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
> Apr 29 10:26:05 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
> Apr 29 10:26:09 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
> Apr 29 10:26:34 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
> Apr 29 10:26:38 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
> Apr 29 10:27:03 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
> Apr 29 10:27:07 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
> Apr 29 10:27:32 vz-10 kernel:  sdb:<6>sd 1:0:0:0: SCSI error: return code = 0x0002
> Apr 29 10:27:32 vz-10 kernel: end_request: I/O error, dev sdb, sector 0
> Apr 29 10:27:32 vz-10 kernel: Buffer I/O error on device sdb, logical block 0
> Apr 29 10:27:32 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
> Apr 29 10:27:36 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
> Apr 29 10:28:02 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
> Apr 29 10:28:05 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)
>
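
The drop/recover cycle in the log above can be timed with a short Python
sketch. It is illustrative only: the regular expression is written against
the iscsid lines as they appear in this excerpt (unwrapped, without the
"> " prefix), the function names are made up, and the year 2008 is assumed
because syslog timestamps carry none.

```python
import re
from datetime import datetime

# Matches the iscsid "Nop-out timedout" lines as they appear in this excerpt.
NOP_TIMEOUT = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d\d:\d\d:\d\d) \S+ iscsid: "
    r"Nop-out timedout after (?P<secs>\d+) seconds on connection (?P<conn>\d+:\d+)"
)


def nop_timeouts(lines, year=2008):
    """Yield (timestamp, connection, timeout seconds) per Nop-out timeout line."""
    for line in lines:
        m = NOP_TIMEOUT.match(line)
        if m:
            ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
            yield ts, m["conn"], int(m["secs"])


def gaps_seconds(events):
    """Seconds between consecutive timeout events."""
    times = [ts for ts, _, _ in events]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


LOG = [
    "Apr 29 10:25:06 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.",
    "Apr 29 10:25:10 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)",
    "Apr 29 10:25:36 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.",
    "Apr 29 10:25:39 vz-10 iscsid: connection1:0 is operational after recovery (2 attempts)",
    "Apr 29 10:26:05 vz-10 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.",
]

events = list(nop_timeouts(LOG))
print(gaps_seconds(events))  # prints [30, 29]
```

Over the whole excerpt the session is dropped about every 29 to 30 seconds,
and recovery is logged 3 to 4 seconds after each drop.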

Could the problem be related to udevd-event? The system is running
CentOS 5.1, with kernel 2.6.18-53.1.14.el5PAE, and
"iscsi-initiator-utils-6.2.0.865-0.8.el5".

The "iscsiadm" command hangs at:

[EMAIL PROTECTED]:~
# iscsiadm -m node --targetname iqn.2002-10.com.infortrend:raid.sn7511631.10 -p 10.15.17.131 -l
Login session [iface: default, target: iqn.2002-10.com.infortrend:raid.sn7511631.10, portal: 10.15.17.131,3260]
(..)

It's a strange problem: I get no errors with CentOS 4.6. What could
the problem be? Thanks!

Regards,
--
Santi Saez
Hostalia Internet S.L.U.
http://www.hostalia.com

