Re: Antw: Re: equallogic and double connections for multipathing with Debian 5

2010-08-31 Thread Mike Vallaly

I should have better explained why this script is useful. Tardiness
should be expected with me ;)

We have a setup something like this:

LunX <-> Openiscsi <-> eth2 (10.99.99.101) <-> IscsiTarget (10.99.99.10)
  and
LunX <-> Openiscsi <-> eth3 (10.99.99.102) <-> IscsiTarget (10.99.99.10)

Both 10.99.99.101 (eth2) and 10.99.99.102 (eth3) are on a single Linux
host, and a single IscsiTarget (10.99.99.10) sits on the same network
(10.99.99.0/24). We create two iSCSI TCP sessions to that target, one
per interface.

What happens is that you get a Linux routing table which looks
something like this:


mvall...@darkstar:~$ ip ro sh
10.99.99.0/24 dev eth2  proto kernel  scope link  src 10.99.99.101
10.99.99.0/24 dev eth3  proto kernel  scope link  src 10.99.99.102
default via 10.99.99.1 dev eth2  proto static


Connectivity _WILL_ work, but probably not the way one would expect. I
will attempt to enumerate the unexpected (land-mine) behavior below:

#1. ARP Flux - In Linux an IP address by default belongs to the HOST
and not to the NIC (counterintuitive, I know), meaning _ANY_ active
network interface on a host _MAY_ reply to an ARP request for _ANY_
IP address configured on that host. This gives you a race condition
when two NICs are placed on the same network, as either of them CAN
and WILL respond to ARP requests for the other's address, resulting
(in my personal experience) in unbalanced inbound traffic across the
two NICs.

#2. Routing Table - In the above routing table snippet, you will
notice we have two network routes for 10.99.99.0/24 (one attached to
eth2 and one attached to eth3) that the HOST can use when contacting
the IscsiTarget (10.99.99.10). The land-mine here is that even with two
separate processes (separate open-iscsi sessions), the two routes are
equally specific, so both sessions will use the first match in the
routing table (i.e. "10.99.99.0/24 dev eth2  proto kernel  scope link
src 10.99.99.101") and traffic will exit only via eth2. To make matters
even worse, the standard default gateway of 10.99.99.1 is reachable
only via eth2. It works until the eth2 network link is severed, at
which point we lose all ability to get packets off the 10.99.99.0/24
network despite still having a working eth3.

So the script I attached earlier fixes these issues:

First, it changes the ARP behavior of the SAN interfaces. Specifically,
'echo "1" > /proc/sys/net/ipv4/conf/${interface}/arp_ignore' on each
SAN interface makes the Linux kernel answer an ARP request only on the
NIC that actually holds the requested IP address, rather than on any
interface of the host.
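
A minimal sketch of that ARP change, for anyone who does not have the
script at hand. The function name is hypothetical and eth2/eth3 are simply
the SAN interfaces from the example above. It only prints the commands
rather than running them, so it may differ from what the attached script
really does.

```shell
#!/bin/sh
# Sketch only: prints the arp_ignore commands instead of executing them.
# arp_ignore_cmds is a hypothetical helper name, not part of any tool.

arp_ignore_cmds() {
    # usage: arp_ignore_cmds IFACE...
    for interface in "$@"; do
        # arp_ignore=1: answer an ARP request only if the requested IP is
        # configured on the interface the request arrived on.
        printf 'echo "1" > /proc/sys/net/ipv4/conf/%s/arp_ignore\n' "$interface"
    done
}

# SAN interfaces from the example in this mail.
arp_ignore_cmds eth2 eth3
```

After reviewing the output, it can be piped to a root shell to apply it.
The setting does not survive a reboot unless it is also placed in the
distribution's sysctl configuration.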

Second, it configures iproute2 route rules, used in conjunction with
the standard Linux routing table, which force traffic leaving the HOST
to exit via the interface that owns the packet's source IP address.
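
As a rough illustration of such source-based rules: the sketch below
prints one per-interface routing table plus one rule per source address.
The table numbers 101 and 102 and the function name are assumptions made
up for this example, and the attached script may well do it differently.

```shell
#!/bin/sh
# Sketch only: prints iproute2 commands for source-based routing.
# src_route_cmds and the table numbers are hypothetical choices.

src_route_cmds() {
    # usage: src_route_cmds IFACE SRC_IP NETWORK TABLE
    iface=$1; src=$2; net=$3; table=$4
    # A private table whose only route to the SAN network uses this NIC.
    printf 'ip route add %s dev %s src %s table %s\n' "$net" "$iface" "$src" "$table"
    # Packets sourced from this NIC's address consult that table first.
    printf 'ip rule add from %s table %s\n' "$src" "$table"
}

src_route_cmds eth2 10.99.99.101 10.99.99.0/24 101
src_route_cmds eth3 10.99.99.102 10.99.99.0/24 102
```

Once applied, 'ip route get 10.99.99.10 from 10.99.99.102' should report
eth3 rather than eth2. Note that the rules only help when the session's
socket is actually bound to the intended source address.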

Third it configures a multi-hop default route for the SAN networks,
which provides dead gateway detection and fail-over should one of the
interfaces on the 10.99.99.0/24 network fail.
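
One way such a default route is commonly written with iproute2 is a
single route with several "nexthop" entries. The sketch below builds that
command for the gateway and interfaces from the example. The function name
and the equal weights are assumptions, and how quickly the kernel stops
using a dead next hop depends on the kernel version.

```shell
#!/bin/sh
# Sketch only: prints a multipath default route command.
# multipath_default_cmd is a hypothetical helper; weight 1 is an assumption.

multipath_default_cmd() {
    # usage: multipath_default_cmd GATEWAY IFACE...
    gw=$1
    shift
    cmd="ip route replace default"
    for iface in "$@"; do
        # One nexthop per SAN interface, all pointing at the same gateway.
        cmd="$cmd nexthop via $gw dev $iface weight 1"
    done
    printf '%s\n' "$cmd"
}

multipath_default_cmd 10.99.99.1 eth2 eth3
```

Using "replace" instead of "add" lets the command overwrite the existing
single-interface default route shown in the routing table above.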

Hopefully this helps.. (or explains my madness)

-Mike




On Aug 5, 1:32 am, "Ulrich Windl" wrote:
> >>> Mike Vallaly wrote on 04.08.2010 at 19:54 in message:
>
> > Sorry for the lateness in my reply. Just stumbled across this
> > thread.. ;)
>
> > Part of the problem with MPIO in linux with two (or more) interfaces
> > connected to the same Ethernet segment is "arp flux". Essentially all
> > traffic will by default only exit out one path (mac address) on a
> > multi-homed network. The fix for this is to explicitly tie the
> > interface to a route rule which ensures traffic leaves via the
> > interface the application intended.
>
> Mike,
>
> looking at the script I tried to figure out what the script actually does. 
> Can you describe in more detail what the problem is, and how the script fixes 
> it?
>
> Regards,
> Ulrich
> [...]

-- 
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to open-is...@googlegroups.com.
To unsubscribe from this group, send email to 
open-iscsi+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/open-iscsi?hl=en.



RE: detected conn error (1011)

2010-08-31 Thread Goncalo Gomes
Thanks Hannes and Mike,

Your help has been highly appreciated!

Cheers,
 -Goncalo.

-----Original Message-----
From: Hannes Reinecke [mailto:h...@suse.de] 
Sent: 31 August 2010 14:43
To: Goncalo Gomes
Cc: Mike Christie; open-iscsi@googlegroups.com; Shantanu Mehendale
Subject: Re: detected conn error (1011)

[...]




Re: detected conn error (1011)

2010-08-31 Thread Hannes Reinecke
Goncalo Gomes wrote:
> Hi Hannes,
>
> Thanks. The Citrix XenServer 5.6 distribution kernel is based on the 2.6.27
> tree of SLES 11. We add a few extra patches specific to Xen, dom0 integration
> and some backports from upstream. To the best of my knowledge these additions
> don't touch the iscsi layer, so from the iscsi drivers point of view, I believe
> they are as pristine as the ones in the SuSE kernel and that's why we need the
> patch as the binaries probably will mismatch gcc version and/or the versioning
> that we use e.g 2.6.27.42-0.1.1.xs5.6.0.44.58xen. I do definitely
> appreciate your 'forward thinking' with regards to the issue, though!
>
I just checked, and the resulting patch is indeed like you proposed:

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 32b30f1..441ca8b 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1336,9 +1336,6 @@ int iscsi_queuecommand(struct scsi_cmnd *sc, void (*done)(struct scsi_cmnd *))
 	 */
 	switch (session->state) {
 	case ISCSI_STATE_FAILED:
-		reason = FAILURE_SESSION_FAILED;
-		sc->result = DID_TRANSPORT_DISRUPTED << 16;
-		break;
 	case ISCSI_STATE_IN_RECOVERY:
 		reason = FAILURE_SESSION_IN_RECOVERY;
 		sc->result = DID_IMM_RETRY << 16;

HTH,

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)




RE: detected conn error (1011)

2010-08-31 Thread Goncalo Gomes
Hi Hannes,

Thanks. The Citrix XenServer 5.6 distribution kernel is based on the 2.6.27
tree of SLES 11. We add a few extra patches specific to Xen, dom0 integration
and some backports from upstream. To the best of my knowledge these additions
don't touch the iscsi layer, so from the iscsi driver's point of view, I
believe they are as pristine as the ones in the SuSE kernel. That's why we
need the patch rather than the binaries, which will probably mismatch the gcc
version and/or the versioning that we use, e.g.
2.6.27.42-0.1.1.xs5.6.0.44.58xen. I do definitely appreciate your
'forward thinking' with regards to the issue, though!

Thanks,
 -Goncalo.



-----Original Message-----
From: Hannes Reinecke [mailto:h...@suse.de] 
Sent: 30 August 2010 15:12
To: Goncalo Gomes
Cc: Mike Christie; open-iscsi@googlegroups.com; Shantanu Mehendale
Subject: Re: detected conn error (1011)

Goncalo Gomes wrote:
> Hi,
> 
> On Fri, 2010-08-06 at 15:57 +0100, Hannes Reinecke wrote: 
>> Mike Christie wrote:
>>> ccing Hannes from suse, because this looks like a SLES only bug.
>>>
>>> Hey Hannes,
>>>
>>> The user is using Linux 2.6.27 x86 based on SLES + Xen 3.4 (as dom0)
>>> running a couple of RHEL 5.5 VMs. The underlying storage for these VMs
>>> is iSCSI based via open-iscsi 2.0.870-26.6.1 and a DELL equallogic array.
>>>
>>>
>>> On 08/05/2010 02:21 PM, Goncalo Gomes wrote:
>>>> I've copied both the messages file from the host goncalog140 and the
>>>> patched libiscsi.c. FWIW, I've also included the iscsid.conf. Find these
>>>> files in the link below:
>>>>
>>>> http://promisc.org/iscsi/
>>>
>>> It looks like this chunk from libiscsi.c:iscsi_queuecommand:
>>>
>>> case ISCSI_STATE_FAILED:
>>>         reason = FAILURE_SESSION_FAILED;
>>>         sc->result = DID_TRANSPORT_DISRUPTED << 16;
>>>         break;
>>>
>>> is causing IO errors.
>>>
>>> You want to use something like DID_IMM_RETRY because it can be a long
>>> time between the time the kernel marks the state as ISCSI_STATE_FAILED
>>> until we start recovery and properly get all the device queues blocked,
>>> so we can exhaust all the retries if we use DID_TRANSPORT_DISRUPTED.
>> Yeah, I noticed.
>> But the problem is that multipathing will stall during this time,
>> ie no failover will occur and I/O will stall. Using DID_TRANSPORT_DISRUPTED
>> will circumvent this and we can failover immediately.
>>
>> Sadly I got additional bugreports about this so I think I'll have
>> to revert it.
> 
> I applied and tested the changes Mike Christie suggests. After the LUN
> is rebalanced within the array I no longer see the IO errors and it
> appears the setup is now resilient to the equallogic LUN failover
> process.
> 
> I'm attaching the log from the dmesg merely for sanity check purposes,
> if anyone cares to take a look?
> 
>> I have put some test kernels at
>>
>> http://beta.suse.com/private/hare/sles11/iscsi
> 
> Do the test kernels in the URL above contain the change of
> DID_TRANSPORT_DISRUPTED to DID_IMM_RETRY or is there more to it than
> simply changing the result code? If the latter, would you be able to
> upload the source rpms or a unified patch containing the changes you
> are staging? I'm looking for a more palatable way to test them, given I
> have no SLES box lying around, but will install one if needs be.
> 
Got me confused. How would you test the patch if not on a SLES box?
Presumably you would have to install the new kernel on the instance
you are planning to run the test on. Which for any sane setup would
have to be a SLES box. In which case you can just use the provided
kernel directly and save you the compilation step.

Am I missing something?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
