Re: Connection Errors

Mike Christie Fri, 13 Jun 2008 17:59:38 -0700

swejis wrote:
> I managed to reconfigure the nop-timeouts and set both timeout and
> interval to zero.
> 
> manjula:/ # iscsiadm -m node \
>     -T iqn.1994-12.com.promise.target.a9.39.4.55.1.0.0.20 \
>     -p 192.168.43.5:3260 -o update \
>     -n node.conn[0].timeo.noop_out_interval -v 0
> manjula:/ # iscsiadm -m node \
>     -T iqn.1994-12.com.promise.target.a9.39.4.55.1.0.0.20 \
>     -p 192.168.43.5:3260 -o update \
>     -n node.conn[0].timeo.noop_out_timeout -v 0
> 
> manjula:/ # iscsiadm -m node \
>     --targetname iqn.1994-12.com.promise.target.a9.39.4.55.1.0.0.20 | grep noop
> node.conn[0].timeo.noop_out_interval = 0
> node.conn[0].timeo.noop_out_timeout = 0
> node.conn[0].timeo.noop_out_interval = 0
> node.conn[0].timeo.noop_out_timeout = 0
> 
> Still I'm seeing quite a few connection errors (with the latest patch
> applied)
>


Yeah, something is still setting the nops somehow so we are still 
hitting the same problem. If you do

cat /sys/class/iscsi_connection/connectionX:0/ping_tmo
cat /sys/class/iscsi_connection/connectionX:0/recv_tmo

Do you see 5 for both values (X would be the session number)?
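
If you have more than one session, a small loop saves typing. This is only a sketch: the function name show_nop_timeouts is made up for illustration, and it assumes every connection directory has the ping_tmo and recv_tmo files mentioned above. The optional argument just lets you point it at a different directory.

```shell
# Print ping_tmo and recv_tmo for every iSCSI connection found in sysfs.
# show_nop_timeouts is a hypothetical helper name, not part of open-iscsi.
# $1 optionally overrides the sysfs directory that is scanned.
show_nop_timeouts() {
    base="${1:-/sys/class/iscsi_connection}"
    for conn in "$base"/connection*; do
        # If the glob matched nothing it stays literal; skip it.
        [ -d "$conn" ] || continue
        printf '%s ping_tmo=%s recv_tmo=%s\n' \
            "$(basename "$conn")" \
            "$(cat "$conn/ping_tmo")" \
            "$(cat "$conn/recv_tmo")"
    done
}
```

Paste it into your shell and run show_nop_timeouts with no arguments; you get one line per connection.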

Could you also send the beginning of the log where you log in and see the
SCSI devices get added? I mean the part right after:

Loading iSCSI transport class v2.0-869.
iscsi: registered transport (tcp)

up to the point where you see the last SCSI device get added.

Also, what arch are you running: x86 or x86_64? If the latter, are you
running both a 64-bit kernel and 64-bit userspace?

And could you just run iscsid by hand with

iscsid -d 8 -f &

then log in to the target and send the output? You do not have to do any
other IO. I just want to make sure the params are getting sent to the
kernel correctly.



Oh yeah, I found the reason for the zombie process you were hitting.
There was a bug introduced in 2.6.25 which left them hanging around
even though we removed the host and sessions. You need the attached
patch on your kernel to fix the problem.


In 2.6.25 we started using class_find_child instead of implementing
our own loop and lookup. The problem is that class_find_child would get a
reference to the host's device, then scsi_host_lookup would get an extra
reference to it when it called scsi_host_get(). This patch drops the
ref from class_find_child because scsi_host_get gets a ref if the host
is not being removed.

Signed-off-by: Mike Christie <[EMAIL PROTECTED]>

--- linux-2.6.25.2/drivers/scsi/hosts.c 2008-05-06 18:21:32.000000000 -0500
+++ linux-2.6.25.2.work/drivers/scsi/hosts.c    2008-06-13 19:43:15.000000000 -0500
@@ -455,9 +455,10 @@ struct Scsi_Host *scsi_host_lookup(unsig
        struct Scsi_Host *shost = ERR_PTR(-ENXIO);
 
        cdev = class_find_child(&shost_class, &hostnum, __scsi_host_match);
-       if (cdev)
+       if (cdev) {
                shost = scsi_host_get(class_to_shost(cdev));
-
+               class_device_put(cdev);
+       }
        return shost;
 }
 EXPORT_SYMBOL(scsi_host_lookup);
