Hi Mike!

On Sat, 2008-06-28 at 15:33 -0500, Mike Christie wrote:
> What version of open-iscsi and kernel are you using? And are you using 
> the kernel modules with open-iscsi or the ones that come with the kernel?
> 

Whoops, forgot to include that tidbit:

open-iscsi: 2.0.730-1etch1

kernel: I am using v2.6.22.19-kdb, and Jerome is using
v2.6.22-4-vserver-amd64

> Nicholas A. Bellinger wrote:
> >>
> >> The problem is that the failure of the outstanding I/Os does not seem to
> >> be occurring in all cases.  In particular, I believe an iscsiadm --logout
> >> is getting issued, and said logout request is failing/timing out because
> >> DRBD_TARGET has been released.  It is at this point that umount for the
> >> ext3 mount and/or sync hangs indefinitely.  When the problem occurs, it
> >> looks like this from the kernel ring buffer:
> >>
> >> iscsi_deallocate_extra_thread_sets:285: ***OPS*** Stopped 1 thread set(s) 
> >> (2 total threads).
> >> iscsi_deallocate_extra_thread_sets:285: ***OPS*** Stopped 2 thread set(s) 
> >> (4 total threads).
> >> session10: iscsi: session recovery timed out after 120 secs
> >> sd 51:0:0:0: scsi: Device offlined - not ready after error recovery
> 
> If you see this then any and all IO that was sent to the device and any 
> new IO should be failed to the FS and block layer like below. There is a 
> bug in some kernels though, where if you were to run an iscsiadm logout 
> command it can hang and lead to weird problems, because the scsi layer is 
> broken. If you use open-iscsi 869.2's kernel modules or the iscsi 
> modules in 2.6.25.6 or newer then this is fixed.

Ok boss, I will upgrade the tools and jump from v2.6.22.19-kdb to
v2.6.25.9-kdb on my VHACs nodes.
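
For anyone else checking their nodes, here is a rough sketch of that version
comparison. It assumes GNU sort with -V is available; kernel_at_least is a
hypothetical helper name, and 2.6.25.6 is simply the version Mike quotes
above. It says nothing about distro kernels with backported fixes, and it
does not apply if you load the out-of-tree 869.2 modules instead.

```shell
# kernel_at_least MIN [CUR]: succeed if CUR (default: running kernel) >= MIN.
# Hypothetical helper; relies on GNU sort -V for version ordering.
kernel_at_least() {
    min="$1"
    cur="${2:-$(uname -r)}"
    cur="${cur%%-*}"    # drop a local suffix such as -kdb
    lowest=$(printf '%s\n%s\n' "$min" "$cur" | sort -V | head -n 1)
    [ "$lowest" = "$min" ]
}

if kernel_at_least 2.6.25.6; then
    echo "in-kernel iscsi modules should carry the logout fix"
else
    echo "kernel older than 2.6.25.6, use the 869.2 modules or upgrade"
fi
```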

>  Not sure if that is 
> what you are seeing, because we see IO failed upwards here. Also once we 
> see "Device offlined", the scsi layer is going to fail the IO when it 
> hits the scsi prep functions and it never even reaches us. If there is 
> IO stuck in the driver you could do
> cat /sys/class/scsi_host/hostX/host_busy
>
> to check (that file prints the number of commands the scsi layer has 
> sent the driver and the driver has not yet returned, i.e. how many 
> commands are outstanding).
> 
> 

<nod>, got it.
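
A minimal sketch of how I would check every host at once rather than one
hostX at a time. The show_host_busy name and the optional sysfs root
argument are my own additions (the argument only exists so the loop can be
pointed at a test directory); the host_busy file is the one Mike names.

```shell
# Print "<host> <host_busy>" for every SCSI host under a sysfs root.
# The root argument is hypothetical convenience; it defaults to /sys.
show_host_busy() {
    root="${1:-/sys}"
    for f in "$root"/class/scsi_host/host*/host_busy; do
        # Skip the unexpanded glob when no hosts are present.
        [ -r "$f" ] || continue
        host=$(basename "$(dirname "$f")")
        printf '%s %s\n' "$host" "$(cat "$f")"
    done
}

show_host_busy
```

A non-zero count that never drains after "Device offlined" would point at IO
stuck in the driver.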

> >> sd 51:0:0:0: [sdg] Result: hostbyte=DID_BUS_BUSY 
> >> driverbyte=DRIVER_OK,SUGGEST_OK
> >> end_request: I/O error, dev sdg, sector 0
> >> Buffer I/O error on device sdg, logical block 0
> >> lost page write due to I/O error on sdg
> 
> 
> 
> 
> >>
> >> I should mention that we are not doing any I/O to said iSCSI LUN via
> >> Open/iSCSI other than the filesystem metadata for ext3 umount and
> >> SYNCHRONIZE_CACHE CDB during struct scsi_device deregistration.  From
> >> experience with Core-iSCSI, I know the logout path is tricky wrt
> >> exceptions (I spent months on it to handle all cases with Immediate and
> >> Non Immediate Logout, as well as doing logouts on the fly from the same
> >> connection in MC/S and different connections in MC/S :-)
> >>
> >> So the question is:
> >>
> >> I) When an ISCSI_INIT_LOGOUT_REQ is not returned with an
> >> ISCSI_TARGET_LOGOUT_RSP and replacement_timeout fires, are all
> >> outstanding I/Os for that particular session being completed with a
> >> non-recoverable exception..?  Has anyone ever run into this case and/or
> >> tested it..?
> 
> If the connection is down when you run iscsiadm logout, we will not send 
> a logout and the replacement_timeout does not come into play. We just 
> fast fail the connection and just clean up the commands and kernel 
> resources and iscsiadm returns (yeah pretty bad I know - it is on the TODO).
> 
> If the connection is up when you run iscsiadm logout, and while the 
> logout is floating around the connection drops, we are again lazy and 
> just fail and cleanup and return right away. The replacement_timeout 
> does not come into play for this and we just fail right away.
> 
> 
> If you run 869.2 from open-iscsi.org and build with
> 
> make DEBUG_SCSI=1 DEBUG_TCP=1
> make DEBUG_SCSI=1 DEBUG_TCP=1 install
> and send all the log output, I can tell you better what is going on.
> 

Ok, I will get into this at the office tomorrow afternoon and let you
know what I find.  Thanks for the info.
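
When I collect the output, I will probably trim it down with something like
the sketch below. filter_iscsi_log is a hypothetical helper name; the
patterns are taken from the ring buffer lines quoted above and will not
catch the extra DEBUG_SCSI/DEBUG_TCP output, so the full log still gets
sent along.

```shell
# Keep only the recovery-timeout, offline, and I/O error lines.
# Reads the files given as arguments, or stdin when none are given.
filter_iscsi_log() {
    grep -E 'session recovery timed out|Device offlined|I/O error' "$@"
}
```

Usage would be along the lines of "dmesg | filter_iscsi_log".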

Many thanks for your most valuable of time,

--nab


