On 09/02/2011 12:54 PM, iscsi developer man wrote:
> Hi Mike, I have a follow up question.
>
> Your notes were very insightful. I was able to recreate exactly what you
> said.
>
> I was slightly confused, because iostat counts the operation in its output.
>
>
> Here's another question: What about other failure scenarios? we talked
> about task_set_full and busy, but what if the connection is broken by
> the target side? Is that subject to the same 180 second (scsi commands
> allowed * timeout value ) timeout?

Yes and no and maybe :) It depends on other timers and the target.

If the target drops the conn and we get a notification (either the target
sends an iscsi async pdu or we get a tcp/ip socket state change
notification), then the initiator will set the session as down, block the
scsi devices accessed through that session, and then fail IO with a return
value that tells the scsi layer to requeue the IO until we tell it
otherwise. We will then try to relogin to the target for
replacement/recovery timeout seconds (see iscsid.conf and the README for
info on that timer). If we cannot login within that timeout, we will tell
the scsi layer to unblock the queues and we fail all the IO.

If the target does not send us an async pdu and we do not get a state
change notification, then chances are the initiator will send a nop (iscsi
ping) (see the noop timeout settings in iscsid.conf). That will time out
and then the initiator will handle that like above.

If the noop timeout settings are higher than the scsi command timeout (the
single run timeout value), then the scsi command would time out first. That
would start the scsi eh, which requests the initiator to do aborts and
resets. Those would fail, the initiator would drop the connection, and we
would wait for replacement/recovery timeout seconds like above.

See the README's section 8. It describes how many of the timers and the eh
work.

> Is it possible that on connection breaks, iscsi returns immediate failure?

Not really immediately. But if all the timers were set really low it could
be really quick.

> We have been experimenting, and looks like check conditions & sense data
> return immediate errors.

What do you mean? Return to layers above scsi or to the scsi/iscsi layers?
It depends on the sense code. Some are immediately returned upwards to the
block/FS/passthrough layers.

> Does the same happen with connection closes by the target?
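
To make the timer ordering in the reply above concrete, here is a rough
Python sketch that estimates how long IO stays queued before it is failed
when the target drops the connection and never comes back. It is only
back-of-the-envelope arithmetic, not what the initiator code actually runs.
Assumptions that are mine and not from the thread: the worst-case nop
detection time is taken as noop_out_interval + noop_out_timeout, the time
the scsi eh spends on aborts and resets is ignored, and the numbers in the
example are illustrative values rather than anything read from a real
iscsid.conf. The Timers class and both function names are made up for the
example. The (allowed + 1) * timeout window is the one quoted further down
in this thread.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Timers:
    """Illustrative timer values in seconds; not read from a live system."""
    noop_out_interval: int    # node.conn[0].timeo.noop_out_interval
    noop_out_timeout: int     # node.conn[0].timeo.noop_out_timeout
    replacement_timeout: int  # node.session.timeo.replacement_timeout
    scsi_cmd_timeout: int     # /sys/block/sdX/device/timeout
    allowed: int = 5          # retries the scsi layer allows for disk IO


def scsi_retry_window(t: Timers) -> int:
    """Retry window for BUSY/TASK_SET_FULL: (allowed + 1) * timeout."""
    return (t.allowed + 1) * t.scsi_cmd_timeout


def conn_drop_estimate(t: Timers, notified: bool) -> Tuple[str, int]:
    """Return (how the drop is detected, approx seconds until IO is failed).

    Assumes the relogin never succeeds, so the full replacement timeout
    elapses after detection. Time spent in scsi eh aborts/resets is ignored.
    """
    if notified:
        # Async pdu or socket state change: session is marked down at once.
        path, detect = "notification", 0
    else:
        # Assumed worst case: wait one interval to send the nop, then wait
        # for the nop itself to time out.
        nop_detect = t.noop_out_interval + t.noop_out_timeout
        if nop_detect <= t.scsi_cmd_timeout:
            path, detect = "nop timeout", nop_detect
        else:
            # The scsi command times out first and the scsi eh kicks in.
            path, detect = "scsi eh", t.scsi_cmd_timeout
    return path, detect + t.replacement_timeout


if __name__ == "__main__":
    t = Timers(noop_out_interval=5, noop_out_timeout=5,
               replacement_timeout=120, scsi_cmd_timeout=30)
    print("busy/task_set_full retry window:", scsi_retry_window(t))
    print("conn drop, notified:", conn_drop_estimate(t, notified=True))
    print("conn drop, silent:", conn_drop_estimate(t, notified=False))
```

With these example values the retry window is 180 seconds, while the
connection-drop paths come out shorter because they are bounded by the
replacement timeout instead. Setting the nop timers above the scsi command
timeout switches the silent case over to the scsi eh path.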
>
> thanks
>
> iscsi devel man
>
>
> On Thu, Aug 25, 2011 at 5:16 PM, iscsi developer man
> <iscsidevel...@gmail.com> wrote:
>
> > Thanks Mike,
> >
> > Its good to hear that SCSI BUSY and SCSI Task_Set_Full are both
> > handled correctly by the linux kernel.
> >
> > The bug must be in my code then!
> >
> > I'll look deeper at the wireshark traces.
> >
> > thanks
> >
> > iscsi devel man.
> >
> > On Thu, Aug 25, 2011 at 4:39 PM, Mike Christie <micha...@cs.wisc.edu>
> > wrote:
> >
> > > On 08/25/2011 05:23 PM, iscsi developer man wrote:
> > > > Thanks Mike,
> > > >
> > > > So what happens if we return the task set full or the busy
> > > > status forever? Does the host get an io error at a certain
> > > > timeout, does the host silently return back to the application
> > > > that the operation has completed successfully, or does it retry
> > > > indefinitely?
> > >
> > > The info below is for the current upstream kernel. It is probably
> > > correct from about 2.6.18 - 3.*.
> > >
> > > There is a max time value that the scsi layer will retry. It depends
> > > on the command type. The algorithm is:
> > >
> > > (scsi_cmnd->allowed + 1) * scsi_cmnd->timeout.
> > >
> > > The allowed value for disk IO is 5. The timeout depends on your
> > > distro. You can see it in /sys/block/sdX/device/timeout. The kernel
> > > sets it to 30 but some distro udev version set it to 60. Users can
> > > set it to whatever makes them happy so who knows.
> > >
> > > If the command has not completed in ((scsi_cmnd->allowed + 1) *
> > > scsi_cmnd->timeout) seconds then the command is failed. In
> > > /var/log/messages you would see:
> > >
> > > "timing out command, waited X seconds",
> > >
> > > And the upper layers would get some error. The error value depends
> > > on the IO type. The block layer, dm, file systems (kernel stuff)
> > > gets -EIO. If you were doing SG IO then you would see the scsi
> > > status value you set in the SG IO's error data.
>
> --
> You received this message because you are subscribed to the Google
> Groups "open-iscsi" group.
> To post to this group, send email to open-iscsi@googlegroups.com.
> To unsubscribe from this group, send email to
> open-iscsi+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/open-iscsi?hl=en.