On 09/15/2011 06:27 AM, Aastha Mehta wrote:
> Hi,
>
> I have a 2GB LUN exported using iscsitarget. I am seeing ISCSI_ERR_BAD_ITT
> (error 1010) on the initiator. After sometime, i get messages saying that
> the device is offlined, it could not recover from error. After some more
> time, I get messages saying I/O rejected to offlined device. Attached are
> two logs of open-iscsi. There is some call trace dumped in one of the logs,
> while such a thing did not appear the next time.
>
When a SCSI command times out (the timeout is set in /sys/block/sdX/device/timeout), the SCSI layer starts its error handler. The SCSI midlayer EH starts by trying to abort the timed-out commands. If that fails it does a LUN reset. If that fails it does a target reset, and if that fails we just drop the session and relogin to the target. If the SCSI EH fails at every step, it sets the SCSI device state to offline. If you then send IO to an offline device, it is failed with the "rejecting I/O to offline device" message like you saw.

We know from your logs that the SCSI EH is running, because we see devices get offlined and we see those offline errors.

What can sometimes happen is that while we are sending aborts and resets, the target still sends responses for commands that should have been cleaned up by the abort or reset. The initiator then does not know what to do with them and can spit out the bad ITT error. At that point the initiator is basically reading in the PDU and checking its ITT (initiator task tag). It cannot look the ITT up properly because it thinks the command/task should not be running anymore, so it spits out that error.
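
To check from userspace whether a device has been through this, you can read the command timeout and the device state from sysfs. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the device exposes the usual `state` attribute next to the `timeout` file mentioned above (that `state` path is my assumption, it does not come from your logs). The `sysfs_root` parameter is there only so the helper can be pointed at a different directory for testing.

```python
from pathlib import Path


def scsi_device_status(dev, sysfs_root="/sys/block"):
    """Return the command timeout (in seconds) and the state of a SCSI disk.

    `dev` is a block device name such as "sdb". Assumes the sysfs layout
    <sysfs_root>/<dev>/device/{timeout,state}.
    """
    device_dir = Path(sysfs_root) / dev / "device"
    # Seconds a command may run before the SCSI error handler kicks in.
    timeout = int((device_dir / "timeout").read_text().strip())
    # Typically "running" for a healthy device, "offline" after the EH gave up.
    state = (device_dir / "state").read_text().strip()
    return {"timeout": timeout, "state": state}


def is_offline(dev, sysfs_root="/sys/block"):
    """True if the SCSI layer has marked the device offline."""
    return scsi_device_status(dev, sysfs_root)["state"] == "offline"
```

For example, `scsi_device_status("sdb")` on the initiator would show whether the disk behind the iSCSI session is still running or has already been offlined, and what timeout the commands were given before the error handler started.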
