On 02/01/2010 11:52 AM, Mike Christie wrote:
On 02/01/2010 05:14 AM, Erez Zilber wrote:
When iscsi_eh_cmd_timed_out gets called, we can ask scsi-ml to give us
more time if the cmd is making progress (i.e. if there was some data
transfer since the last timeout).
The problem is that task->last_xfer and task->last_timeout are set to
the value of 'jiffies' when allocating the task. If the target machine
is already dead when we send the cmd, no progress will be made. Still,
when iscsi_eh_cmd_timed_out is called, it will think that data
was sent since the last timeout and reset the timer (and waste time).
In order to solve that, iscsi_eh_cmd_timed_out should also check if
there was any data transfer after the task was allocated.
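A minimal user-space sketch of the extra check proposed above. It is not the libiscsi code: struct task_times, alloc_time, task_init and cmd_timed_out are hypothetical names, plain tick counters stand in for jiffies, and the assumption is that the existing check resets the timer whenever last_xfer is not older than last_timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-task timestamps; alloc_time is an
 * assumed extra field, not something the thread says exists. */
struct task_times {
    unsigned long alloc_time;   /* tick when the task was allocated */
    unsigned long last_xfer;    /* tick of the most recent data transfer */
    unsigned long last_timeout; /* tick when the timeout handler last ran */
};

enum eh_verdict { EH_NOT_HANDLED, EH_RESET_TIMER };

/* Wraparound-safe "a is strictly later than b" for tick counters. */
static bool tick_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* Mirrors the behaviour described above: all stamps start at "now". */
static void task_init(struct task_times *t, unsigned long now)
{
    assert(t != NULL);
    t->alloc_time = now;
    t->last_xfer = now;
    t->last_timeout = now;
}

/* Ask for more time only if data moved after allocation AND no earlier
 * than the previous timeout; otherwise let error handling proceed. */
static enum eh_verdict cmd_timed_out(struct task_times *t, unsigned long now)
{
    assert(t != NULL);
    bool since_alloc = tick_after(t->last_xfer, t->alloc_time);
    bool since_timeout = !tick_after(t->last_timeout, t->last_xfer);

    t->last_timeout = now;
    return (since_alloc && since_timeout) ? EH_RESET_TIMER : EH_NOT_HANDLED;
}
```

With this sketch, a task that never moves data after task_init() is not granted more time on its first timeout, while a task that transferred data in between gets one timer reset.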
I agree it is a problem with the code.
The problem is that the check also handled the case where we are so
backed up that we cannot even send a cmd/data within the cmd timeout.
For that case, the check was giving it an extra cmd timeout's worth of seconds to
get it off. That code is not really good though. It should probably just
loop over all the cmds there and see if any cmds have made progress. If
so, give the cmd more time; if not, fail it.
I was not sure though whether I should check if any cmds to the target made
progress or if any cmds to the same disk did. It could be that just one disk
went bad, so we might want to check per disk. However, this could be the
first IO to the disk and it just got stuck behind a bunch of other IO to
other disks, so in that case we want to check per target.
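The per-disk versus per-target choice can be sketched as one scan with a selectable scope. This is only an illustration in user-space C: struct cmd_info, enum scope and any_cmd_progressed are hypothetical names, and a flat array stands in for whatever list of outstanding cmds the driver really keeps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of one outstanding cmd. */
struct cmd_info {
    unsigned int lun;        /* disk (LUN) the cmd was sent to */
    unsigned long last_xfer; /* tick of its most recent data transfer */
};

enum scope { SCOPE_LUN, SCOPE_TARGET };

/* Wraparound-safe "a is strictly later than b" for tick counters. */
static bool tick_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* Returns true if any outstanding cmd in the chosen scope moved data
 * after `since`. SCOPE_LUN only considers cmds to the same disk as the
 * timed-out cmd; SCOPE_TARGET considers every cmd to the target. */
static bool any_cmd_progressed(const struct cmd_info *cmds, size_t n,
                               unsigned int lun, enum scope scope,
                               unsigned long since)
{
    assert(cmds != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (scope == SCOPE_LUN && cmds[i].lun != lun)
            continue;
        if (tick_after(cmds[i].last_xfer, since))
            return true;
    }
    return false;
}
```

A caller would give the timed-out cmd more time when the scan returns true and fail it otherwise. The two scopes disagree exactly in the case described above: a cmd to a disk with no recent progress of its own, sitting behind IO to other disks that is still moving.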
Give me until tomorrow. I think I can cook up a patch. Before, when
deciding when to check for dev vs target, I was mixed up with some
reordering stuff, but I think I have a patch that should work for both
of us.