Re: [PATCH 2/2] libiscsi: Check for empty cmdqueue in iscsi_eh_cmd_timed_out
On 06/24/2009 03:52 PM, Mike Christie wrote:
> On 06/24/2009 09:40 AM, Hannes Reinecke wrote:
>> During heavy I/O the scsi cmd might be stuck in the cmdqueue
>> and hasn't even been sent at the time the command timeout
>> strikes. So we should be resetting the command timer here
>> and not aborting the command.
>>
>> Signed-off-by: Hannes Reinecke
>> ---
>>  drivers/scsi/libiscsi.c |    8 ++++++++
>>  1 files changed, 8 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
>> index 41bb177..e5b79cf 100644
>> --- a/drivers/scsi/libiscsi.c
>> +++ b/drivers/scsi/libiscsi.c
>> @@ -1795,6 +1795,14 @@ static enum blk_eh_timer_return
>> iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
>>  		goto done;
>>  	}
>>
>> +	if (!list_empty(&conn->cmdqueue)) {
>
> I think you want to check for the task->state == ISCSI_TASK_PENDING. If
> we just check if there is any IO on the cmdqueue, could it be that one IO
> is really stuck and new IO is getting added, and we just happen to have
> bad luck and do the check above when new IO is getting added?
>
> I am still not sure how we decide when to give the command to the
> scsi eh layer. If a task is stuck in ISCSI_TASK_PENDING for 5 minutes,
> can we say we give up and let the scsi layer have it eventually, or did
> you really want to continue asking for more time?
>
> I added this code
> http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff;h=d355e57d58193b89283b0c8153649f0427b0bdad
> to give a task more time if we have not got/sent data for it during the
> scsi cmd timeout period. Then if on the second run it still has
> not completed a pdu, we will let the scsi eh run. This still has
> problems. As you saw, when the queue depth is high and/or the link or
> device is slow, we can easily queue enough IO that a command does
> not get run for a couple of scsi cmd timeout periods. I was also working on
> decreasing the queue depth and host->can_queue, but am not done (I have
> just consolidated the QUEUE_FULL ramp down code so far, as you saw on
> linux-scsi).
>

Oh yeah, for the ramp down code I was going to decrease the scsi device
queue depth and/or the host->can_queue when we are getting scsi command
timeouts but know other commands are executing. But even with that, I am
still not sure it will give us what we want.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups "open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~--~~~~--~~--~--~---
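The ramp-down idea in the message above (lower the queue depth when commands time out while other commands are known to be executing) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct queue_model`, `ramp_down_on_timeout()`, the one-step decrement, and the `min_depth` floor are all hypothetical names and choices made for illustration. None of them come from libiscsi or from the ramp down code mentioned in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device's queue limits; not a kernel struct. */
struct queue_model {
	int depth;      /* current queue depth */
	int min_depth;  /* floor below which the depth is never lowered */
};

/*
 * Called when a command times out. If other commands are known to be
 * executing, the timeout is treated as a sign of over-queueing and the
 * depth is lowered by one (the step size is an arbitrary assumption).
 * Returns the depth after the adjustment.
 */
static int ramp_down_on_timeout(struct queue_model *q, bool others_executing)
{
	assert(q != NULL);

	if (others_executing && q->depth > q->min_depth)
		q->depth--;

	return q->depth;
}
```

Starting from a depth of 32, a timeout seen while other commands are executing would lower the depth to 31 in this model, while a timeout with no other activity leaves the depth unchanged so that the stall can be examined by other means.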
Re: [PATCH 2/2] libiscsi: Check for empty cmdqueue in iscsi_eh_cmd_timed_out
On 06/24/2009 09:40 AM, Hannes Reinecke wrote:
> During heavy I/O the scsi cmd might be stuck in the cmdqueue
> and hasn't even been sent at the time the command timeout
> strikes. So we should be resetting the command timer here
> and not aborting the command.
>
> Signed-off-by: Hannes Reinecke
> ---
>  drivers/scsi/libiscsi.c |    8 ++++++++
>  1 files changed, 8 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index 41bb177..e5b79cf 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -1795,6 +1795,14 @@ static enum blk_eh_timer_return
> iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
>  		goto done;
>  	}
>
> +	if (!list_empty(&conn->cmdqueue)) {

I think you want to check for the task->state == ISCSI_TASK_PENDING. If
we just check if there is any IO on the cmdqueue, could it be that one IO
is really stuck and new IO is getting added, and we just happen to have
bad luck and do the check above when new IO is getting added?

I am still not sure how we decide when to give the command to the
scsi eh layer. If a task is stuck in ISCSI_TASK_PENDING for 5 minutes,
can we say we give up and let the scsi layer have it eventually, or did
you really want to continue asking for more time?

I added this code
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff;h=d355e57d58193b89283b0c8153649f0427b0bdad
to give a task more time if we have not got/sent data for it during the
scsi cmd timeout period. Then if on the second run it still has
not completed a pdu, we will let the scsi eh run. This still has
problems. As you saw, when the queue depth is high and/or the link or
device is slow, we can easily queue enough IO that a command does
not get run for a couple of scsi cmd timeout periods. I was also working on
decreasing the queue depth and host->can_queue, but am not done (I have
just consolidated the QUEUE_FULL ramp down code so far, as you saw on
linux-scsi).

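The two-period behaviour described in the reply above (grant more time once, then let the scsi eh run if the task still has not moved a pdu) can be modelled in userspace as follows. This is one reading of that description, not the code from the linked commit. `struct task_model`, `on_cmd_timeout()`, and the `stalled` counter are hypothetical; the field names `last_xfer` and `last_timeout` are borrowed from the debug message in the quoted patch. The model also assumes that a task which did transfer data since the previous timeout gets its timer reset, and it compares ticks directly, where kernel code would need a wrap-safe comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Model verdicts; the kernel handler returns enum blk_eh_timer_return. */
enum timeout_verdict { VERDICT_RESET_TIMER, VERDICT_RUN_EH };

/* Hypothetical per-task bookkeeping for this sketch. */
struct task_model {
	unsigned long last_xfer;     /* tick of the last pdu sent or received */
	unsigned long last_timeout;  /* tick of the previous timeout, 0 if none */
	unsigned int stalled;        /* consecutive timeouts without a pdu */
};

static enum timeout_verdict on_cmd_timeout(struct task_model *t,
					   unsigned long now)
{
	enum timeout_verdict rc = VERDICT_RESET_TIMER;

	assert(t != NULL);

	if (t->last_xfer > t->last_timeout) {
		/* A pdu moved since the previous timeout: progress. */
		t->stalled = 0;
	} else if (++t->stalled >= 2) {
		/* Second period in a row with no pdu: stop asking for time. */
		rc = VERDICT_RUN_EH;
	}

	t->last_timeout = now;
	return rc;
}
```

The weakness raised in the thread shows up directly in this model: a command that sits unsent behind a deep queue for two timeout periods reaches `VERDICT_RUN_EH` even though the connection itself is healthy.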
> +		ISCSI_DBG_EH(session, "cmdqueue busy, reset timeout. "
> +			     "Last data recv at %lu. Last timeout was at "
> +			     "%lu\n", task->last_xfer, task->last_timeout);
> +		rc = BLK_EH_RESET_TIMER;
> +		goto done;
> +	}
> +
>  	if (!conn->recv_timeout && !conn->ping_timeout)
>  		goto done;
>  	/*
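The review comment on this hunk suggests testing the state of the timed-out task itself instead of whether the connection's cmdqueue is non-empty, and asks whether a task should eventually be handed to the scsi eh after a long wait. A minimal userspace sketch of that per-task check, combined with an upper bound, might look like the following. Everything here is an assumption for illustration: `struct task_model`, `check_pending()`, the `MODEL_TASK_*` values (standing in for the real `ISCSI_TASK_PENDING` state named in the thread), and the 300-second limit, which is taken from the "5 minutes" example in the review and was posed there as a question, not a decision.

```c
#include <assert.h>
#include <stddef.h>

enum timeout_verdict {
	VERDICT_RESET_TIMER,   /* task not sent yet: ask for more time */
	VERDICT_RUN_EH,        /* waited too long: let error handling take it */
	VERDICT_OTHER_CHECKS   /* task was sent: continue with the other checks */
};

/* Hypothetical stand-ins for the task states; not the libiscsi enum. */
enum task_state_model { MODEL_TASK_PENDING, MODEL_TASK_RUNNING };

struct task_model {
	enum task_state_model state;
	unsigned long queued_at;   /* second at which the task was queued */
};

/* Assumed upper bound on time spent unsent, in seconds. */
#define MAX_PENDING_SECS 300UL

static enum timeout_verdict check_pending(const struct task_model *t,
					  unsigned long now)
{
	assert(t != NULL);

	/* Look only at this task, not at whether the queue holds any IO. */
	if (t->state != MODEL_TASK_PENDING)
		return VERDICT_OTHER_CHECKS;

	if (now - t->queued_at >= MAX_PENDING_SECS)
		return VERDICT_RUN_EH;

	return VERDICT_RESET_TIMER;
}
```

Because the decision depends only on the task that timed out, new IO arriving on the queue cannot mask a task that is stuck after being sent, which is the case the review comment is concerned about.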