Roland Dreier wrote:
> 1st scsi_try_host_reset() --> srp_host_reset() -->
> srp_reconnect_target() return SUCCESS. Then scsi_eh_try_stu() or
> scsi_eh_tur() is called right after
>
> scsi_eh_try_stu or scsi_eh_tur --> scsi_send_eh_cmnd() -->
> srp_queuecommand()
But after srp_reconnect_target(), both SRP's and the midlayer's queue
of pending commands should be completely empty, since I put

        list_for_each_entry(req, &target->req_queue, list) {
                req->scmnd->result = DID_RESET << 16;
                req->scmnd->scsi_done(req->scmnd);
                srp_unmap_data(req->scmnd, target, req);
        }

and

        INIT_LIST_HEAD(&target->free_reqs);
        INIT_LIST_HEAD(&target->req_queue);
        for (i = 0; i < SRP_SQ_SIZE; ++i)
                list_add_tail(&target->req_ring[i].list, &target->free_reqs);

in there. Why doesn't that work to kill all the pending commands?

That works fine and kills all the pending commands. However,
right after srp_host_reset() returns, the SCSI error handler
queues and sends the STU or TUR command, still within the
error-handling flow of scsi_eh_host_reset().

Please re-read scsi_eh_host_reset() and
scsi_try_host_reset() in scsi_error.c. Here is the logic:

scsi_eh_host_reset() --> scsi_try_host_reset() -->
srp_host_reset() --- all pending commands are killed.
srp_host_reset() returns SUCCESS, so scsi_try_host_reset()
returns SUCCESS.

static int scsi_eh_host_reset(struct list_head *work_q,
                              struct list_head *done_q)
{
        ...
        rtn = scsi_try_host_reset(scmd);
        if (rtn == SUCCESS) {
                list_for_each_entry_safe(scmd, next, work_q,
                                         eh_entry) {
                        if (!scsi_device_online(scmd->device) ||
                            (!scsi_eh_try_stu(scmd) &&
                             !scsi_eh_tur(scmd)) ||
                            !scsi_eh_tur(scmd))
        ...
}

Since rtn == SUCCESS, scsi_eh_host_reset() calls
scsi_eh_try_stu() or scsi_eh_tur(), which call
scsi_send_eh_cmnd() --> srp_queuecommand(). Now SRP's
request queue is no longer empty.

If the STU or TUR command times out, the SCSI midlayer
tries to abort it as well. Since we delay the cleanup in
srp_reset_device(), SRP's request queue is still not empty.
The STU or TUR command is then freed by the SCSI midlayer,
so the next srp_host_reset() will try to clean SRP's request
queue while it still holds an "old" request that references
the freed SCSI command.
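
To make the sequence above concrete, here is a rough userspace sketch of it. This is not the ib_srp code: every toy_* name, SQ_SIZE, TOY_DID_RESET and the drop_on_abort switch are made up for illustration, and the "freed" flag stands in for the midlayer actually releasing the command so that the sketch itself never touches freed memory.

```c
#include <assert.h>
#include <stddef.h>

#define SQ_SIZE       4     /* stand-in for SRP_SQ_SIZE; value is arbitrary */
#define TOY_DID_RESET 0x08  /* stand-in for DID_RESET */

/* Toy stand-in for struct scsi_cmnd. */
struct toy_cmnd {
	int freed;   /* set once the "midlayer" has released the command */
	int result;
};

/* Toy stand-in for one SRP request slot. */
struct toy_req {
	struct toy_cmnd *scmnd;  /* NULL when the slot is free */
};

struct toy_target {
	struct toy_req ring[SQ_SIZE];
};

/* Models srp_queuecommand(): park the command in the first free slot. */
static int toy_queuecommand(struct toy_target *t, struct toy_cmnd *c)
{
	for (size_t i = 0; i < SQ_SIZE; ++i) {
		if (!t->ring[i].scmnd) {
			t->ring[i].scmnd = c;
			return 0;
		}
	}
	return -1;
}

/*
 * Models the EH command timing out and being released by the midlayer.
 * With drop_on_abort set, the driver also forgets its request first
 * (a hypothetical behaviour, shown only for contrast).
 */
static void toy_eh_timeout(struct toy_target *t, struct toy_cmnd *c,
			   int drop_on_abort)
{
	if (drop_on_abort) {
		for (size_t i = 0; i < SQ_SIZE; ++i)
			if (t->ring[i].scmnd == c)
				t->ring[i].scmnd = NULL;
	}
	c->freed = 1;
}

/*
 * Models the cleanup loop run on host reset: complete every pending
 * request. Returns how many requests pointed at an already released
 * command; each of those would be a use-after-free in a real driver.
 */
static int toy_reset(struct toy_target *t)
{
	int stale = 0;

	for (size_t i = 0; i < SQ_SIZE; ++i) {
		struct toy_cmnd *c = t->ring[i].scmnd;

		if (!c)
			continue;
		if (c->freed)
			++stale;
		else
			c->result = TOY_DID_RESET << 16;
		t->ring[i].scmnd = NULL;
	}
	return stale;
}

/* Runs the whole sequence; returns the stale count seen by the 2nd reset. */
static int simulate(int drop_on_abort)
{
	struct toy_target t = { 0 };
	struct toy_cmnd failed = { 0 }, tur = { 0 };
	int rc;

	rc = toy_queuecommand(&t, &failed);  /* original command is pending */
	assert(rc == 0);
	toy_reset(&t);                       /* 1st host reset kills it */
	rc = toy_queuecommand(&t, &tur);     /* TUR sent right after SUCCESS */
	assert(rc == 0);
	(void)rc;
	toy_eh_timeout(&t, &tur, drop_on_abort);
	return toy_reset(&t);                /* 2nd host reset */
}
```

In this model simulate(0) reports one stale request at the second reset, which corresponds to the "old" request described above, while simulate(1) reports none. The second case only shows what changes when the request is dropped before the command is released; it is not meant as the actual fix.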

If you still have questions, I can call you, or you can give
me a call at (408) 916-0006
Vu