CC'ing the fibre channel experts because they might have the same issue.

On 10/11/24 3:18 AM, Xiang Zhang wrote:
> Initiator need to recover session and reconnect to target, after calling
> stop_conn. And target will rebuild new session info, and mark
> ASC_POWERON_RESET ua sense for scsi devices belong to the target(device
> reset). After recovery, first scsi command(scmd) request to target will get
> ASC_POWERON_RESET(ua sense) + SAM_STAT_CHECK_CONDITION(status) in response.
> According to scsi code: "scsi_done --> scsi_complete -->
> scsi_decide_disposition --> scsi_check_sense", if expecting_cc_ua = 0, scmd
> response with ASC_POWERON_RESET(ua sense) will ignore "cmd->retries <=
> cmd->allowed", fail directly. It will cause SCSI return io_error to upper
> layer without retry.

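To make the path described above concrete, here is a rough userspace model of how the UNIT_ATTENTION branch of scsi_check_sense treats expecting_cc_ua. It is only a sketch: struct model_sdev, model_check_sense, model_transport_disrupted, model_example and the DISP_* values are made-up names for illustration, the real function handles many more sense keys and ASC/ASCQ pairs, and the real dispositions are SUCCESS and NEEDS_RETRY.

```c
#include <assert.h>
#include <stdbool.h>

/* Standard SPC values; the macro names are local to this sketch. */
#define UNIT_ATTENTION     0x06 /* sense key */
#define ASC_POWERON_RESET  0x29 /* power on, reset, or bus device reset occurred */
#define ASC_MEDIA_CHANGED  0x28 /* not ready to ready change, medium may have changed */

/* Simplified dispositions (hypothetical names, not the kernel's values). */
enum disposition { DISP_SUCCESS, DISP_NEEDS_RETRY };

/* Hypothetical stand-in for the single struct scsi_device field modeled here. */
struct model_sdev {
	bool expecting_cc_ua;
};

/* Models only the expecting_cc_ua part of the UNIT_ATTENTION handling. */
static enum disposition model_check_sense(struct model_sdev *sdev,
					  unsigned char sense_key,
					  unsigned char asc,
					  unsigned char ascq)
{
	if (sense_key != UNIT_ATTENTION)
		return DISP_SUCCESS;	/* other sense keys are outside this model */

	if (sdev->expecting_cc_ua) {
		/* A media-change UA (28h/00h) is never squashed. */
		if (asc != ASC_MEDIA_CHANGED || ascq != 0x00) {
			sdev->expecting_cc_ua = false;	/* the flag is one-shot */
			return DISP_NEEDS_RETRY;
		}
	}

	/* Not expected: the UA is handed up for the completion path to judge. */
	return DISP_SUCCESS;
}

/* Models what the quoted patch adds on the DID_TRANSPORT_DISRUPTED path. */
static void model_transport_disrupted(struct model_sdev *sdev)
{
	sdev->expecting_cc_ua = true;
}

/* Walks through the sequence from the patch description. */
static void model_example(void)
{
	struct model_sdev sdev = { .expecting_cc_ua = false };

	/* Without the flag, the power-on/reset UA is not retried here. */
	assert(model_check_sense(&sdev, UNIT_ATTENTION, ASC_POWERON_RESET, 0x00)
	       == DISP_SUCCESS);

	/* With the flag set during recovery, the first UA becomes a retry. */
	model_transport_disrupted(&sdev);
	assert(model_check_sense(&sdev, UNIT_ATTENTION, ASC_POWERON_RESET, 0x00)
	       == DISP_NEEDS_RETRY);
	assert(!sdev.expecting_cc_ua);
}
```

In this model the flag only absorbs one UA per device, so a second UA after the same recovery would again be handed up rather than retried at this level.
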
Just want to make sure I understand the problem. Does the failure only happen
with tape or passthrough, or if removable is set? For commands coming from sd,
scsi_io_completion will end up calling scsi_io_completion_action, see the
UNIT_ATTENTION, and retry.

I'm not saying we shouldn't do a fix like you did below. Just want to make
sure I understand the case you describe above.

> If we set expecting_cc_ua=1 in fail_scsi_tasks, SISC will retry the scmd
> which is response with ASC_POWERON_RESET. The scmd second request to target
> can successful, because target will clear ASC_POWERON_RESET in device pending
> ua_sense_list after first scmd request.

What does "SISC" stand for?

>
> Signed-off-by: Xiang Zhang <[email protected]>
> ---
>  drivers/scsi/libiscsi.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index 0fda8905eabd..317e57be32b3 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -629,9 +629,10 @@ static void __fail_scsi_task(struct iscsi_task *task, int err)
>  		conn->session->queued_cmdsn--;
>  		/* it was never sent so just complete like normal */
>  		state = ISCSI_TASK_COMPLETED;
> -	} else if (err == DID_TRANSPORT_DISRUPTED)
> +	} else if (err == DID_TRANSPORT_DISRUPTED) {
>  		state = ISCSI_TASK_ABRT_SESS_RECOV;
> -	else
> +		sc->device->expecting_cc_ua = 1;

The failure case can happen with other transports like fibre channel, right?
If it's common, I think we want this in the core scsi code.

For iscsi, we want to set expecting_cc_ua whenever we call
scsi_block_targets() or whenever we return DID_TRANSPORT_DISRUPTED or
DID_TRANSPORT_FAILFAST. FC developers, I'm not sure if that's the case for you.
For example, if your driver called fc_remote_port_delete -> scsi_block_targets
but then the issue is resolved quickly, like for a quick cable pull, and you
called fc_remote_port_add, could there be cases where you did not get an I_T
nexus loss/reset type of issue? Or is it the case that any time an FC driver
calls fc_remote_port_delete, you will expect a UA after calling
fc_remote_port_add again?
