Re: scsi: Always retry internal target error
> On 10/17/2017 09:11 AM, Hannes Reinecke wrote: >> EMC Symmetrix is returning 'internal target error' for a variety >> of conditions, most of which will be transient. >> So we should always retry it, even with failfast set. >> Otherwise we'd get spurious path flaps with multipath. >> >> Signed-off-by: Hannes Reinecke >> --- >> drivers/scsi/scsi_error.c | 13 + >> 1 file changed, 13 insertions(+) >> >> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c >> index 5086489..5722c2e 100644 >> --- a/drivers/scsi/scsi_error.c >> +++ b/drivers/scsi/scsi_error.c >> @@ -497,6 +497,16 @@ int scsi_check_sense(struct scsi_cmnd *scmd) >> if (sshdr.asc == 0x10) /* DIF */ >> return SUCCESS; >> >> +if (!strncmp(scmd->device->vendor, "EMC", 3) && >> +!strncmp(scmd->device->model, "SYMMETRIX", 9) && >> +(sshdr.asc == 0x44) && (sshdr.ascq == 0x0)) { >> +/* >> + * EMC Symmetrix returns 'Internal target failure' >> + * for a variety of internal issues, all of which >> + * can be recovered by retry. >> + */ >> +return ADD_TO_MLQUEUE; >> +} >> return NEEDS_RETRY; >> case NOT_READY: >> case UNIT_ATTENTION: >> @@ -1846,6 +1856,9 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd) >> rtn = scsi_check_sense(scmd); >> if (rtn == NEEDS_RETRY) >> goto maybe_retry; >> +else if (rtn == ADD_TO_MLQUEUE) >> +/* Always enforce a retry for ADD_TO_MLQUEUE */ >> +rtn = NEEDS_RETRY; >> /* if rtn == FAILED, we have no sense information; >> * returning FAILED will wake the error handler thread >> * to collect the sense and redo the decide >> On 10/18/2017 07:27 AM, Hannes Reinecke wrote: >> Yeah, you are right. >> Will be adding a blacklist entry for this. Is there a new patch ongoing?
Re: [PATCH] scsi: Always retry internal target error
On 10/18/2017 08:34 AM, Christoph Hellwig wrote: > On Wed, Oct 18, 2017 at 08:15:46AM +0200, Hannes Reinecke wrote: >>> It's decidedly awful to have vendor/model-specific triggers in >>> scsi_error. >>> >>> What are the drawbacks of just always refiring on AC/0x44/ITF? >>> >> Hmm. 'Internal target failure' is not very descriptive, so it could mean >> anything. Hence the rather awkward approach. >> But I just checked with the qemu code, and that returns 0x44/0x00 only >> if some (internal) call returned with -ENOMEM. >> So I guess we're safe to always retry here. > > And even if not we should add a quirk so that we can just check a > bit instead of doing two string compares in the I/O completion path.. > Yeah, you are right. Will be adding a blacklist entry for this. Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
Re: [PATCH] scsi: Always retry internal target error
On Wed, Oct 18, 2017 at 08:15:46AM +0200, Hannes Reinecke wrote: > > It's decidedly awful to have vendor/model-specific triggers in > > scsi_error. > > > > What are the drawbacks of just always refiring on AC/0x44/ITF? > > > Hmm. 'Internal target failure' is not very descriptive, so it could mean > anything. Hence the rather awkward approach. > But I just checked with the qemu code, and that returns 0x44/0x00 only > if some (internal) call returned with -ENOMEM. > So I guess we're safe to always retry here. And even if not we should add a quirk so that we can just check a bit instead of doing two string compares in the I/O completion path..
Re: [PATCH] scsi: Always retry internal target error
On 10/18/2017 04:58 AM, Martin K. Petersen wrote: > > Hannes, > >> +if (!strncmp(scmd->device->vendor, "EMC", 3) && >> +!strncmp(scmd->device->model, "SYMMETRIX", 9) && >> +(sshdr.asc == 0x44) && (sshdr.ascq == 0x0)) { >> +/* >> + * EMC Symmetrix returns 'Internal target failure' >> + * for a variety of internal issues, all of which >> + * can be recovered by retry. >> + */ >> +return ADD_TO_MLQUEUE; >> +} > > It's decidedly awful to have vendor/model-specific triggers in > scsi_error. > > What are the drawbacks of just always refiring on AC/0x44/ITF? > Hmm. 'Internal target failure' is not very descriptive, so it could mean anything. Hence the rather awkward approach. But I just checked with the qemu code, and that returns 0x44/0x00 only if some (internal) call returned with -ENOMEM. So I guess we're safe to always retry here. Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
Re: [PATCH] scsi: Always retry internal target error
Hannes, > + if (!strncmp(scmd->device->vendor, "EMC", 3) && > + !strncmp(scmd->device->model, "SYMMETRIX", 9) && > + (sshdr.asc == 0x44) && (sshdr.ascq == 0x0)) { > + /* > + * EMC Symmetrix returns 'Internal target failure' > + * for a variety of internal issues, all of which > + * can be recovered by retry. > + */ > + return ADD_TO_MLQUEUE; > + } It's decidedly awful to have vendor/model-specific triggers in scsi_error. What are the drawbacks of just always refiring on AC/0x44/ITF? -- Martin K. Petersen Oracle Linux Engineering
[PATCH] scsi: Always retry internal target error
EMC Symmetrix is returning 'internal target error' for a variety of conditions, most of which will be transient. So we should always retry it, even with failfast set. Otherwise we'd get spurious path flaps with multipath. Signed-off-by: Hannes Reinecke --- drivers/scsi/scsi_error.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 5086489..5722c2e 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -497,6 +497,16 @@ int scsi_check_sense(struct scsi_cmnd *scmd) if (sshdr.asc == 0x10) /* DIF */ return SUCCESS; + if (!strncmp(scmd->device->vendor, "EMC", 3) && + !strncmp(scmd->device->model, "SYMMETRIX", 9) && + (sshdr.asc == 0x44) && (sshdr.ascq == 0x0)) { + /* +* EMC Symmetrix returns 'Internal target failure' +* for a variety of internal issues, all of which +* can be recovered by retry. +*/ + return ADD_TO_MLQUEUE; + } return NEEDS_RETRY; case NOT_READY: case UNIT_ATTENTION: @@ -1846,6 +1856,9 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd) rtn = scsi_check_sense(scmd); if (rtn == NEEDS_RETRY) goto maybe_retry; + else if (rtn == ADD_TO_MLQUEUE) + /* Always enforce a retry for ADD_TO_MLQUEUE */ + rtn = NEEDS_RETRY; /* if rtn == FAILED, we have no sense information; * returning FAILED will wake the error handler thread * to collect the sense and redo the decide -- 1.8.5.6