Re: [PATCH] scsi: handle ABORTED_COMMAND on Fujitsu ETERNUS

2018-01-18 Thread James Bottomley
On Thu, 2018-01-18 at 11:17 +0100, Hannes Reinecke wrote:
> On 01/18/2018 03:43 AM, Martin K. Petersen wrote:
> > 
> > 
> > Martin,
> > 
> > > 
> > > You'd like to spend a precious BLIST bit for this single device
> > > which uses vendor-specific ASC/Q?
> > 
> > I really don't want string comparisons in the regular code paths.
> > Also not a fan of vendor-specific ASCs. But if you must use them,
> > please add a flag and trigger on that.
> > 
> > Since this is a bit of an unusual check condition combo, we could
> > entertain whether we should simply always ADD_TO_MLQUEUE on
> > 0xb/0xc1/0x1. I wonder what would break?
> > 
> That's the usual problem I have with vendor-specific sense codes.
> In general the risk is quite low of different vendors actually using
> the same code; I guess we can get away with just adding a debug
> message here and enable it without any vendor check.
> Then we can reconsider the whole thing once we notice several vendors
> using the same sense codes.

Murphy's law says that if we rely on Vendors to chose non-overlapping
numbers we'll end up with a clash fairly quickly ...

Can't we find a way of doing this in the device_handler?  That way we
can use vendor specific codes only when we know who the vendor is?

James



Re: [PATCH] scsi: handle ABORTED_COMMAND on Fujitsu ETERNUS

2018-01-18 Thread Hannes Reinecke
On 01/18/2018 03:43 AM, Martin K. Petersen wrote:
> 
> Martin,
> 
>> You'd like to spend a precious BLIST bit for this single device which
>> uses vendor-specific ASC/Q?
> 
> I really don't want string comparisons in the regular code paths. Also
> not a fan of vendor-specific ASCs. But if you must use them, please add
> a flag and trigger on that.
> 
> Since this is a bit of an unusual check condition combo, we could
> entertain whether we should simply always ADD_TO_MLQUEUE on
> 0xb/0xc1/0x1. I wonder what would break?
> 
That's the usual problem I have with vendor-specific sense codes.
In general the risk is quite low of different vendors actually using the
same code; I guess we can get away with just adding a debug message here
and enable it without any vendor check.
Then we can reconsider the whole thing once we notice several vendors
using the same sense codes.

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH] scsi: handle ABORTED_COMMAND on Fujitsu ETERNUS

2018-01-17 Thread Martin K. Petersen

Martin,

> You'd like to spend a precious BLIST bit for this single device which
> uses vendor-specific ASC/Q?

I really don't want string comparisons in the regular code paths. Also
not a fan of vendor-specific ASCs. But if you must use them, please add
a flag and trigger on that.

Since this is a bit of an unusual check condition combo, we could
entertain whether we should simply always ADD_TO_MLQUEUE on
0xb/0xc1/0x1. I wonder what would break?

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: handle ABORTED_COMMAND on Fujitsu ETERNUS

2018-01-17 Thread Xose Vazquez Perez
Martin Wilck wrote:

> On Wed, 2018-01-17 at 01:15 -0500, Martin K. Petersen wrote:
>> 
>> > +  if (!strncmp(scmd->device->vendor, "FUJITSU", 7)
>> > &&
>> > + !strncmp(scmd->device->model,
>> > "ETERNUS_DXM", 11) &&

Is DXM the S3 family?

>> Blacklist, please.
> 
> To make sure I understand correctly:
> 
> You'd like to spend a precious BLIST bit for this single device which
> uses vendor-specific ASC/Q?


A similar patch for SYMMETRIX was rejected recently:
https://marc.info/?t=15082243347


BTW, a BLIST patch for ETERNUS DXL (S2???) is also needed:
http://kernel.suse.com/cgit/kernel-source/plain/patches.fixes/scsi_scan-Send-TEST-UNIT-READY-to-LUN0-before-LUN-sc.patch?h=SLE15


Re: [PATCH] scsi: handle ABORTED_COMMAND on Fujitsu ETERNUS

2018-01-16 Thread Martin Wilck
On Wed, 2018-01-17 at 01:15 -0500, Martin K. Petersen wrote:
> Martin,
> 
> > +   if (!strncmp(scmd->device->vendor, "FUJITSU", 7)
> > &&
> > +  !strncmp(scmd->device->model,
> > "ETERNUS_DXM", 11) &&
> 
> Blacklist, please.
> 

To make sure I understand correctly:

You'd like to spend a precious BLIST bit for this single device which
uses vendor-specific ASC/Q?

Regards,
Martin

-- 
Dr. Martin Wilck , Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)



Re: [PATCH] scsi: handle ABORTED_COMMAND on Fujitsu ETERNUS

2018-01-16 Thread Martin K. Petersen

Martin,

> + if (!strncmp(scmd->device->vendor, "FUJITSU", 7) &&
> +!strncmp(scmd->device->model, "ETERNUS_DXM", 11) &&

Blacklist, please.

-- 
Martin K. Petersen  Oracle Linux Engineering


[PATCH] scsi: handle ABORTED_COMMAND on Fujitsu ETERNUS

2018-01-15 Thread Martin Wilck
On Fujitsu ETERNUS systems, sense code ABORTED COMMAND with ASC/Q C1/01
is used to indicate temporary condition where the storage-internal path
to a target is switched from one controller to another. SCSI commands
that return with this error code must be retried unconditionally (i.e. without
the "maybe_retry" logic in scsi_decide_disposition); otherwise dm-multipath
might initiate a failover from a healthy path e.g. for REQ_FAILFAST_DEV
commands.

Signed-off-by: Martin Wilck 
---
 drivers/scsi/scsi_error.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index d042915ce895..598111a741d4 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -518,6 +518,16 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
if (sshdr.asc == 0x10) /* DIF */
return SUCCESS;
 
+   if (!strncmp(scmd->device->vendor, "FUJITSU", 7) &&
+  !strncmp(scmd->device->model, "ETERNUS_DXM", 11) &&
+  (sshdr.asc == 0xc1) && (sshdr.ascq == 0x1)) {
+   /*
+* Fujitsu Eternus uses this vendor specific code
+* to indicate an internal reconfiguration status
+* which can be recovered with a retry.
+*/
+   return ADD_TO_MLQUEUE;
+   }
return NEEDS_RETRY;
case NOT_READY:
case UNIT_ATTENTION:
-- 
2.15.1