Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
On 04/24/2018 11:09 AM, Steffen Maier wrote:
> On 11/04/2016 05:35 PM, Martin K. Petersen wrote:
>>> "Hannes" == Hannes Reinecke writes:
>>
>> Hannes> Checking with SAT-3 (section 6.2.4: Commands the SATL queues
>> Hannes> internally) the implemented behaviour is standards conformant,
>> Hannes> although the standard also allows for returning 'TASK SET FULL'
>> Hannes> or 'BUSY' in these cases. Doing so would nicely solve this
>> Hannes> issue.
>>
>> I agree with Hannes that it would be appropriate for the SATL to report
>> busy when it makes a non-queued command queueable.
>
> Wouldn't this potentially still cause problems if the secure erase takes
> longer than max_retries * scmd_tmo, i.e. the command timing out by
> default after 180 seconds as in
> https://www.spinics.net/lists/linux-block/msg24837.html ?
> The fix approach here seems to also handle this gracefully.

Well, yes, of course the command will be terminated after it timed out.
But typically secure erase is invoked from userspace via sg ioctls, and
it is the responsibility of the application to set the correct timeout.

Cheers,

Hannes
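As an illustration of the point about application-side timeouts, here is a minimal userspace sketch of preparing an SG_IO request header with an explicit timeout. The helper name prepare_passthrough_hdr and the four-hour ERASE_TIMEOUT_MS budget are assumptions made for this example only; struct sg_io_hdr and its timeout field (in milliseconds) come from <scsi/sg.h>. The sketch leaves the data-transfer fields unset, so a real erase command would still need its data-out buffer filled in before calling ioctl(fd, SG_IO, &hdr).

```c
#include <assert.h>
#include <string.h>
#include <scsi/sg.h>

/* Hypothetical budget picked by the application: 4 hours in milliseconds. */
#define ERASE_TIMEOUT_MS (4u * 60u * 60u * 1000u)

/*
 * Fill an sg_io_hdr for a pass-through CDB with an application-chosen
 * timeout. Only the header is prepared here; no ioctl is issued.
 */
static void prepare_passthrough_hdr(struct sg_io_hdr *hdr,
				    unsigned char *cdb, unsigned char cdb_len,
				    unsigned char *sense, unsigned char sense_len,
				    unsigned int timeout_ms)
{
	assert(hdr != NULL && cdb != NULL);

	memset(hdr, 0, sizeof(*hdr));
	hdr->interface_id = 'S';		/* required by the sg interface */
	hdr->dxfer_direction = SG_DXFER_NONE;	/* no data phase in this sketch */
	hdr->cmdp = cdb;
	hdr->cmd_len = cdb_len;
	hdr->sbp = sense;
	hdr->mx_sb_len = sense_len;
	hdr->timeout = timeout_ms;		/* unit: milliseconds */
}
```

With a header prepared this way, the per-command timeout seen by the SCSI layer is the one the application asked for rather than a default.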
Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
On 11/04/2016 05:35 PM, Martin K. Petersen wrote:
>> "Hannes" == Hannes Reinecke writes:
>
> Hannes> Checking with SAT-3 (section 6.2.4: Commands the SATL queues
> Hannes> internally) the implemented behaviour is standards conformant,
> Hannes> although the standard also allows for returning 'TASK SET FULL'
> Hannes> or 'BUSY' in these cases. Doing so would nicely solve this
> Hannes> issue.
>
> I agree with Hannes that it would be appropriate for the SATL to report
> busy when it makes a non-queued command queueable.

Wouldn't this potentially still cause problems if the secure erase takes
longer than max_retries * scmd_tmo, i.e. the command timing out by
default after 180 seconds as in
https://www.spinics.net/lists/linux-block/msg24837.html ?
The fix approach here seems to also handle this gracefully.

--
Kind regards
Steffen Maier

Linux on z Systems Development
IBM Deutschland Research & Development GmbH
Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
On Sat, Nov 5, 2016 at 6:47 PM, Andrey Grodzovsky wrote:
> On Fri, Nov 4, 2016 at 10:51 AM, Hannes Reinecke wrote:
>> On 11/04/2016 01:45 PM, Sreekanth Reddy wrote:
>>>
>>> Hi All,
>>>
>>> For the last two days I have been working with my firmware team to get
>>> the required information on this issue. Here is their response:
>>>
>>> "For ATA PASSTHROUGH commands, the IOC SATL will not check the
>>> opcode and will direct it to the drive. So even though the ATA
>>> PASSTHROUGH carries an ATA erase to the drive, the IOC SATL FW will
>>> not know that, and as general logic for all ATA PASSTHROUGH commands
>>> the IOC FW will pend the upcoming IOs until the previous ATA
>>> PASSTHROUGH completes. This is as per the SAT specification for SAS
>>> controllers, and we can't compare it with the on-board SATA
>>> controllers, which have a full-fledged SATA implementation."
>>>
>>> So this is expected behavior from our HBA firmware, i.e. it will
>>> pend subsequent commands while any ATA PASSTHROUGH command is in
>>> progress. So there is no issue with the FW.
>>>
>> But is there a way to figure out if the firmware / SATL layer is busy
>> processing requests?
>>
>> With 'real' ATA HBAs this issue doesn't occur, as the ATA erase command
>> is a non-queued command, and hence the next command automatically has
>> to wait for the erase command to complete.
>> But this wait happens as the ATA HBA returns 'BUSY', and the linux I/O
>> stack will then reset the timeout for all consecutive commands.
>>
>> With mpt3sas _all_ commands are queued, so if there is a long-running
>> I/O command all other commands already in the queue will time out.
>>
>> Which is at least a very awkward behaviour.
>>
>> Checking with SAT-3 (section 6.2.4: Commands the SATL queues
>> internally) the implemented behaviour is standards conformant, although
>> the standard also allows for returning 'TASK SET FULL' or 'BUSY' in
>> these cases.
>> Doing so would nicely solve this issue.
>>
>>> Today I tried the same test case on my local setup, i.e. I issued a
>>> secure erase command using the hdparm utility and observed the same
>>> issue on the 4.2.3-300.fc23.x86_64 kernel.
>>>
>>> Then, after browsing about this issue, I found that some people
>>> recommend enabling the 'CONFIG_IDE_TASK_IOCTL' Kconfig flag. I had a
>>> compiled 4.4.0 kernel, so I enabled CONFIG_IDE_TASK_IOCTL, recompiled
>>> this 4.4.0 kernel and booted into it. Then I tried the same test case
>>> and did not observe this issue; the secure erase operation completed
>>> successfully.
>>>
>>> So, can you please try once with CONFIG_IDE_TASK_IOCTL enabled.
>>>
>> Errm.
>> CONFIG_IDE_TASK_IOCTL is for the old IDE subsystem, which isn't in use
>> here. So this option does not make a difference when using mpt3sas, as
>> this is a 'real' SCSI driver which never calls out into any of these
>> subsystems.
>>
>> I would be _VERY_ much surprised if that would make a difference.
>>
>> The reason why this behaviour did go unnoticed with older kernels was
>> that a command timeout would trigger SCSI EH to engage, and that in
>> turn required all outstanding commands to complete.
>> So by the time SCSI EH started the ERASE command was complete, and a
>> retry of the timed-out commands would work.
>
> Indeed, when retesting with CONFIG_IDE_TASK_IOCTL=y and reverting the
> fix the bug is back.
>
> Thanks,
> Andrey

Hi Andrey,

We are fine with this patch, with the few changes below:

1. Please remove the comment below. It is not a bug in the firmware; it
   is designed like that.

/* This is a work around for a bug with LSI Fusion MPT SAS2 when
 * pefroming secure erase. Due to the verly long time the operation
 * takes commands issued during the erase will time out and will trigger
 * execution of abort hook. This leads to device reset and premature
 * termination of the secured erase.
 */

2. Use the SCSI command opcode definitions instead of raw values, i.e.
   replace the line

	return (scmd->cmnd[0] == 0xa1 || scmd->cmnd[0] == 0x85);

   with

	return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);

3. Please correct the alignment of the comment below:

	/**
	 * Lock the device for any subsequent command until
	 * command is done.
	 */

Thanks,
Sreekanth

>>
>>
>> Cheers,
>>
>> Hannes
>> --
>> Dr. Hannes Reinecke                   zSeries & Storage
>> h...@suse.de                          +49 911 74053 688
>> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
>> GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
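For readers following review comment 2, here is a self-contained sketch of the helper written against named opcodes. It is a userspace stand-in, not the driver code: struct scsi_cmnd_model is a hypothetical reduction of the kernel's struct scsi_cmnd to its CDB pointer, and the two macros simply repeat the values quoted in the patch (0xa1 and 0x85), which the kernel's SCSI headers provide under the names ATA_12 and ATA_16.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values as quoted in the patch; guarded in case a header already has them. */
#ifndef ATA_12
#define ATA_12 0xa1	/* ATA PASS-THROUGH (12) */
#endif
#ifndef ATA_16
#define ATA_16 0x85	/* ATA PASS-THROUGH (16) */
#endif

/* Hypothetical stand-in for struct scsi_cmnd: only the CDB pointer. */
struct scsi_cmnd_model {
	const unsigned char *cmnd;
};

/* True if the first CDB byte is one of the ATA pass-through opcodes. */
static inline bool ata_12_16_cmd(const struct scsi_cmnd_model *scmd)
{
	assert(scmd != NULL && scmd->cmnd != NULL);
	return scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16;
}
```

Using the names keeps the comparison readable without changing which commands match.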
Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
On Fri, Nov 4, 2016 at 10:51 AM, Hannes Reinecke wrote:
> On 11/04/2016 01:45 PM, Sreekanth Reddy wrote:
>>
>> Hi All,
>>
>> For the last two days I have been working with my firmware team to get
>> the required information on this issue. Here is their response:
>>
>> "For ATA PASSTHROUGH commands, the IOC SATL will not check the
>> opcode and will direct it to the drive. So even though the ATA
>> PASSTHROUGH carries an ATA erase to the drive, the IOC SATL FW will
>> not know that, and as general logic for all ATA PASSTHROUGH commands
>> the IOC FW will pend the upcoming IOs until the previous ATA
>> PASSTHROUGH completes. This is as per the SAT specification for SAS
>> controllers, and we can't compare it with the on-board SATA
>> controllers, which have a full-fledged SATA implementation."
>>
>> So this is expected behavior from our HBA firmware, i.e. it will
>> pend subsequent commands while any ATA PASSTHROUGH command is in
>> progress. So there is no issue with the FW.
>>
> But is there a way to figure out if the firmware / SATL layer is busy
> processing requests?
>
> With 'real' ATA HBAs this issue doesn't occur, as the ATA erase command
> is a non-queued command, and hence the next command automatically has
> to wait for the erase command to complete.
> But this wait happens as the ATA HBA returns 'BUSY', and the linux I/O
> stack will then reset the timeout for all consecutive commands.
>
> With mpt3sas _all_ commands are queued, so if there is a long-running
> I/O command all other commands already in the queue will time out.
>
> Which is at least a very awkward behaviour.
>
> Checking with SAT-3 (section 6.2.4: Commands the SATL queues
> internally) the implemented behaviour is standards conformant, although
> the standard also allows for returning 'TASK SET FULL' or 'BUSY' in
> these cases.
> Doing so would nicely solve this issue.
>
>> Today I tried the same test case on my local setup, i.e. I issued a
>> secure erase command using the hdparm utility and observed the same
>> issue on the 4.2.3-300.fc23.x86_64 kernel.
>>
>> Then, after browsing about this issue, I found that some people
>> recommend enabling the 'CONFIG_IDE_TASK_IOCTL' Kconfig flag. I had a
>> compiled 4.4.0 kernel, so I enabled CONFIG_IDE_TASK_IOCTL, recompiled
>> this 4.4.0 kernel and booted into it. Then I tried the same test case
>> and did not observe this issue; the secure erase operation completed
>> successfully.
>>
>> So, can you please try once with CONFIG_IDE_TASK_IOCTL enabled.
>>
> Errm.
> CONFIG_IDE_TASK_IOCTL is for the old IDE subsystem, which isn't in use
> here. So this option does not make a difference when using mpt3sas, as
> this is a 'real' SCSI driver which never calls out into any of these
> subsystems.
>
> I would be _VERY_ much surprised if that would make a difference.
>
> The reason why this behaviour did go unnoticed with older kernels was
> that a command timeout would trigger SCSI EH to engage, and that in
> turn required all outstanding commands to complete.
> So by the time SCSI EH started the ERASE command was complete, and a
> retry of the timed-out commands would work.

Indeed, when retesting with CONFIG_IDE_TASK_IOCTL=y and reverting the
fix the bug is back.

Thanks,
Andrey

>
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke                   zSeries & Storage
> h...@suse.de                          +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
> "Hannes" == Hannes Reinecke writes:

Hannes> Checking with SAT-3 (section 6.2.4: Commands the SATL queues
Hannes> internally) the implemented behaviour is standards conformant,
Hannes> although the standard also allows for returning 'TASK SET FULL'
Hannes> or 'BUSY' in these cases. Doing so would nicely solve this
Hannes> issue.

I agree with Hannes that it would be appropriate for the SATL to report
busy when it makes a non-queued command queueable.

--
Martin K. Petersen	Oracle Linux Engineering
Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
On 11/04/2016 01:45 PM, Sreekanth Reddy wrote:
> Hi All,
>
> For the last two days I have been working with my firmware team to get
> the required information on this issue. Here is their response:
>
> "For ATA PASSTHROUGH commands, the IOC SATL will not check the
> opcode and will direct it to the drive. So even though the ATA
> PASSTHROUGH carries an ATA erase to the drive, the IOC SATL FW will
> not know that, and as general logic for all ATA PASSTHROUGH commands
> the IOC FW will pend the upcoming IOs until the previous ATA
> PASSTHROUGH completes. This is as per the SAT specification for SAS
> controllers, and we can't compare it with the on-board SATA
> controllers, which have a full-fledged SATA implementation."
>
> So this is expected behavior from our HBA firmware, i.e. it will
> pend subsequent commands while any ATA PASSTHROUGH command is in
> progress. So there is no issue with the FW.
>
But is there a way to figure out if the firmware / SATL layer is busy
processing requests?

With 'real' ATA HBAs this issue doesn't occur, as the ATA erase command
is a non-queued command, and hence the next command automatically has to
wait for the erase command to complete.
But this wait happens as the ATA HBA returns 'BUSY', and the linux I/O
stack will then reset the timeout for all consecutive commands.

With mpt3sas _all_ commands are queued, so if there is a long-running
I/O command all other commands already in the queue will time out.

Which is at least a very awkward behaviour.

Checking with SAT-3 (section 6.2.4: Commands the SATL queues internally)
the implemented behaviour is standards conformant, although the standard
also allows for returning 'TASK SET FULL' or 'BUSY' in these cases.
Doing so would nicely solve this issue.

> Today I tried the same test case on my local setup, i.e. I issued a
> secure erase command using the hdparm utility and observed the same
> issue on the 4.2.3-300.fc23.x86_64 kernel.
>
> Then, after browsing about this issue, I found that some people
> recommend enabling the 'CONFIG_IDE_TASK_IOCTL' Kconfig flag. I had a
> compiled 4.4.0 kernel, so I enabled CONFIG_IDE_TASK_IOCTL, recompiled
> this 4.4.0 kernel and booted into it. Then I tried the same test case
> and did not observe this issue; the secure erase operation completed
> successfully.
>
> So, can you please try once with CONFIG_IDE_TASK_IOCTL enabled.
>
Errm.
CONFIG_IDE_TASK_IOCTL is for the old IDE subsystem, which isn't in use
here. So this option does not make a difference when using mpt3sas, as
this is a 'real' SCSI driver which never calls out into any of these
subsystems.

I would be _VERY_ much surprised if that would make a difference.

The reason why this behaviour did go unnoticed with older kernels was
that a command timeout would trigger SCSI EH to engage, and that in turn
required all outstanding commands to complete.
So by the time SCSI EH started the ERASE command was complete, and a
retry of the timed-out commands would work.

Cheers,

Hannes
--
Dr. Hannes Reinecke                   zSeries & Storage
h...@suse.de                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
Hi All,

For the last two days I have been working with my firmware team to get
the required information on this issue. Here is their response:

"For ATA PASSTHROUGH commands, the IOC SATL will not check the
opcode and will direct it to the drive. So even though the ATA
PASSTHROUGH carries an ATA erase to the drive, the IOC SATL FW will
not know that, and as general logic for all ATA PASSTHROUGH commands
the IOC FW will pend the upcoming IOs until the previous ATA
PASSTHROUGH completes. This is as per the SAT specification for SAS
controllers, and we can't compare it with the on-board SATA
controllers, which have a full-fledged SATA implementation."

So this is expected behavior from our HBA firmware, i.e. it will
pend subsequent commands while any ATA PASSTHROUGH command is in
progress. So there is no issue with the FW.

Today I tried the same test case on my local setup, i.e. I issued a
secure erase command using the hdparm utility and observed the same
issue on the 4.2.3-300.fc23.x86_64 kernel.

Then, after browsing about this issue, I found that some people
recommend enabling the 'CONFIG_IDE_TASK_IOCTL' Kconfig flag. I had a
compiled 4.4.0 kernel, so I enabled CONFIG_IDE_TASK_IOCTL, recompiled
this 4.4.0 kernel and booted into it. Then I tried the same test case
and did not observe this issue; the secure erase operation completed
successfully.

So, can you please try once with CONFIG_IDE_TASK_IOCTL enabled.

Thanks,
Sreekanth

On Thu, Nov 3, 2016 at 9:19 AM, Igor Rybak <i...@media-clone.com> wrote:
> Hi,
>
> We tried the latest LSI firmware 20.0.0.7, and have also collected logs
> with the Broadcom script and emailed them to their tech support already.
>
> Thanks,
>
> Igor Rybak
> VP Engineering
> MediaClone Inc
>
>
> Original message
> From: Andrey Grodzovsky <andrey2...@gmail.com>
> Date: 11/2/16 9:31 PM (GMT+05:30)
> To: Sreekanth Reddy <sreekanth.re...@broadcom.com>, Igor Rybak
> <i...@media-clone.com>, Ezra Kohavi <e...@media-clone.com>
> Cc: PDL-MPT-FUSIONLINUX <mpt-fusionlinux@broadcom.com>,
> linux-scsi@vger.kernel.org, Hannes Reinecke <h...@suse.de>, Sathya Prakash
> <sathya.prak...@broadcom.com>, Chaitra P B <chaitra.basa...@broadcom.com>,
> Suganath Prabu Subramani <suganath-prabu.subram...@broadcom.com>,
> sta...@vger.kernel.org
> Subject: Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination
> (v2)
>
>
>
> On Wed, Nov 2, 2016 at 6:05 AM, Sreekanth Reddy
> <sreekanth.re...@broadcom.com> wrote:
>>
>> On Wed, Nov 2, 2016 at 7:37 AM, Hannes Reinecke <h...@suse.de> wrote:
>> > On 11/02/2016 01:09 AM, Andrey Grodzovsky wrote:
>> >>
>> >> Problem:
>> >> This is a work around for a bug with LSI Fusion MPT SAS2 when
>> >> performing secure erase. Due to the very long time the operation
>> >> takes commands issued during the erase will time out and will trigger
>> >> execution of abort hook. Even though the abort hook is called for
>> >> the specific command which timed out this leads to entire device halt
>> >> (scsi_state terminated) and premature termination of the secured erase.
>> >>
>> >> Fix:
>> >> Set device state to busy while erase in progress to reject any incoming
>> >> commands until the erase is done. The device is blocked any way during
>> >> this time and cannot execute any other command.
>> >> More data and logs can be found here -
>> >> https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view
>> >>
>> >> v2: Update according to example patch by Hannes Reinecke to apply
>> >> the blocking logic to any ATA 12/16 command.
>> >>
>> >> Signed-off-by: Andrey Grodzovsky <andrey2...@gmail.com>
>> >> Cc: <linux-scsi@vger.kernel.org>
>> >> Cc: Sathya Prakash <sathya.prak...@broadcom.com>
>> >> Cc: Chaitra P B <chaitra.basa...@broadcom.com>
>> >> Cc: Suganath Prabu Subramani <suganath-prabu.subram...@broadcom.com>
>> >> Cc: Sreekanth Reddy <sreekanth.re...@broadcom.com>
>> >> Cc: Hannes Reinecke <h...@suse.de>
>> >> Cc: <sta...@vger.kernel.org>
>> >> ---
>> >>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 26 ++++++++++++++++++++++++++
>> >>  1 file changed, 26 insertions(+)
>> >>
>> >> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> >> b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> >> index 5a97e32..43ab0cc 100644
>> >> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> >> +++ b/drivers/scsi/mpt3sa
Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
On Wed, Nov 2, 2016 at 7:37 AM, Hannes Reinecke wrote:
> On 11/02/2016 01:09 AM, Andrey Grodzovsky wrote:
>>
>> Problem:
>> This is a work around for a bug with LSI Fusion MPT SAS2 when
>> performing secure erase. Due to the very long time the operation
>> takes commands issued during the erase will time out and will trigger
>> execution of abort hook. Even though the abort hook is called for
>> the specific command which timed out this leads to entire device halt
>> (scsi_state terminated) and premature termination of the secured erase.
>>
>> Fix:
>> Set device state to busy while erase in progress to reject any incoming
>> commands until the erase is done. The device is blocked any way during
>> this time and cannot execute any other command.
>> More data and logs can be found here -
>> https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view
>>
>> v2: Update according to example patch by Hannes Reinecke to apply
>> the blocking logic to any ATA 12/16 command.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey2...@gmail.com>
>> Cc: <linux-scsi@vger.kernel.org>
>> Cc: Sathya Prakash <sathya.prak...@broadcom.com>
>> Cc: Chaitra P B <chaitra.basa...@broadcom.com>
>> Cc: Suganath Prabu Subramani <suganath-prabu.subram...@broadcom.com>
>> Cc: Sreekanth Reddy <sreekanth.re...@broadcom.com>
>> Cc: Hannes Reinecke <h...@suse.de>
>> Cc: <sta...@vger.kernel.org>
>> ---
>>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 26 ++++++++++++++++++++++++++
>>  1 file changed, 26 insertions(+)
>>
>> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> index 5a97e32..43ab0cc 100644
>> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> @@ -3500,6 +3500,20 @@ _scsih_eedp_error_handling(struct scsi_cmnd *scmd, u16 ioc_status)
>>  	    SAM_STAT_CHECK_CONDITION;
>>  }
>>
>> +/**
>> + * This is a work around for a bug with LSI Fusion MPT SAS2 when
>> + * pefroming secure erase. Due to the verly long time the operation
>> + * takes commands issued during the erase will time out and will trigger
>> + * execution of abort hook. This leads to device reset and premature
>> + * termination of the secured erase.
>> + *
>> + */
>> +static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
>> +{
>> +	return (scmd->cmnd[0] == 0xa1 || scmd->cmnd[0] == 0x85);
>> +}
>> +
>> +
>>
>>  /**
>>   * _scsih_qcmd - main scsi request entry point
>> @@ -3528,6 +3542,14 @@ _scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
>>  		scsi_print_command(scmd);
>>  #endif
>>
>> +	/**
>> +	 * Lock the device for any subsequent command until
>> +	 * command is done.
>> +	 */
>> +	if (ata_12_16_cmd(scmd))
>> +		scsi_internal_device_block(scmd->device);
>> +
>> +
>>  	sas_device_priv_data = scmd->device->hostdata;
>>  	if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
>>  		scmd->result = DID_NO_CONNECT << 16;
>> @@ -4062,6 +4084,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>>  	if (scmd == NULL)
>>  		return 1;
>>
>> +	if (ata_12_16_cmd(scmd))
>> +		scsi_internal_device_unblock(scmd->device, SDEV_RUNNING);
>> +
>> +
>>  	mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
>>
>>  	if (mpi_reply == NULL) {
>>
> Yeah, it's ugly, but I can't think of a better solution for the moment.
> Thanks for debugging this.

May I know the result of the same test case when the SATA drive is
connected to on-board SATA?

If this is assumed to be an HBA firmware issue, then it should be fixed
in the firmware, not in the driver. Have you tried the latest HBA
firmware image? If it still occurs, is it possible for you to share the
firmware logs?

I think a service request has been raised with Broadcom for this issue;
through that service request our support people can help you collect the
firmware logs and can provide an analysis of them.

Thanks,
Sreekanth

>
> Reviewed-by: Hannes Reinecke
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke                   zSeries & Storage
> h...@suse.de                          +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
On 11/02/2016 01:09 AM, Andrey Grodzovsky wrote:
> Problem:
> This is a work around for a bug with LSI Fusion MPT SAS2 when
> performing secure erase. Due to the very long time the operation
> takes commands issued during the erase will time out and will trigger
> execution of abort hook. Even though the abort hook is called for
> the specific command which timed out this leads to entire device halt
> (scsi_state terminated) and premature termination of the secured erase.
>
> Fix:
> Set device state to busy while erase in progress to reject any incoming
> commands until the erase is done. The device is blocked any way during
> this time and cannot execute any other command.
> More data and logs can be found here -
> https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view
>
> v2: Update according to example patch by Hannes Reinecke to apply
> the blocking logic to any ATA 12/16 command.
>
> Signed-off-by: Andrey Grodzovsky <andrey2...@gmail.com>
> Cc: <linux-scsi@vger.kernel.org>
> Cc: Sathya Prakash <sathya.prak...@broadcom.com>
> Cc: Chaitra P B <chaitra.basa...@broadcom.com>
> Cc: Suganath Prabu Subramani <suganath-prabu.subram...@broadcom.com>
> Cc: Sreekanth Reddy <sreekanth.re...@broadcom.com>
> Cc: Hannes Reinecke <h...@suse.de>
> Cc: <sta...@vger.kernel.org>
> ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
>
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 5a97e32..43ab0cc 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -3500,6 +3500,20 @@ _scsih_eedp_error_handling(struct scsi_cmnd *scmd, u16 ioc_status)
>  	    SAM_STAT_CHECK_CONDITION;
>  }
>
> +/**
> + * This is a work around for a bug with LSI Fusion MPT SAS2 when
> + * pefroming secure erase. Due to the verly long time the operation
> + * takes commands issued during the erase will time out and will trigger
> + * execution of abort hook. This leads to device reset and premature
> + * termination of the secured erase.
> + *
> + */
> +static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
> +{
> +	return (scmd->cmnd[0] == 0xa1 || scmd->cmnd[0] == 0x85);
> +}
> +
> +
>
>  /**
>   * _scsih_qcmd - main scsi request entry point
> @@ -3528,6 +3542,14 @@ _scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
>  		scsi_print_command(scmd);
>  #endif
>
> +	/**
> +	 * Lock the device for any subsequent command until
> +	 * command is done.
> +	 */
> +	if (ata_12_16_cmd(scmd))
> +		scsi_internal_device_block(scmd->device);
> +
> +
>  	sas_device_priv_data = scmd->device->hostdata;
>  	if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
>  		scmd->result = DID_NO_CONNECT << 16;
> @@ -4062,6 +4084,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>  	if (scmd == NULL)
>  		return 1;
>
> +	if (ata_12_16_cmd(scmd))
> +		scsi_internal_device_unblock(scmd->device, SDEV_RUNNING);
> +
> +
>  	mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
>
>  	if (mpi_reply == NULL) {
>
Yeah, it's ugly, but I can't think of a better solution for the moment.
Thanks for debugging this.

Reviewed-by: Hannes Reinecke

Cheers,

Hannes
--
Dr. Hannes Reinecke                   zSeries & Storage
h...@suse.de                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
[PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
Problem:
This is a work around for a bug with LSI Fusion MPT SAS2 when
performing secure erase. Due to the very long time the operation
takes commands issued during the erase will time out and will trigger
execution of abort hook. Even though the abort hook is called for
the specific command which timed out this leads to entire device halt
(scsi_state terminated) and premature termination of the secured erase.

Fix:
Set device state to busy while erase in progress to reject any incoming
commands until the erase is done. The device is blocked any way during
this time and cannot execute any other command.
More data and logs can be found here -
https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view

v2: Update according to example patch by Hannes Reinecke to apply
the blocking logic to any ATA 12/16 command.

Signed-off-by: Andrey Grodzovsky <andrey2...@gmail.com>
Cc: <linux-scsi@vger.kernel.org>
Cc: Sathya Prakash <sathya.prak...@broadcom.com>
Cc: Chaitra P B <chaitra.basa...@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subram...@broadcom.com>
Cc: Sreekanth Reddy <sreekanth.re...@broadcom.com>
Cc: Hannes Reinecke <h...@suse.de>
Cc: <sta...@vger.kernel.org>
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 5a97e32..43ab0cc 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3500,6 +3500,20 @@ _scsih_eedp_error_handling(struct scsi_cmnd *scmd, u16 ioc_status)
 	    SAM_STAT_CHECK_CONDITION;
 }
 
+/**
+ * This is a work around for a bug with LSI Fusion MPT SAS2 when
+ * pefroming secure erase. Due to the verly long time the operation
+ * takes commands issued during the erase will time out and will trigger
+ * execution of abort hook. This leads to device reset and premature
+ * termination of the secured erase.
+ *
+ */
+static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
+{
+	return (scmd->cmnd[0] == 0xa1 || scmd->cmnd[0] == 0x85);
+}
+
+
 
 /**
  * _scsih_qcmd - main scsi request entry point
@@ -3528,6 +3542,14 @@ _scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
 		scsi_print_command(scmd);
 #endif
 
+	/**
+	 * Lock the device for any subsequent command until
+	 * command is done.
+	 */
+	if (ata_12_16_cmd(scmd))
+		scsi_internal_device_block(scmd->device);
+
+
 	sas_device_priv_data = scmd->device->hostdata;
 	if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
 		scmd->result = DID_NO_CONNECT << 16;
@@ -4062,6 +4084,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
 	if (scmd == NULL)
 		return 1;
 
+	if (ata_12_16_cmd(scmd))
+		scsi_internal_device_unblock(scmd->device, SDEV_RUNNING);
+
+
 	mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
 
 	if (mpi_reply == NULL) {
--
2.1.4