Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-17 Thread Martin K. Petersen
> "Christoph" == Christoph Hellwig  writes:

Christoph> On Tue, Jan 17, 2017 at 07:43:51PM +0530, Sreekanth Reddy wrote:
>> [Sreekanth] Just for readability purpose, can use use "if
>> (test_bit(0, _device_priv_data->ata_command_pending)" instead of
>> "if (sas_device_priv_data->ata_command_pending)".  Since while
>> setting & clearing the bit we are using atomic bit operations. I
>> don't see any issue functionality wise.

Christoph> I agree.  Also while we're into nitpicking - it would be good
Christoph> to give bit 0 of the bitmap a name instead of hardcoding the
Christoph> 0.

I tweaked the test case. We can name the bit later if more flags are
needed (and in that case the ata_command_pending would need to get
renamed too).

In any case. This issue has taken waaay too long to get resolved so the
patch is now queued up in 4.10/scsi-fixes.

Thanks everyone!

-- 
Martin K. Petersen  Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-17 Thread James Bottomley
On Tue, 2017-01-17 at 19:43 +0530, Sreekanth Reddy wrote:
> On Tue, Jan 17, 2017 at 1:31 AM, James Bottomley
>  wrote:
> > From 91d249409546569444897a1ffde65c421e064899 Mon Sep 17 00:00:00
> > 2001
> > From: James Bottomley 
> > Date: Sun, 1 Jan 2017 09:39:24 -0800
> > Subject: [PATCH] scsi: mpt3sas: fix hang on ata passthrough
> > commands
> > 
> > mpt3sas has a firmware failure where it can only handle one pass
> > through ATA command at a time.  If another comes in, contrary to
> > the
> > SAT standard, it will hang until the first one completes (causing
> > long
> > commands like secure erase to timeout).  The original fix was to
> > block
> > the device when an ATA command came in, but this caused a
> > regression
> > with
> > 
> > commit 669f044170d8933c3d66d231b69ea97cb8447338
> > Author: Bart Van Assche 
> > Date:   Tue Nov 22 16:17:13 2016 -0800
> > 
> > scsi: srp_transport: Move queuecommand() wait code to SCSI core
> > 
> > So fix the original fix of the secure erase timeout by properly
> > returning SAM_STAT_BUSY like the SAT recommends.  The original
> > patch
> > also had a concurrency problem since scsih_qcmd is lockless at that
> > point (this is fixed by using atomic bitops to set and test the
> > flag).
> > 
> > Fixes: 18f6084a989ba1b (mpt3sas: Fix secure erase premature
> > termination)
> > Signed-off-by: James Bottomley <
> > james.bottom...@hansenpartnership.com>
> > 
> > ---
> > 
> > v2 - use bitops for lockless atomicity
> > v3 - update description, change function name
> > ---
> >  drivers/scsi/mpt3sas/mpt3sas_base.h  | 12 +++
> >  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 40 +++-
> > 
> >  2 files changed, 38 insertions(+), 14 deletions(-)
> > 
> > diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h
> > b/drivers/scsi/mpt3sas/mpt3sas_base.h
> > index 394fe13..dcb33f4 100644
> > --- a/drivers/scsi/mpt3sas/mpt3sas_base.h
> > +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
> > @@ -393,6 +393,7 @@ struct MPT3SAS_TARGET {
> >   * @eedp_enable: eedp support enable bit
> >   * @eedp_type: 0(type_1), 1(type_2), 2(type_3)
> >   * @eedp_block_length: block size
> > + * @ata_command_pending: SATL passthrough outstanding for device
> >   */
> >  struct MPT3SAS_DEVICE {
> > struct MPT3SAS_TARGET *sas_target;
> > @@ -404,6 +405,17 @@ struct MPT3SAS_DEVICE {
> > u8  ignore_delay_remove;
> > /* Iopriority Command Handling */
> > u8  ncq_prio_enable;
> > +   /*
> > +* Bug workaround for SATL handling: the mpt2/3sas firmware
> > +* doesn't return BUSY or TASK_SET_FULL for subsequent
> > +* commands while a SATL pass through is in operation as
> > the
> > +* spec requires, it simply does nothing with them until
> > the
> > +* pass through completes, causing them possibly to timeout
> > if
> > +* the passthrough is a long executing command (like format
> > or
> > +* secure erase).  This variable allows us to do the right
> > +* thing while a SATL command is pending.
> > +*/
> > +   unsigned long ata_command_pending;
> > 
> >  };
> > 
> > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > index b5c966e..830e2c1 100644
> > --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > @@ -3899,9 +3899,18 @@ _scsih_temp_threshold_events(struct
> > MPT3SAS_ADAPTER *ioc,
> > }
> >  }
> > 
> > -static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
> > +static int _scsih_set_satl_pending(struct scsi_cmnd *scmd, bool
> > pending)
> >  {
> > -   return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] ==
> > ATA_16);
> > +   struct MPT3SAS_DEVICE *priv = scmd->device->hostdata;
> > +
> > +   if  (scmd->cmnd[0] != ATA_12 && scmd->cmnd[0] != ATA_16)
> > +   return 0;
> > +
> > +   if (pending)
> > +   return test_and_set_bit(0, 
> > ->ata_command_pending);
> > +
> > +   clear_bit(0, >ata_command_pending);
> > +   return 0;
> >  }
> > 
> >  /**
> > @@ -3925,9 +3934,7 @@ _scsih_flush_running_cmds(struct
> > MPT3SAS_ADAPTER *ioc)
> > if (!scmd)
> > continue;
> > count++;
> > -   if (ata_12_16_cmd(scmd))
> > -   scsi_internal_device_unblock(scmd->device,
> > -  
> >  SDEV_RUNNING);
> > +   _scsih_set_satl_pending(scmd, false);
> > mpt3sas_base_free_smid(ioc, smid);
> > scsi_dma_unmap(scmd);
> > if (ioc->pci_error_recovery)
> > @@ -4063,13 +4070,6 @@ scsih_qcmd(struct Scsi_Host *shost, struct
> > scsi_cmnd *scmd)
> > if (ioc->logging_level & MPT_DEBUG_SCSI)
> > scsi_print_command(scmd);
> > 
> > -   /*
> > -   

Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-17 Thread Christoph Hellwig
On Tue, Jan 17, 2017 at 07:43:51PM +0530, Sreekanth Reddy wrote:
> [Sreekanth] Just for readability purpose, can use use "if (test_bit(0,
> _device_priv_data->ata_command_pending)"
>  instead of "if (sas_device_priv_data->ata_command_pending)".
> Since while setting & clearing the bit we are using atomic bit
> operations. I don't see any issue functionality wise.

I agree.  Also while we're into nitpicking - it would be good to
give bit 0 of the bitmap a name instead of hardcoding the 0.

Except for these patch looks fine:

Reviewed-by: Christoph Hellwig 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-17 Thread Sreekanth Reddy
On Tue, Jan 17, 2017 at 1:31 AM, James Bottomley
 wrote:
> From 91d249409546569444897a1ffde65c421e064899 Mon Sep 17 00:00:00 2001
> From: James Bottomley 
> Date: Sun, 1 Jan 2017 09:39:24 -0800
> Subject: [PATCH] scsi: mpt3sas: fix hang on ata passthrough commands
>
> mpt3sas has a firmware failure where it can only handle one pass
> through ATA command at a time.  If another comes in, contrary to the
> SAT standard, it will hang until the first one completes (causing long
> commands like secure erase to timeout).  The original fix was to block
> the device when an ATA command came in, but this caused a regression
> with
>
> commit 669f044170d8933c3d66d231b69ea97cb8447338
> Author: Bart Van Assche 
> Date:   Tue Nov 22 16:17:13 2016 -0800
>
> scsi: srp_transport: Move queuecommand() wait code to SCSI core
>
> So fix the original fix of the secure erase timeout by properly
> returning SAM_STAT_BUSY like the SAT recommends.  The original patch
> also had a concurrency problem since scsih_qcmd is lockless at that
> point (this is fixed by using atomic bitops to set and test the flag).
>
> Fixes: 18f6084a989ba1b (mpt3sas: Fix secure erase premature termination)
> Signed-off-by: James Bottomley 
>
> ---
>
> v2 - use bitops for lockless atomicity
> v3 - update description, change function name
> ---
>  drivers/scsi/mpt3sas/mpt3sas_base.h  | 12 +++
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 40 
> +++-
>  2 files changed, 38 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h 
> b/drivers/scsi/mpt3sas/mpt3sas_base.h
> index 394fe13..dcb33f4 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.h
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
> @@ -393,6 +393,7 @@ struct MPT3SAS_TARGET {
>   * @eedp_enable: eedp support enable bit
>   * @eedp_type: 0(type_1), 1(type_2), 2(type_3)
>   * @eedp_block_length: block size
> + * @ata_command_pending: SATL passthrough outstanding for device
>   */
>  struct MPT3SAS_DEVICE {
> struct MPT3SAS_TARGET *sas_target;
> @@ -404,6 +405,17 @@ struct MPT3SAS_DEVICE {
> u8  ignore_delay_remove;
> /* Iopriority Command Handling */
> u8  ncq_prio_enable;
> +   /*
> +* Bug workaround for SATL handling: the mpt2/3sas firmware
> +* doesn't return BUSY or TASK_SET_FULL for subsequent
> +* commands while a SATL pass through is in operation as the
> +* spec requires, it simply does nothing with them until the
> +* pass through completes, causing them possibly to timeout if
> +* the passthrough is a long executing command (like format or
> +* secure erase).  This variable allows us to do the right
> +* thing while a SATL command is pending.
> +*/
> +   unsigned long ata_command_pending;
>
>  };
>
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
> b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index b5c966e..830e2c1 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -3899,9 +3899,18 @@ _scsih_temp_threshold_events(struct MPT3SAS_ADAPTER 
> *ioc,
> }
>  }
>
> -static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
> +static int _scsih_set_satl_pending(struct scsi_cmnd *scmd, bool pending)
>  {
> -   return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);
> +   struct MPT3SAS_DEVICE *priv = scmd->device->hostdata;
> +
> +   if  (scmd->cmnd[0] != ATA_12 && scmd->cmnd[0] != ATA_16)
> +   return 0;
> +
> +   if (pending)
> +   return test_and_set_bit(0, >ata_command_pending);
> +
> +   clear_bit(0, >ata_command_pending);
> +   return 0;
>  }
>
>  /**
> @@ -3925,9 +3934,7 @@ _scsih_flush_running_cmds(struct MPT3SAS_ADAPTER *ioc)
> if (!scmd)
> continue;
> count++;
> -   if (ata_12_16_cmd(scmd))
> -   scsi_internal_device_unblock(scmd->device,
> -   SDEV_RUNNING);
> +   _scsih_set_satl_pending(scmd, false);
> mpt3sas_base_free_smid(ioc, smid);
> scsi_dma_unmap(scmd);
> if (ioc->pci_error_recovery)
> @@ -4063,13 +4070,6 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd 
> *scmd)
> if (ioc->logging_level & MPT_DEBUG_SCSI)
> scsi_print_command(scmd);
>
> -   /*
> -* Lock the device for any subsequent command until command is
> -* done.
> -*/
> -   if (ata_12_16_cmd(scmd))
> -   scsi_internal_device_block(scmd->device);
> -
> sas_device_priv_data = scmd->device->hostdata;
> if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
> scmd->result = 

Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-17 Thread Ingo Molnar

* Martin K. Petersen  wrote:

> > "James" == James Bottomley  
> > writes:
> 
> James> Subject: [PATCH] scsi: mpt3sas: fix hang on ata passthrough
> James> commands
> 
> James> mpt3sas has a firmware failure where it can only handle one pass
> James> through ATA command at a time.  If another comes in, contrary to
> James> the SAT standard, it will hang until the first one completes
> James> (causing long commands like secure erase to timeout).  The
> James> original fix was to block the device when an ATA command came in,
> James> but this caused a regression with
> 
> Broadcom folks: Please test and ack as soon as possible so we can get
> this fix queued up.
> 
> Ingo: Since you appear to have hardware, it would be great if you could
> test James' v3 (https://patchwork.kernel.org/patch/9519383/). Sorry for
> the inconvenience.

As per the interdiff below v2->v3 did not change the code in any way, only the 
name of the function and a comment, so you can add this to v3 as well:

  Reported-by: Ingo Molnar 
  Tested-by: Ingo Molnar 

Thanks,

Ingo

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 6f9b4c051e4d..830e2c10ba02 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3899,7 +3899,7 @@ _scsih_temp_threshold_events(struct MPT3SAS_ADAPTER *ioc,
}
 }
 
-static int set_satl_pending(struct scsi_cmnd *scmd, bool pending)
+static int _scsih_set_satl_pending(struct scsi_cmnd *scmd, bool pending)
 {
struct MPT3SAS_DEVICE *priv = scmd->device->hostdata;
 
@@ -3934,7 +3934,7 @@ _scsih_flush_running_cmds(struct MPT3SAS_ADAPTER *ioc)
if (!scmd)
continue;
count++;
-   set_satl_pending(scmd, false);
+   _scsih_set_satl_pending(scmd, false);
mpt3sas_base_free_smid(ioc, smid);
scsi_dma_unmap(scmd);
if (ioc->pci_error_recovery)
@@ -4084,7 +4084,9 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd 
*scmd)
}
 
/*
-* Bug work around for firmware SATL handling
+* Bug work around for firmware SATL handling.  The loop
+* is based on atomic operations and ensures consistency
+* since we're lockless at this point
 */
do {
if (sas_device_priv_data->ata_command_pending) {
@@ -4092,7 +4094,7 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd 
*scmd)
scmd->scsi_done(scmd);
return 0;
}
-   } while (set_satl_pending(scmd, true));
+   } while (_scsih_set_satl_pending(scmd, true));
 
sas_target_priv_data = sas_device_priv_data->sas_target;
 
@@ -4661,7 +4663,7 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 
msix_index, u32 reply)
if (scmd == NULL)
return 1;
 
-   set_satl_pending(scmd, false);
+   _scsih_set_satl_pending(scmd, false);
 
mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-16 Thread Martin K. Petersen
> "James" == James Bottomley  writes:

James> Subject: [PATCH] scsi: mpt3sas: fix hang on ata passthrough
James> commands

James> mpt3sas has a firmware failure where it can only handle one pass
James> through ATA command at a time.  If another comes in, contrary to
James> the SAT standard, it will hang until the first one completes
James> (causing long commands like secure erase to timeout).  The
James> original fix was to block the device when an ATA command came in,
James> but this caused a regression with

Broadcom folks: Please test and ack as soon as possible so we can get
this fix queued up.

Ingo: Since you appear to have hardware, it would be great if you could
test James' v3 (https://patchwork.kernel.org/patch/9519383/). Sorry for
the inconvenience.

-- 
Martin K. Petersen  Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-16 Thread James Bottomley
>From 91d249409546569444897a1ffde65c421e064899 Mon Sep 17 00:00:00 2001
From: James Bottomley 
Date: Sun, 1 Jan 2017 09:39:24 -0800
Subject: [PATCH] scsi: mpt3sas: fix hang on ata passthrough commands

mpt3sas has a firmware failure where it can only handle one pass
through ATA command at a time.  If another comes in, contrary to the
SAT standard, it will hang until the first one completes (causing long
commands like secure erase to timeout).  The original fix was to block
the device when an ATA command came in, but this caused a regression
with

commit 669f044170d8933c3d66d231b69ea97cb8447338
Author: Bart Van Assche 
Date:   Tue Nov 22 16:17:13 2016 -0800

scsi: srp_transport: Move queuecommand() wait code to SCSI core

So fix the original fix of the secure erase timeout by properly
returning SAM_STAT_BUSY like the SAT recommends.  The original patch
also had a concurrency problem since scsih_qcmd is lockless at that
point (this is fixed by using atomic bitops to set and test the flag).

Fixes: 18f6084a989ba1b (mpt3sas: Fix secure erase premature termination)
Signed-off-by: James Bottomley 

---

v2 - use bitops for lockless atomicity
v3 - update description, change function name
---
 drivers/scsi/mpt3sas/mpt3sas_base.h  | 12 +++
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 40 +++-
 2 files changed, 38 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h 
b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 394fe13..dcb33f4 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -393,6 +393,7 @@ struct MPT3SAS_TARGET {
  * @eedp_enable: eedp support enable bit
  * @eedp_type: 0(type_1), 1(type_2), 2(type_3)
  * @eedp_block_length: block size
+ * @ata_command_pending: SATL passthrough outstanding for device
  */
 struct MPT3SAS_DEVICE {
struct MPT3SAS_TARGET *sas_target;
@@ -404,6 +405,17 @@ struct MPT3SAS_DEVICE {
u8  ignore_delay_remove;
/* Iopriority Command Handling */
u8  ncq_prio_enable;
+   /*
+* Bug workaround for SATL handling: the mpt2/3sas firmware
+* doesn't return BUSY or TASK_SET_FULL for subsequent
+* commands while a SATL pass through is in operation as the
+* spec requires, it simply does nothing with them until the
+* pass through completes, causing them possibly to timeout if
+* the passthrough is a long executing command (like format or
+* secure erase).  This variable allows us to do the right
+* thing while a SATL command is pending.
+*/
+   unsigned long ata_command_pending;
 
 };
 
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index b5c966e..830e2c1 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3899,9 +3899,18 @@ _scsih_temp_threshold_events(struct MPT3SAS_ADAPTER *ioc,
}
 }
 
-static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
+static int _scsih_set_satl_pending(struct scsi_cmnd *scmd, bool pending)
 {
-   return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);
+   struct MPT3SAS_DEVICE *priv = scmd->device->hostdata;
+
+   if  (scmd->cmnd[0] != ATA_12 && scmd->cmnd[0] != ATA_16)
+   return 0;
+
+   if (pending)
+   return test_and_set_bit(0, >ata_command_pending);
+
+   clear_bit(0, >ata_command_pending);
+   return 0;
 }
 
 /**
@@ -3925,9 +3934,7 @@ _scsih_flush_running_cmds(struct MPT3SAS_ADAPTER *ioc)
if (!scmd)
continue;
count++;
-   if (ata_12_16_cmd(scmd))
-   scsi_internal_device_unblock(scmd->device,
-   SDEV_RUNNING);
+   _scsih_set_satl_pending(scmd, false);
mpt3sas_base_free_smid(ioc, smid);
scsi_dma_unmap(scmd);
if (ioc->pci_error_recovery)
@@ -4063,13 +4070,6 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd 
*scmd)
if (ioc->logging_level & MPT_DEBUG_SCSI)
scsi_print_command(scmd);
 
-   /*
-* Lock the device for any subsequent command until command is
-* done.
-*/
-   if (ata_12_16_cmd(scmd))
-   scsi_internal_device_block(scmd->device);
-
sas_device_priv_data = scmd->device->hostdata;
if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
scmd->result = DID_NO_CONNECT << 16;
@@ -4083,6 +4083,19 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd 
*scmd)
return 0;
}
 
+   /*
+* Bug work around for firmware SATL handling.  The loop
+* is based on atomic operations and ensures consistency
+* since we're lockless at this point
+ 

Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-16 Thread Bart Van Assche
On Sun, 2017-01-15 at 09:01 -0800, James Bottomley wrote:
> From b47c28434e9cee9cbb95a794c97ec53657408111 Mon Sep 17 00:00:00 2001
> From: James Bottomley 
> Date: Sun, 1 Jan 2017 09:39:24 -0800
> Subject: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands
> 
> mp3sas has a firmware failure where it can only handle one pass
> through ATA command at a time.  If another comes in, contrary to the
> SAT standard, it will hang until the first one completes (causing long
> commands like secure erase to timeout).  The original fix was to block
> the device when an ATA command came in, but this caused a regression
> with
> 
> commit 669f044170d8933c3d66d231b69ea97cb8447338
> Author: Bart Van Assche 
> Date:   Tue Nov 22 16:17:13 2016 -0800
> 
> scsi: srp_transport: Move queuecommand() wait code to SCSI core
> 
> So fix the original fix of the secure erase timeout by properly
> returning SAM_STAT_BUSY like the SAT recommends.
> 
> Fixes: 18f6084a989ba1b38702f9af37a2e4049a924be6
> Signed-off-by: James Bottomley 

Hello James,

This description looks incomplete to me. It doesn't mention the race
condition that was introduced by patch "scsi: mpt3sas: Fix secure
erase premature termination". Please also follow the guidelines from
process/submitting-patches.rst for the "Fixes:" tag. A quote from that
document: "please use the 'Fixes:' tag with the first 12 characters
of the SHA-1 ID, and the one line summary". Please also fix the
spelling of the adapter name "mp3sas".

Thanks,

Bart.

Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-15 Thread James Bottomley
On Tue, 2017-01-03 at 15:46 -0500, Jason Baron wrote:
> On 01/01/2017 12:39 PM, James Bottomley wrote:
> > +   /*
> > +* Bug work around for firmware SATL handling
> > +*/
> > +   if (sas_device_priv_data->ata_command_pending) {
> > +   scmd->result = SAM_STAT_BUSY;
> > +   scmd->scsi_done(scmd);
> > +   return 0;
> > +   }
> > +   set_satl_pending(scmd, true);
> > +
> > sas_target_priv_data = sas_device_priv_data->sas_target;
> > 
> > /* invalid device handle */
> 
> 
> I was also wondering if 2 threads could both fall through and do:
> 'set_satl_pending(scmd, true)'; ?
> 
> Afaict there is nothing preventing that race?

Yes, it does look like queuecommand is lockless in the mpt3sas.  How
about this version using bitops for atomicity?

James

---

>From b47c28434e9cee9cbb95a794c97ec53657408111 Mon Sep 17 00:00:00 2001
From: James Bottomley 
Date: Sun, 1 Jan 2017 09:39:24 -0800
Subject: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

mp3sas has a firmware failure where it can only handle one pass
through ATA command at a time.  If another comes in, contrary to the
SAT standard, it will hang until the first one completes (causing long
commands like secure erase to timeout).  The original fix was to block
the device when an ATA command came in, but this caused a regression
with

commit 669f044170d8933c3d66d231b69ea97cb8447338
Author: Bart Van Assche 
Date:   Tue Nov 22 16:17:13 2016 -0800

scsi: srp_transport: Move queuecommand() wait code to SCSI core

So fix the original fix of the secure erase timeout by properly
returning SAM_STAT_BUSY like the SAT recommends.

Fixes: 18f6084a989ba1b38702f9af37a2e4049a924be6
Signed-off-by: James Bottomley 

---

v2 - use bitops for lockless atomicity

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h 
b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 394fe13..dcb33f4 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -393,6 +393,7 @@ struct MPT3SAS_TARGET {
  * @eedp_enable: eedp support enable bit
  * @eedp_type: 0(type_1), 1(type_2), 2(type_3)
  * @eedp_block_length: block size
+ * @ata_command_pending: SATL passthrough outstanding for device
  */
 struct MPT3SAS_DEVICE {
struct MPT3SAS_TARGET *sas_target;
@@ -404,6 +405,17 @@ struct MPT3SAS_DEVICE {
u8  ignore_delay_remove;
/* Iopriority Command Handling */
u8  ncq_prio_enable;
+   /*
+* Bug workaround for SATL handling: the mpt2/3sas firmware
+* doesn't return BUSY or TASK_SET_FULL for subsequent
+* commands while a SATL pass through is in operation as the
+* spec requires, it simply does nothing with them until the
+* pass through completes, causing them possibly to timeout if
+* the passthrough is a long executing command (like format or
+* secure erase).  This variable allows us to do the right
+* thing while a SATL command is pending.
+*/
+   unsigned long ata_command_pending;
 
 };
 
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index b5c966e..6f9b4c0 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3899,9 +3899,18 @@ _scsih_temp_threshold_events(struct MPT3SAS_ADAPTER *ioc,
}
 }
 
-static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
+static int set_satl_pending(struct scsi_cmnd *scmd, bool pending)
 {
-   return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);
+   struct MPT3SAS_DEVICE *priv = scmd->device->hostdata;
+
+   if  (scmd->cmnd[0] != ATA_12 && scmd->cmnd[0] != ATA_16)
+   return 0;
+
+   if (pending)
+   return test_and_set_bit(0, >ata_command_pending);
+
+   clear_bit(0, >ata_command_pending);
+   return 0;
 }
 
 /**
@@ -3925,9 +3934,7 @@ _scsih_flush_running_cmds(struct MPT3SAS_ADAPTER *ioc)
if (!scmd)
continue;
count++;
-   if (ata_12_16_cmd(scmd))
-   scsi_internal_device_unblock(scmd->device,
-   SDEV_RUNNING);
+   set_satl_pending(scmd, false);
mpt3sas_base_free_smid(ioc, smid);
scsi_dma_unmap(scmd);
if (ioc->pci_error_recovery)
@@ -4063,13 +4070,6 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd 
*scmd)
if (ioc->logging_level & MPT_DEBUG_SCSI)
scsi_print_command(scmd);
 
-   /*
-* Lock the device for any subsequent command until command is
-* done.
-*/
-   if (ata_12_16_cmd(scmd))
-   scsi_internal_device_block(scmd->device);
-
sas_device_priv_data = scmd->device->hostdata;
if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {

Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-09 Thread Martin K. Petersen
> "Sreekanth" == Sreekanth Reddy  writes:

Sreekanth> We are fine with this patch. Can we rename function
Sreekanth> 'set_satl_pending()' name to '_scsih_set_satl_pending()' and
Sreekanth> can add headers to this function.

Sreekanth> other wise I am OK.

Sreekanth> Acked-by: Sreekanth Reddy 

James: Please tweak and post an updated patch.

-- 
Martin K. Petersen  Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-06 Thread Sreekanth Reddy
On Fri, Jan 6, 2017 at 7:29 AM, Martin K. Petersen
 wrote:
>> "James" == James Bottomley  writes:
>
> James> Now that I look at the reviews, each of the reviewers said what
> James> the correct thing to do was: return SAM_STAT_BUSY if SATL
> James> commands are outstanding like the spec says.  You all get
> James> negative brownie points for not insisting on a rework.
>
> James> Does this patch (compile tested only) fix the problems for
> James> everyone?
>
> I also like this approach better.
>
> Broadcom folks: Please comment and test.

Matin, We are fine with this patch. Can we rename function
'set_satl_pending()' name to '_scsih_set_satl_pending()' and can add
headers to this function.

other wise I am OK.

Acked-by: Sreekanth Reddy 

>
> --
> Martin K. Petersen  Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-05 Thread Martin K. Petersen
> "James" == James Bottomley  writes:

James> Now that I look at the reviews, each of the reviewers said what
James> the correct thing to do was: return SAM_STAT_BUSY if SATL
James> commands are outstanding like the spec says.  You all get
James> negative brownie points for not insisting on a rework.

James> Does this patch (compile tested only) fix the problems for
James> everyone?

I also like this approach better.

Broadcom folks: Please comment and test.

-- 
Martin K. Petersen  Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-03 Thread Jason Baron

On 01/01/2017 12:39 PM, James Bottomley wrote:

On Sun, 2017-01-01 at 11:33 -0500, David Miller wrote:

From: Bart Van Assche 
Date: Sun, 1 Jan 2017 14:22:11 +


My recommendation is to revert commit 18f6084a989b ("scsi: mpt3sas: Fix
secure erase premature termination"). Since the mpt3sas driver uses the
single-queue approach and since the SCSI core unlocks the block layer
request queue lock before the .queuecommand callback function is called,
multiple threads can execute that callback function (scsih_qcmd() in this
case) simultaneously. This means that using scsi_internal_device_block()
from inside .queuecommand to serialize SCSI command execution is wrong.


But this causes a regression for the thing being fixed by that
commit.


Right, we don't do that; that's why I didn't list it as one of the
options.


Why don't we figure out what that semantics that commit was trying to
achieve and implement that properly?


Now that I look at the reviews, each of the reviewers said what the
correct thing to do was: return SAM_STAT_BUSY if SATL commands are
outstanding like the spec says.  You all get negative brownie points
for not insisting on a rework.

Does this patch (compile tested only) fix the problems for everyone?



Hi,

Yes, with this patch applied my system boots :)

...


@@ -4083,6 +4077,16 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd 
*scmd)
return 0;
}

+   /*
+* Bug work around for firmware SATL handling
+*/
+   if (sas_device_priv_data->ata_command_pending) {
+   scmd->result = SAM_STAT_BUSY;
+   scmd->scsi_done(scmd);
+   return 0;
+   }
+   set_satl_pending(scmd, true);
+
sas_target_priv_data = sas_device_priv_data->sas_target;

/* invalid device handle */



I was also wondering if 2 threads could both fall through and do:
'set_satl_pending(scmd, true)'; ?

Afaict there is nothing preventing that race?

Thanks,

-Jason

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-01 Thread James Bottomley
On Sun, 2017-01-01 at 11:33 -0500, David Miller wrote:
> From: Bart Van Assche 
> Date: Sun, 1 Jan 2017 14:22:11 +
> 
> > My recommendation is to revert commit 18f6084a989b ("scsi: mpt3sas: Fix
> > secure erase premature termination"). Since the mpt3sas driver uses the
> > single-queue approach and since the SCSI core unlocks the block layer
> > request queue lock before the .queuecommand callback function is called,
> > multiple threads can execute that callback function (scsih_qcmd() in this
> > case) simultaneously. This means that using scsi_internal_device_block()
> > from inside .queuecommand to serialize SCSI command execution is wrong.
> 
> But this causes a regression for the thing being fixed by that
> commit.

Right, we don't do that; that's why I didn't list it as one of the
options.

> Why don't we figure out what that semantics that commit was trying to
> achieve and implement that properly?

Now that I look at the reviews, each of the reviewers said what the
correct thing to do was: return SAM_STAT_BUSY if SATL commands are
outstanding like the spec says.  You all get negative brownie points
for not insisting on a rework.

Does this patch (compile tested only) fix the problems for everyone?

James

---

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h 
b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 394fe13..0983a65 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -393,6 +393,7 @@ struct MPT3SAS_TARGET {
  * @eedp_enable: eedp support enable bit
  * @eedp_type: 0(type_1), 1(type_2), 2(type_3)
  * @eedp_block_length: block size
+ * @ata_command_pending: SATL passthrough outstanding for device
  */
 struct MPT3SAS_DEVICE {
struct MPT3SAS_TARGET *sas_target;
@@ -404,6 +405,17 @@ struct MPT3SAS_DEVICE {
u8  ignore_delay_remove;
/* Iopriority Command Handling */
u8  ncq_prio_enable;
+   /*
+* Bug workaround for SATL handling: the mpt2/3sas firmware
+* doesn't return BUSY or TASK_SET_FULL for subsequent
+* commands while a SATL pass through is in operation as the
+* spec requires, it simply does nothing with them until the
+* pass through completes, causing them possibly to timeout if
+* the passthrough is a long executing command (like format or
+* secure erase).  This variable allows us to do the right
+* thing while a SATL command is pending.
+*/
+   u8 ata_command_pending;
 
 };
 
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index b5c966e..1446a0a 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3899,9 +3899,12 @@ _scsih_temp_threshold_events(struct MPT3SAS_ADAPTER *ioc,
}
 }
 
-static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
+static void set_satl_pending(struct scsi_cmnd *scmd, bool pending)
 {
-   return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);
+   struct MPT3SAS_DEVICE *priv = scmd->device->hostdata;
+
+   if  (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16)
+   priv->ata_command_pending = pending;
 }
 
 /**
@@ -3925,9 +3928,7 @@ _scsih_flush_running_cmds(struct MPT3SAS_ADAPTER *ioc)
if (!scmd)
continue;
count++;
-   if (ata_12_16_cmd(scmd))
-   scsi_internal_device_unblock(scmd->device,
-   SDEV_RUNNING);
+   set_satl_pending(scmd, false);
mpt3sas_base_free_smid(ioc, smid);
scsi_dma_unmap(scmd);
if (ioc->pci_error_recovery)
@@ -4063,13 +4064,6 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd 
*scmd)
if (ioc->logging_level & MPT_DEBUG_SCSI)
scsi_print_command(scmd);
 
-   /*
-* Lock the device for any subsequent command until command is
-* done.
-*/
-   if (ata_12_16_cmd(scmd))
-   scsi_internal_device_block(scmd->device);
-
sas_device_priv_data = scmd->device->hostdata;
if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
scmd->result = DID_NO_CONNECT << 16;
@@ -4083,6 +4077,16 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd 
*scmd)
return 0;
}
 
+   /*
+* Bug work around for firmware SATL handling
+*/
+   if (sas_device_priv_data->ata_command_pending) {
+   scmd->result = SAM_STAT_BUSY;
+   scmd->scsi_done(scmd);
+   return 0;
+   }
+   set_satl_pending(scmd, true);
+
sas_target_priv_data = sas_device_priv_data->sas_target;
 
/* invalid device handle */
@@ -4650,8 +4654,7 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 
msix_index, u32 reply)
if (scmd == NULL)
return 1;
 
-   if 

Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-01 Thread David Miller
From: Bart Van Assche 
Date: Sun, 1 Jan 2017 14:22:11 +

> My recommendation is to revert commit 18f6084a989b ("scsi: mpt3sas: Fix
> secure erase premature termination"). Since the mpt3sas driver uses the
> single-queue approach and since the SCSI core unlocks the block layer
> request queue lock before the .queuecommand callback function is called,
> multiple threads can execute that callback function (scsih_qcmd() in this
> case) simultaneously. This means that using scsi_internal_device_block()
> from inside .queuecommand to serialize SCSI command execution is wrong.

But this causes a regression for the thing being fixed by that commit.

Why don't we figure out what that semantics that commit was trying to
achieve and implement that properly?
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-01 Thread Jason Baron


On 01/01/2017 09:22 AM, Bart Van Assche wrote:
> On Sat, 2016-12-31 at 15:19 -0800, James Bottomley wrote:
>> On Thu, 2016-12-29 at 00:02 -0800, Christoph Hellwig wrote:
>>> On Wed, Dec 28, 2016 at 11:30:24PM -0500, Jason Baron wrote:
 Add a new parameter to scsi_internal_device_block() to decide 
 whether or not to invoke scsi_wait_for_queuecommand().
>>> We'll also need to deal with the blk-mq wait path that Bart has been
>>> working on (I think it's already in the scsi tree, but I'd have to
>>> check).
>>>
>>> Also adding a bool flag for the last call in a function is style 
>>> that's a little annoying.
>>>
>>> I'd prefer to add a scsi_internal_device_block_nowait that contains
>>> all the code except for the waiting, and then make
>>> scsi_internal_device_block_nowait a wrapper around it.  Or drop the
>>> annoying internal for both while we're at it :)
>> OK, I know it's new year, but this is an unpatched regression in -rc1
>> that's causing serious issues.  I would like this fixed by -rc3 so we
>> have 3 options
>>
>>1. revert all the queuecommand wait stuff until it proves it's actually
>>   working without regressions
>>2. apply this patch and fix the style issues later
>>3. someone else supplies the correctly styled fix patch
>>
>> The conservative in me says that 1. is probably the most correct thing
>> to do because it gives us time to get the queuecommand wait stuff
>> right; that's what I'll probably do if there's no movement next week. 
>>  However, since we're reasonably early in the -rc cycle, so either 2 or
>> 3 are viable provided no further regressions caused by the queuecommand
>> wait stuff pop up.
> Hello James,
>
> My recommendation is to revert commit 18f6084a989b ("scsi: mpt3sas: Fix
> secure erase premature termination"). Since the mpt3sas driver uses the
> single-queue approach and since the SCSI core unlocks the block layer
> request queue lock before the .queuecommand callback function is called,
> multiple threads can execute that callback function (scsih_qcmd() in this
> case) simultaneously. This means that using scsi_internal_device_block()
> from inside .queuecommand to serialize SCSI command execution is wrong.
>
> Bart.

Hi,

Yes, commit 18f6084a989b looked racy to me as well. I noted that
in a follow up to my patch, and a suggested a way of fixing it:
http://lkml.iu.edu/hypermail/linux/kernel/1612.3/01301.html

Also worth noting is that commit 18f6084a989b was backported
to at least 4.8 stable.

Thanks,

-Jason

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2017-01-01 Thread Bart Van Assche
On Sat, 2016-12-31 at 15:19 -0800, James Bottomley wrote:
> On Thu, 2016-12-29 at 00:02 -0800, Christoph Hellwig wrote:
> > On Wed, Dec 28, 2016 at 11:30:24PM -0500, Jason Baron wrote:
> > > Add a new parameter to scsi_internal_device_block() to decide 
> > > whether or not to invoke scsi_wait_for_queuecommand().
> > 
> > We'll also need to deal with the blk-mq wait path that Bart has been
> > working on (I think it's already in the scsi tree, but I'd have to
> > check).
> > 
> > Also adding a bool flag for the last call in a function is style 
> > that's a little annoying.
> > 
> > I'd prefer to add a scsi_internal_device_block_nowait that contains
> > all the code except for the waiting, and then make
> > scsi_internal_device_block_nowait a wrapper around it.  Or drop the
> > annoying internal for both while we're at it :)
> 
> OK, I know it's new year, but this is an unpatched regression in -rc1
> that's causing serious issues.  I would like this fixed by -rc3 so we
> have 3 options
> 
>1. revert all the queuecommand wait stuff until it proves it's actually
>   working without regressions
>2. apply this patch and fix the style issues later
>3. someone else supplies the correctly styled fix patch
> 
> The conservative in me says that 1. is probably the most correct thing
> to do because it gives us time to get the queuecommand wait stuff
> right; that's what I'll probably do if there's no movement next week. 
>  However, since we're reasonably early in the -rc cycle, so either 2 or
> 3 are viable provided no further regressions caused by the queuecommand
> wait stuff pop up.

Hello James,

My recommendation is to revert commit 18f6084a989b ("scsi: mpt3sas: Fix
secure erase premature termination"). Since the mpt3sas driver uses the
single-queue approach and since the SCSI core unlocks the block layer
request queue lock before the .queuecommand callback function is called,
multiple threads can execute that callback function (scsih_qcmd() in this
case) simultaneously. This means that using scsi_internal_device_block()
from inside .queuecommand to serialize SCSI command execution is wrong.

Bart.

Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2016-12-31 Thread James Bottomley
On Thu, 2016-12-29 at 00:02 -0800, Christoph Hellwig wrote:
> On Wed, Dec 28, 2016 at 11:30:24PM -0500, Jason Baron wrote:
> > Add a new parameter to scsi_internal_device_block() to decide 
> > whether or not to invoke scsi_wait_for_queuecommand().
> 
> We'll also need to deal with the blk-mq wait path that Bart has been
> working on (I think it's already in the scsi tree, but I'd have to
> check).
> 
> Also adding a bool flag for the last call in a function is style 
> that's a little annoying.
> 
> I'd prefer to add a scsi_internal_device_block_nowait that contains
> all the code except for the waiting, and then make
> scsi_internal_device_block_nowait a wrapper around it.  Or drop the
> annoying internal for both while we're at it :)

OK, I know it's new year, but this is an unpatched regression in -rc1
that's causing serious issues.  I would like this fixed by -rc3 so we
have 3 options

   1. revert all the queuecommand wait stuff until it proves it's actually
  working without regressions
   2. apply this patch and fix the style issues later
   3. someone else supplies the correctly styled fix patch

The conservative in me says that 1. is probably the most correct thing
to do because it gives us time to get the queuecommand wait stuff
right; that's what I'll probably do if there's no movement next week. 
 However, since we're reasonably early in the -rc cycle, so either 2 or
3 are viable provided no further regressions caused by the queuecommand
wait stuff pop up.

James

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2016-12-29 Thread David Miller
From: Jason Baron 
Date: Wed, 28 Dec 2016 23:30:24 -0500

> On ata passthru commands scsih_qcmd() ends up spinning in
> scsi_wait_for_queuecommand() indefinitely. scsih_qcmd() is called from
> __blk_run_queue_uncond() which first increments request_fn_active to a
> non-zero value. Thus, scsi_wait_for_queuecommand() never completes because
> its spinning waiting for request_fn_active to become 0.
> 
> Two patches interact here. The first:
> 
> commit 18f6084a989b ("scsi: mpt3sas: Fix secure erase premature
> termination") calls scsi_internal_device_block() for ata passthru commands.
> 
> The second patch:
> 
> commit 669f044170d8 ("scsi: srp_transport: Move queuecommand() wait code
> to SCSI core") adds a call to scsi_wait_for_queuecommand() from
> scsi_internal_device_block().
> 
> Add a new parameter to scsi_internal_device_block() to decide whether
> or not to invoke scsi_wait_for_queuecommand().
> 
> Signed-off-by: Jason Baron 

Tested-by: David S. Miller 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2016-12-29 Thread Jason Baron
On 12/29/2016 03:02 AM, Christoph Hellwig wrote:

> On Wed, Dec 28, 2016 at 11:30:24PM -0500, Jason Baron wrote:
>> Add a new parameter to scsi_internal_device_block() to decide whether
>> or not to invoke scsi_wait_for_queuecommand().
> We'll also need to deal with the blk-mq wait path that Bart has been
> working on (I think it's already in the scsi tree, but I'd have to
> check).
Ok, I'm not sure either.

> Also adding a bool flag for the last call in a function is style that's
> a little annoying.
>
> I'd prefer to add a scsi_internal_device_block_nowait that contains
> all the code except for the waiting, and then make
> scsi_internal_device_block_nowait a wrapper around it.  Or drop the
> annoying internal for both while we're at it :)

The proposed patch brings the code in-line with what is in 4.8 stable
where scsi_internal_device_block() does not call
scsi_wait_for_queuecommand(). So I saw it as a minimal fix to make
my system boot again :)

I was wondering if the original fix is racy in that there could be multiple
threads in the queuecommand. Perhaps we should do something like:
   
if (ata_12_16_cmd(scmd))
{   


if (!test_and_set_bit(MPT_DEVICE_EXCLUSIVE,
_device_priv_data->flags))
{ 
   
scsi_internal_device_block(scmd->device);   
 

}
else
   

return
SCSI_MLQUEUE_HOST_BUSY; 
  

}

where scsi_internal_device_block() could be taught to wait for
request_fn_active becoming 1 instead of 0.

Thanks,

-Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands

2016-12-29 Thread Christoph Hellwig
On Wed, Dec 28, 2016 at 11:30:24PM -0500, Jason Baron wrote:
> Add a new parameter to scsi_internal_device_block() to decide whether
> or not to invoke scsi_wait_for_queuecommand().

We'll also need to deal with the blk-mq wait path that Bart has been
working on (I think it's already in the scsi tree, but I'd have to
check).

Also adding a bool flag for the last call in a function is style that's
a little annoying.

I'd prefer to add a scsi_internal_device_block_nowait that contains
all the code except for the waiting, and then make
scsi_internal_device_block_nowait a wrapper around it.  Or drop the
annoying internal for both while we're at it :)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html