Re: [PATCH 5/6] libata: Honor RQF_QUIET flag

2018-03-01 Thread Hannes Reinecke
On 03/01/2018 08:40 PM, Damien Le Moal wrote:
> Currently, libata ignores requests RQF_QUIET flag and print error
> messages for failed commands, regardless if this flag is set in the
> command request. Fix this by introducing the ata_eh_quiet() function and
> using this function in ata_eh_link_autopsy() to determine if the EH
> context should be quiet. This works by counting the number of failed
> commands and the number of commands with the quiet flag set. If both
> numbers are equal, the the EH context can be set to quiet and all error
> messages suppressed. Otherwise, only the error messages for the failed
> commands are suppressed and the link Emask and irq_stat messages printed.
> 
> Signed-off-by: Damien Le Moal 
> ---
>  drivers/ata/libata-eh.c   | 26 +-
>  drivers/ata/libata-scsi.c |  3 +++
>  2 files changed, 28 insertions(+), 1 deletion(-)
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH 6/6] libata: Be quiet when asked to

2018-03-01 Thread Hannes Reinecke
On 03/01/2018 08:40 PM, Damien Le Moal wrote:
> For a successful setting of the device transfer speed mode in
> ata_dev_set_mode(), do not print the message
> "ataX.XX: configured for xxx" if the EH context has the quiet flag set.
> 
> Signed-off-by: Damien Le Moal 
> ---
>  drivers/ata/libata-core.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index 3c09122bf038..258afc2e8efd 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -3573,9 +3573,10 @@ static int ata_dev_set_mode(struct ata_device *dev)
>   DPRINTK("xfer_shift=%u, xfer_mode=0x%x\n",
>   dev->xfer_shift, (int)dev->xfer_mode);
>  
> - ata_dev_info(dev, "configured for %s%s\n",
> -  ata_mode_string(ata_xfer_mode2mask(dev->xfer_mode)),
> -  dev_err_whine);
> + if (!(ehc->i.flags & ATA_EHI_QUIET))
> + ata_dev_info(dev, "configured for %s%s\n",
> + ata_mode_string(ata_xfer_mode2mask(dev->xfer_mode)),
> + dev_err_whine);
>  
>   return 0;
>  
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH 4/6] libata: Fix ata_err_string()

2018-03-01 Thread Hannes Reinecke
On 03/01/2018 08:40 PM, Damien Le Moal wrote:
> Add proper error string output for ATA_ERR_NCQ and ATA_ERR_NODEV_HINT
> instead of returning "unknown error".
> 
> Signed-off-by: Damien Le Moal 
> ---
>  drivers/ata/libata-eh.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> index d6264235c3d7..006ea1507dcf 100644
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -1482,6 +1482,10 @@ static const char *ata_err_string(unsigned int 
> err_mask)
>   return "invalid argument";
>   if (err_mask & AC_ERR_DEV)
>   return "device error";
> + if (err_mask & AC_ERR_NCQ)
> + return "NCQ error";
> + if (err_mask & AC_ERR_NODEV_HINT)
> + return "Polling detection error";
>   return "unknown error";
>  }
>  
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH 3/6] libata: Fix comment typo in ata_eh_analyze_tf()

2018-03-01 Thread Hannes Reinecke
On 03/01/2018 08:40 PM, Damien Le Moal wrote:
> Signed-off-by: Damien Le Moal 
> ---
>  drivers/ata/libata-eh.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> index 504272b18e75..d6264235c3d7 100644
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -1865,10 +1865,10 @@ static unsigned int ata_eh_analyze_tf(struct 
> ata_queued_cmd *qc,
>   if (qc->flags & ATA_QCFLAG_SENSE_VALID) {
>   int ret = scsi_check_sense(qc->scsicmd);
>   /*
> -  * SUCCESS here means that the sense code could
> +  * SUCCESS here means that the sense code could be
>* evaluated and should be passed to the upper layers
>* for correct evaluation.
> -  * FAILED means the sense code could not interpreted
> +  * FAILED means the sense code could not be interpreted
>* and the device would need to be reset.
>* NEEDS_RETRY and ADD_TO_MLQUEUE means that the
>* command would need to be retried.
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH 1/6] scsi: Introduce scsi_zbc_noretry_cmd()

2018-03-01 Thread Hannes Reinecke
On 03/01/2018 08:40 PM, Damien Le Moal wrote:
> For ZBC/ZAC devices, retrying a command with a condition known to lead
> to a failure is useless. One example is an unaligned write past the
> write pointer of a sequential zone. Retrying the same command will
> result in an error again.
> 
> Currently, these iknown error condition cases are handled in sd_zbc.c
> using the sd_zbc_complete() function which is called from sd_done() when
> a command completes. However, these known error conditions are not
> handled in libata, nor is scsi_noretry_cmd() considering them.
> 
> Fix this by introducing the function scsi_zbc_noretry_cmd() and use this
> function in scsi_noretry_cmd(). This allows simplifying
> sd_zbc_complete() which now only has to deal with report zones command
> reply.
> 
> scsi_zbc_noretry_cmd() is also exported so that it can be used from
> libata.
> 
> Signed-off-by: Damien Le Moal 
> ---
>  drivers/scsi/scsi_error.c | 66 
> +++
>  drivers/scsi/sd.c |  2 +-
>  drivers/scsi/sd.h |  8 +++---
>  drivers/scsi/sd_zbc.c | 47 -
>  include/scsi/scsi_eh.h|  1 +
>  5 files changed, 77 insertions(+), 47 deletions(-)
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH 2/6] libata: Use scsi_zbc_noretry_cmd() for ZAC devices

2018-03-01 Thread Hannes Reinecke
On 03/01/2018 08:40 PM, Damien Le Moal wrote:
> Improve decisions regarding command retry worthiness by calling
> the funtion scsi_zbc_noretry_cmd() in ata_eh_worth_retry() if the
> command target is a ZAC device.
> 
> Signed-off-by: Damien Le Moal 
> ---
>  drivers/ata/libata-eh.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> index 11c3137d7b0a..504272b18e75 100644
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -2139,6 +2139,10 @@ static unsigned int ata_eh_speed_down(struct 
> ata_device *dev,
>   */
>  static inline int ata_eh_worth_retry(struct ata_queued_cmd *qc)
>  {
> + if (qc->dev->flags & ATA_DFLAG_ZAC &&
> + qc->flags & ATA_QCFLAG_SENSE_VALID &&
> + scsi_zbc_noretry_cmd(qc->scsicmd))
> + return 0;   /* retrying will fail again */
>   if (qc->err_mask & AC_ERR_MEDIA)
>   return 0;   /* don't retry media errors */
>   if (qc->flags & ATA_QCFLAG_IO)
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH] scsi_lib: increase {host|target|device}_busy count after dispatch cmd

2018-03-01 Thread Bart Van Assche
On Fri, 2018-03-02 at 12:56 +0800, Ganesh Mahendran wrote:
> 2018-03-02 7:11 GMT+08:00 Bart Van Assche :
> > On Mon, 2017-06-05 at 17:37 +0800, Ganesh Mahendran wrote:
> > > In android system, when there are lots of threads running. Thread A
> > > holding *host_busy* count is easily to be preempted, and if at the
> > > same time, thread B set *host_blocked*, then all other threads will
> > > be io blocked.
> > 
> > Have you considered to insert preempt_disable() and preempt_enable() calls
> > where necessary to achieve the same effect? I think that would result in a
> > much less intrusive patch.
> 
> Yes, preempt_disable()preempt_enable will also achieve the same effect.
> But I just think preempt_disable()preempt_enable may be a little heavy for
> this problem which can be fixed by increaseing {host|target|device}_busy count
> after dispatch cmd.

Hello Ganesh,

If the {host,target,device}_busy counts would be increased after dispatch,
could that result in scsi_device_unbusy() being called from the completion
path before these counts have been increased?

Additionally, have you noticed that your patch does not apply anymore to
recent kernels since some of the counters are now increased from inside
scsi_mq_get_budget(), a function that is called from inside the block layer?

Thanks,

Bart.




Re: [PATCH] scsi_lib: increase {host|target|device}_busy count after dispatch cmd

2018-03-01 Thread Ganesh Mahendran
Hi, Bart:

2018-03-02 7:11 GMT+08:00 Bart Van Assche :
> On Mon, 2017-06-05 at 17:37 +0800, Ganesh Mahendran wrote:
>> In android system, when there are lots of threads running. Thread A
>> holding *host_busy* count is easily to be preempted, and if at the
>> same time, thread B set *host_blocked*, then all other threads will
>> be io blocked.
>
> Hello Ganesh,
>
> Have you considered to insert preempt_disable() and preempt_enable() calls
> where necessary to achieve the same effect? I think that would result in a
> much less intrusive patch.

Yes, preempt_disable()preempt_enable will also achieve the same effect.
But I just think preempt_disable()preempt_enable may be a little heavy for
this problem which can be fixed by increaseing {host|target|device}_busy count
after dispatch cmd.

Thanks.

>
> Thanks,
>
> Bart.
>
>


[PATCH V3] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-03-01 Thread Jianchao Wang
In scsi core, __scsi_queue_insert should just put request back on
the queue and retry using the same command as before. However, for
blk-mq, scsi_mq_requeue_cmd is employed here which will unprepare
the request. To align with the semantics of __scsi_queue_insert,
use blk_mq_requeue_request with kick_requeue_list == true and put
the reference of scsi_device.

Cc: Christoph Hellwig 
Signed-off-by: Jianchao Wang 
Reviewed-by: Bart Van Assche 

---
Changelog:
V2 -> V3:
 - add commit to explain why we need a put_device in
   __scsi_queue_insert
 - add reviewed-by

V1 -> V2:
 - add put_device on scsi_device->sdev_gendev

 drivers/scsi/scsi_lib.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index a86df9c..d2f1838 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -191,7 +191,13 @@ static void __scsi_queue_insert(struct scsi_cmnd *cmd, int 
reason, bool unbusy)
 */
cmd->result = 0;
if (q->mq_ops) {
-   scsi_mq_requeue_cmd(cmd);
+   /*
+* scsi_device.sdev_gendev will be get every time in 
.get_budget and
+* be put in scsi_end_request. Hence we need to put the 
reference
+* here when we decide to requeue request.
+*/
+   blk_mq_requeue_request(cmd->request, true);
+   put_device(>sdev_gendev);
return;
}
spin_lock_irqsave(q->queue_lock, flags);
-- 
2.7.4



Re: [PATCH] scsi:Documentation: add sd-parameters.txt

2018-03-01 Thread Martin K. Petersen

Weiping,

> this document include SCSI disk and SCSI device parameters, add
> cache_type firstly, other parameters will be added later.

Applied to 4.17/scsi-queue with a few tweaks. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH V2] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-03-01 Thread jianchao.wang
Hi martin

On 03/02/2018 09:44 AM, Martin K. Petersen wrote:
>> In scsi core, __scsi_queue_insert should just put request back on the
>> queue and retry using the same command as before. However, for blk-mq,
>> scsi_mq_requeue_cmd is employed here which will unprepare the
>> request. To align with the semantics of __scsi_queue_insert, use
>> blk_mq_requeue_request with kick_requeue_list == true and put the
>> reference of scsi_device.
>>
>> V1 -> V2:
>>  - add put_device on scsi_device->sdev_gendev
> Also, please put changelog after the --- delimiter.
> 

Yes, I will modify this next version.

Thanks 
Jianchao


Re: [PATCH V2] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-03-01 Thread jianchao.wang
Hi martin

Thanks for your kindly response.

On 03/02/2018 09:43 AM, Martin K. Petersen wrote:
> 
> Jianchao,
> 
>> Yes, the block layer core guarantees that scsi_mq_get_budget() will be
>> called before scsi_queue_rq(). I think the full picture is as follows:
> 
>> * Before scsi_queue_rq() calls .queuecommand(), get_device() is called for 
>> the
>>   SCSI device and the device, target and host busy counters are incremented.
>> * If the SCSI core decides to requeue a command, scsi_queue_insert() causes
>>   __scsi_queue_insert() to call scsi_device_unbusy(). That last function
>>   decreases the device, target and host busy counters but not 
>> put_device(sdev).
>>   Hence the need for a separate put_device() call after requeuing.
>>
>> It's unfortunate that the SCSI core became so asymmetric. Anyway,
>> since I am now convinced that this patch is correct, feel free to add:
> 
> Please add something akin to Bart's explanation as a comment and repost.

yes, sure.


Thanks
Jianchao


Re: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue

2018-03-01 Thread Ming Lei
On Thu, Mar 01, 2018 at 04:19:34PM -0500, Laurence Oberman wrote:
> On Thu, 2018-03-01 at 14:01 -0500, Laurence Oberman wrote:
> > On Thu, 2018-03-01 at 16:18 +, Don Brace wrote:
> > > > -Original Message-
> > > > From: Ming Lei [mailto:ming@redhat.com]
> > > > Sent: Tuesday, February 27, 2018 4:08 AM
> > > > To: Jens Axboe ; linux-bl...@vger.kernel.org;
> > > > Christoph
> > > > Hellwig ; Mike Snitzer 
> > > > Cc: linux-scsi@vger.kernel.org; Hannes Reinecke ;
> > > > Arun Easi
> > > > ; Omar Sandoval ; Martin K
> > > > .
> > > > Petersen ; James Bottomley
> > > > ; Christoph Hellwig  > > > st
> > > > .de>;
> > > > Don Brace ; Kashyap Desai
> > > > ; Peter Rivera  > > > .c
> > > > om>;
> > > > Laurence Oberman ; Ming Lei
> > > > ; Meelis Roos 
> > > > Subject: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue
> > > > 
> > > > EXTERNAL EMAIL
> > > > 
> > > > 
> > > > From 84676c1f21 (genirq/affinity: assign vectors to all possible
> > > > CPUs),
> > > > one msix vector can be created without any online CPU mapped,
> > > > then
> > > > one
> > > > command's completion may not be notified.
> > > > 
> > > > This patch setups mapping between cpu and reply queue according
> > > > to
> > > > irq
> > > > affinity info retrived by pci_irq_get_affinity(), and uses this
> > > > mapping
> > > > table to choose reply queue for queuing one command.
> > > > 
> > > > Then the chosen reply queue has to be active, and fixes IO hang
> > > > caused
> > > > by using inactive reply queue which doesn't have any online CPU
> > > > mapped.
> > > > 
> > > > Cc: Hannes Reinecke 
> > > > Cc: Arun Easi 
> > > > Cc: "Martin K. Petersen" ,
> > > > Cc: James Bottomley ,
> > > > Cc: Christoph Hellwig ,
> > > > Cc: Don Brace 
> > > > Cc: Kashyap Desai 
> > > > Cc: Peter Rivera 
> > > > Cc: Laurence Oberman 
> > > > Cc: Meelis Roos 
> > > > Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all
> > > > possible CPUs")
> > > > Signed-off-by: Ming Lei 
> > > 
> > > I am getting some issues that need to be tracked down:
> > > 
> > > [ 1636.032984] hpsa :87:00.0: Acknowledging event: 0xc032
> > > (HP
> > > SSD Smart Path configuration change)
> > > [ 1638.510656] hpsa :87:00.0: scsi 3:0:8:0: updated Direct-
> > > Access HP   MO0400JDVEU  PHYS DRV SSDSmartPathCap- En-
> > > Exp=0
> > > [ 1653.967695] hpsa :87:00.0: Acknowledging event: 0x8020
> > > (HP
> > > SSD Smart Path configuration change)
> > > [ 1656.770377] hpsa :87:00.0: scsi 3:0:8:0: updated Direct-
> > > Access HP   MO0400JDVEU  PHYS DRV SSDSmartPathCap- En-
> > > Exp=0
> > > [ 2839.762267] hpsa :87:00.0: Acknowledging event: 0x8020
> > > (HP
> > > SSD Smart Path configuration change)
> > > [ 2840.841290] hpsa :87:00.0: scsi 3:0:8:0: updated Direct-
> > > Access HP   MO0400JDVEU  PHYS DRV SSDSmartPathCap- En-
> > > Exp=0
> > > [ 2917.582653] hpsa :87:00.0: Acknowledging event: 0xc020
> > > (HP
> > > SSD Smart Path configuration change)
> > > [ 2919.087191] hpsa :87:00.0: scsi 3:1:0:1: updated Direct-
> > > Access HP   LOGICAL VOLUME   RAID-5 SSDSmartPathCap+ En+
> > > Exp=1
> > > [ 2919.142527] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> > > [3:1:0:2] A phys disk component of LV is missing, turning off
> > > offload_enabled for LV.
> > > [ 2919.203915] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> > > [3:1:0:2] A phys disk component of LV is missing, turning off
> > > offload_enabled for LV.
> > > [ 2919.266921] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> > > [3:1:0:2] A phys disk component of LV is missing, turning off
> > > offload_enabled for LV.
> > > [ 2934.999629] hpsa :87:00.0: Acknowledging event: 0x4000
> > > (HP
> > > SSD Smart Path state change)
> > > [ 2936.937333] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> > > [3:1:0:2] A phys disk component of LV is missing, turning off
> > > offload_enabled for LV.
> > > [ 2936.998707] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> > > [3:1:0:2] A phys disk component of LV is missing, turning off
> > > offload_enabled for LV.
> > > [ 2937.060101] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> > > [3:1:0:2] A phys disk component of LV is missing, turning off
> > > offload_enabled for LV.
> > > [ 3619.711122] sd 3:1:0:3: [sde] tag#436 FAILED Result:
> > > hostbyte=DID_OK driverbyte=DRIVER_SENSE
> > > [ 3619.751150] sd 3:1:0:3: 

Re: [PATCH V2] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-03-01 Thread jianchao.wang
Hi Bart

Thanks for your precious time and detailed summary.

On 03/02/2018 01:43 AM, Bart Van Assche wrote:
> Yes, the block layer core guarantees that scsi_mq_get_budget() will be called
> before scsi_queue_rq(). I think the full picture is as follows:
> * Before scsi_queue_rq() calls .queuecommand(), get_device() is called for the
>   SCSI device and the device, target and host busy counters are incremented.

Supply some details here:
scsi_mq_get_budget before calling .queuecommand get_device and increase 
device_busy.
scsi_queue_rq increases target_busy and host_busy.

> * If the SCSI core decides to requeue a command, scsi_queue_insert() causes
>   __scsi_queue_insert() to call scsi_device_unbusy(). That last function
>   decreases the device, target and host busy counters but not 
> put_device(sdev).
>   Hence the need for a separate put_device() call after requeuing.
> 
> It's unfortunate that the SCSI core became so asymmetric. Anyway, since I am
> now convinced that this patch is correct, feel free to add:
> 
> Reviewed-by: Bart Van Assche 


Sincerely
Jianchao


Re: [patch RESEND] atp870u: 64 bit bug in atp885_init()

2018-03-01 Thread Martin K. Petersen

Dan,

> On 64 bit CPUs there is a memory corruption bug on probe().  It should
> be a u32 pointer instead of an unsigned long pointer or we write past
> the end of the setupdata[] array.

Applied to 4.17/scsi-queue. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] mvsas: fix wrong endianness of sgpio api

2018-03-01 Thread Martin K. Petersen

Wilfried,

> This patch fixes the byte order of the SGPIO api and brings it back in
> sync with ledmon v0.80 and above.

The patch was missing your Signed-off-by: tag. I added it and applied
patch to 4.17/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] sd_zbc: Fix sd_zbc_get_seq_zones() kernel-doc header

2018-03-01 Thread Martin K. Petersen

Bart,

> Avoid that the kernel-doc tool complains about a mismatch between
> the kernel-doc header and the function argument list.

Applied to 4.17/scsi-queue. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] ufs: Fix kernel-doc errors and warnings

2018-03-01 Thread Martin K. Petersen

Bart,

> Avoid that the kernel-doc tool complains about mismatches between
> kernel-doc headers and function definitions. Avoid that errors like
> the following are reported when building the UFS driver with W=1:
>
> drivers/scsi/ufs/tc-dwc-g210-pci.c:60: error: Cannot parse struct or union!
> drivers/scsi/ufs/tc-dwc-g210-pltfrm.c:26: warning: cannot understand function 
> prototype: 'struct ufs_hba_variant_ops tc_dwc_g210_20bit_pltfm_hba_vops = '

Applied to 4.17/scsi-queue. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] sd_zbc: Fix potential memory leak

2018-03-01 Thread Martin K. Petersen

Damien,

> Rework sd_zbc_check_zone_size() to avoid a memory leak due to an early
> return if sd_zbc_report_zones() fails.

Applied to 4.16/scsi-fixes. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: sd: Keep disk read-only when re-reading partition

2018-03-01 Thread Martin K. Petersen

Jeremy,

Thanks for looking into this!

> This commit reads the disk's old state and combines it with the device
> disk-reported state rather than unconditionally marking it as RW.

Applied to 4.16/scsi-fixes with a few minor tweaks. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: mpt3sas: clarify mmio pointer types

2018-03-01 Thread Martin K. Petersen

Broadcom folks: Please review!

> The newly added code mixes up phys_addr_t/resource_size_t with dma_addr_t
> and void pointers, as seen from these compiler warning:
>
> drivers/scsi/mpt3sas/mpt3sas_base.c: In function '_base_get_chain_phys':
> drivers/scsi/mpt3sas/mpt3sas_base.c:235:21: error: cast to pointer from 
> integer of different size [-Werror=int-to-pointer-cast]
>   base_chain_phys  = (void *)ioc->chip_phys + MPI_FRAME_START_OFFSET +
>  ^
> drivers/scsi/mpt3sas/mpt3sas_base.c: In function '_clone_sg_entries':
> drivers/scsi/mpt3sas/mpt3sas_base.c:427:20: error: cast from pointer to 
> integer of different size [-Werror=pointer-to-int-cast]
> sgel->Address = (dma_addr_t)dst_addr_phys;
> ^
> drivers/scsi/mpt3sas/mpt3sas_base.c:438:7: error: cast from pointer to 
> integer of different size [-Werror=pointer-to-int-cast]
>(dma_addr_t)buff_ptr_phys;
>^
> drivers/scsi/mpt3sas/mpt3sas_base.c:444:10: error: cast from pointer to 
> integer of different size [-Werror=pointer-to-int-cast]
>   (dma_addr_t)buff_ptr_phys;
>
> Both dma_addr_t and phys_addr_t may be wider than a pointer, so we must
> avoid the conversion to pointer types. This also helps readability.
>
> A second problem is treating MMIO addresses from a 'struct resource'
> as addresses that can be used for DMA on that device. In almost all
> cases, those are the same, but on some of the more obscure architectures,
> PCI memory address 0 is mapped into the CPU address space at a nonzero
> offset. I don't have a good fix for that, so I'm adding a comment here,
> plus a WARN_ON() that triggers whenever the phys_addr_t number is
> outside of the low 32-bit address space and causes a straight overflow
> when assigned to the 32-bit sgel->Address.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH V2] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-03-01 Thread Martin K. Petersen

Jianchao,

> Yes, the block layer core guarantees that scsi_mq_get_budget() will be
> called before scsi_queue_rq(). I think the full picture is as follows:

> * Before scsi_queue_rq() calls .queuecommand(), get_device() is called for the
>   SCSI device and the device, target and host busy counters are incremented.
> * If the SCSI core decides to requeue a command, scsi_queue_insert() causes
>   __scsi_queue_insert() to call scsi_device_unbusy(). That last function
>   decreases the device, target and host busy counters but not 
> put_device(sdev).
>   Hence the need for a separate put_device() call after requeuing.
>
> It's unfortunate that the SCSI core became so asymmetric. Anyway,
> since I am now convinced that this patch is correct, feel free to add:

Please add something akin to Bart's explanation as a comment and repost.

Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH V2] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-03-01 Thread Martin K. Petersen

Jianchao,

> In scsi core, __scsi_queue_insert should just put request back on the
> queue and retry using the same command as before. However, for blk-mq,
> scsi_mq_requeue_cmd is employed here which will unprepare the
> request. To align with the semantics of __scsi_queue_insert, use
> blk_mq_requeue_request with kick_requeue_list == true and put the
> reference of scsi_device.
>
> V1 -> V2:
>  - add put_device on scsi_device->sdev_gendev

Also, please put changelog after the --- delimiter.

> Cc: Christoph Hellwig 
> Signed-off-by: Jianchao Wang 
> ---
>  drivers/scsi/scsi_lib.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index a86df9c..6fa7b0c 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -191,7 +191,8 @@ static void __scsi_queue_insert(struct scsi_cmnd *cmd, 
> int reason, bool unbusy)
>*/
>   cmd->result = 0;
>   if (q->mq_ops) {
> - scsi_mq_requeue_cmd(cmd);
> + blk_mq_requeue_request(cmd->request, true);
> + put_device(>sdev_gendev);
>   return;
>   }
>   spin_lock_irqsave(q->queue_lock, flags);

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] mpt3sas: Do not mark fw_event workqueue as WQ_MEM_RECLAIM

2018-03-01 Thread Martin K. Petersen

Hannes,

> The firmware event workqueue should not be marked as WQ_MEM_RECLAIM as
> it's doesn't need to make forward progress under memory pressure.  In
> the current state it will result in a deadlock if the device had been
> forcefully removed.

Applied to 4.16/scsi-fixes. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] qla2xxx: Fix NULL pointer crash due to active timer for ABTS

2018-03-01 Thread Martin K. Petersen

James,

>> Cc:  #4.4+

Ugh.

> Martin, can you fix this up with a rebase and I'll resync my tree?

Done and pushed.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue

2018-03-01 Thread Ming Lei
Hi Don,

Thanks for your test!

On Thu, Mar 01, 2018 at 04:18:17PM +, Don Brace wrote:
> > -Original Message-
> > From: Ming Lei [mailto:ming@redhat.com]
> > Sent: Tuesday, February 27, 2018 4:08 AM
> > To: Jens Axboe ; linux-bl...@vger.kernel.org; Christoph
> > Hellwig ; Mike Snitzer 
> > Cc: linux-scsi@vger.kernel.org; Hannes Reinecke ; Arun Easi
> > ; Omar Sandoval ; Martin K .
> > Petersen ; James Bottomley
> > ; Christoph Hellwig ;
> > Don Brace ; Kashyap Desai
> > ; Peter Rivera ;
> > Laurence Oberman ; Ming Lei
> > ; Meelis Roos 
> > Subject: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue
> > 
> > EXTERNAL EMAIL
> > 
> > 
> > From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs),
> > one msix vector can be created without any online CPU mapped, then one
> > command's completion may not be notified.
> > 
> > This patch setups mapping between cpu and reply queue according to irq
> > affinity info retrived by pci_irq_get_affinity(), and uses this mapping
> > table to choose reply queue for queuing one command.
> > 
> > Then the chosen reply queue has to be active, and fixes IO hang caused
> > by using inactive reply queue which doesn't have any online CPU mapped.
> > 
> > Cc: Hannes Reinecke 
> > Cc: Arun Easi 
> > Cc: "Martin K. Petersen" ,
> > Cc: James Bottomley ,
> > Cc: Christoph Hellwig ,
> > Cc: Don Brace 
> > Cc: Kashyap Desai 
> > Cc: Peter Rivera 
> > Cc: Laurence Oberman 
> > Cc: Meelis Roos 
> > Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs")
> > Signed-off-by: Ming Lei 
> 
> I am getting some issues that need to be tracked down:

I check the patch one more time, not find odd thing, and the only one
is that inside hpsa_do_reset(), wait_for_device_to_become_ready() is
called to send 'test unit ready' always by the reply queue 0. Do you know
if something bad may happen if other non-zero reply queue is used?

Could you share us how you reproduce this issue?

Looks you can boot successfully, so could you please provide the
following output?

1) what is your server type? We may find one in our lab, so that I can
try to reproduce it.

2) lscpu

3) irq affinity info, and you need to pass the 1st column of
'lspci' of your hpsa PCI device to this script:

#!/bin/sh
if [ $# -ge 1 ]; then
PCID=$1
else
PCID=`lspci | grep "Non-Volatile memory" | cut -c1-7`
fi
PCIP=`find /sys/devices -name *$PCID | grep pci`
IRQS=`ls $PCIP/msi_irqs`

echo "kernel version: "
uname -a

echo "PCI name is $PCID, dump its irq affinity:"
for IRQ in $IRQS; do
CPUS=`cat /proc/irq/$IRQ/smp_affinity_list`
echo "\tirq $IRQ, cpu list $CPUS"
done


Thanks,
Ming


Re: [bug report] Don't enter SCSI error handler on kernel 4.16-rc1

2018-03-01 Thread Bart Van Assche
On Wed, 2018-02-28 at 14:17 +0800, chenxiang (M) wrote:
> It seems the patch is for block mq, but the issue i encount is under 
> block legacy as CONFIG_SCSI_MQ_DEFAULT is not enabled.

Since the call traces refer to the ATA code I hope that an ATA expert will
have the time to help you further.

Bart.

Re: [PATCH] scsi_lib: increase {host|target|device}_busy count after dispatch cmd

2018-03-01 Thread Bart Van Assche
On Mon, 2017-06-05 at 17:37 +0800, Ganesh Mahendran wrote:
> In android system, when there are lots of threads running. Thread A
> holding *host_busy* count is easily to be preempted, and if at the
> same time, thread B set *host_blocked*, then all other threads will
> be io blocked.

Hello Ganesh,

Have you considered to insert preempt_disable() and preempt_enable() calls
where necessary to achieve the same effect? I think that would result in a
much less intrusive patch.

Thanks,

Bart.




[PATCH] sd_zbc: Fix sd_zbc_get_seq_zones() kernel-doc header

2018-03-01 Thread Bart Van Assche
Avoid that the kernel-doc tool complains about a mismatch between
the kernel-doc header and the function argument list.

Signed-off-by: Bart Van Assche 
Cc: Damien Le Moal 
Cc: Hannes Reinecke 
---
 drivers/scsi/sd_zbc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 6c348a211ebb..8f3669fd490d 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -520,7 +520,8 @@ static inline unsigned long 
*sd_zbc_alloc_zone_bitmap(struct scsi_disk *sdkp)
  * sd_zbc_get_seq_zones - Parse report zones reply to identify sequential zones
  * @sdkp: disk used
  * @buf: report reply buffer
- * @seq_zone_bitamp: bitmap of sequential zones to set
+ * @buflen: length of @buf
+ * @seq_zones_bitmap: bitmap of sequential zones to set
  *
  * Parse reported zone descriptors in @buf to identify sequential zones and
  * set the reported zone bit in @seq_zones_bitmap accordingly.
-- 
2.16.2



[PATCH] ufs: Fix kernel-doc errors and warnings

2018-03-01 Thread Bart Van Assche
Avoid that the kernel-doc tool complains about mismatches between
kernel-doc headers and function definitions. Avoid that errors like
the following are reported when building the UFS driver with W=1:

drivers/scsi/ufs/tc-dwc-g210-pci.c:60: error: Cannot parse struct or union!
drivers/scsi/ufs/tc-dwc-g210-pltfrm.c:26: warning: cannot understand function 
prototype: 'struct ufs_hba_variant_ops tc_dwc_g210_20bit_pltfm_hba_vops = '

Signed-off-by: Bart Van Assche 
Cc: Vivek Gautam 
Cc: Stanislav Nijnikov 
---
 drivers/scsi/ufs/tc-dwc-g210-pci.c|  4 +-
 drivers/scsi/ufs/tc-dwc-g210-pltfrm.c |  2 +-
 drivers/scsi/ufs/ufshcd-pci.c |  7 ++-
 drivers/scsi/ufs/ufshcd.c | 94 +--
 4 files changed, 52 insertions(+), 55 deletions(-)

diff --git a/drivers/scsi/ufs/tc-dwc-g210-pci.c 
b/drivers/scsi/ufs/tc-dwc-g210-pci.c
index 325d5e14fc0d..2f41722a8c28 100644
--- a/drivers/scsi/ufs/tc-dwc-g210-pci.c
+++ b/drivers/scsi/ufs/tc-dwc-g210-pci.c
@@ -51,7 +51,7 @@ static int tc_dwc_g210_pci_runtime_idle(struct device *dev)
return ufshcd_runtime_idle(dev_get_drvdata(dev));
 }
 
-/**
+/*
  * struct ufs_hba_dwc_vops - UFS DWC specific variant operations
  */
 static struct ufs_hba_variant_ops tc_dwc_g210_pci_hba_vops = {
@@ -71,7 +71,7 @@ static void tc_dwc_g210_pci_shutdown(struct pci_dev *pdev)
 /**
  * tc_dwc_g210_pci_remove - de-allocate PCI/SCSI host and host memory space
  * data structure memory
- * @pdev - pointer to PCI handle
+ * @pdev: pointer to PCI handle
  */
 static void tc_dwc_g210_pci_remove(struct pci_dev *pdev)
 {
diff --git a/drivers/scsi/ufs/tc-dwc-g210-pltfrm.c 
b/drivers/scsi/ufs/tc-dwc-g210-pltfrm.c
index 2d3f5270f875..6dfe5a9206e9 100644
--- a/drivers/scsi/ufs/tc-dwc-g210-pltfrm.c
+++ b/drivers/scsi/ufs/tc-dwc-g210-pltfrm.c
@@ -20,7 +20,7 @@
 #include "ufshcd-dwc.h"
 #include "tc-dwc-g210.h"
 
-/**
+/*
  * UFS DWC specific variant operations
  */
 static struct ufs_hba_variant_ops tc_dwc_g210_20bit_pltfm_hba_vops = {
diff --git a/drivers/scsi/ufs/ufshcd-pci.c b/drivers/scsi/ufs/ufshcd-pci.c
index 925b0ec7ec54..ffe6f82182ba 100644
--- a/drivers/scsi/ufs/ufshcd-pci.c
+++ b/drivers/scsi/ufs/ufshcd-pci.c
@@ -75,8 +75,7 @@ static struct ufs_hba_variant_ops ufs_intel_cnl_hba_vops = {
 #ifdef CONFIG_PM_SLEEP
 /**
  * ufshcd_pci_suspend - suspend power management function
- * @pdev: pointer to PCI device handle
- * @state: power state
+ * @dev: pointer to PCI device handle
  *
  * Returns 0 if successful
  * Returns non-zero otherwise
@@ -88,7 +87,7 @@ static int ufshcd_pci_suspend(struct device *dev)
 
 /**
  * ufshcd_pci_resume - resume power management function
- * @pdev: pointer to PCI device handle
+ * @dev: pointer to PCI device handle
  *
  * Returns 0 if successful
  * Returns non-zero otherwise
@@ -126,7 +125,7 @@ static void ufshcd_pci_shutdown(struct pci_dev *pdev)
 /**
  * ufshcd_pci_remove - de-allocate PCI/SCSI host and host memory space
  * data structure memory
- * @pdev - pointer to PCI handle
+ * @pdev: pointer to PCI handle
  */
 static void ufshcd_pci_remove(struct pci_dev *pdev)
 {
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index a64f560efee7..747a56be32b7 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -524,7 +524,7 @@ int ufshcd_wait_for_register(struct ufs_hba *hba, u32 reg, 
u32 mask,
 
 /**
  * ufshcd_get_intr_mask - Get the interrupt bit mask
- * @hba - Pointer to adapter instance
+ * @hba: Pointer to adapter instance
  *
  * Returns interrupt bit mask per version
  */
@@ -551,7 +551,7 @@ static inline u32 ufshcd_get_intr_mask(struct ufs_hba *hba)
 
 /**
  * ufshcd_get_ufs_version - Get the UFS version supported by the HBA
- * @hba - Pointer to adapter instance
+ * @hba: Pointer to adapter instance
  *
  * Returns UFSHCI version supported by the controller
  */
@@ -578,7 +578,7 @@ static inline bool ufshcd_is_device_present(struct ufs_hba 
*hba)
 
 /**
  * ufshcd_get_tr_ocs - Get the UTRD Overall Command Status
- * @lrb: pointer to local command reference block
+ * @lrbp: pointer to local command reference block
  *
  * This function is used to get the OCS field from UTRD
  * Returns the OCS field in the UTRD
@@ -1738,7 +1738,7 @@ void ufshcd_send_command(struct ufs_hba *hba, unsigned 
int task_tag)
 
 /**
  * ufshcd_copy_sense_data - Copy sense data in case of check condition
- * @lrb - pointer to local reference block
+ * @lrbp: pointer to local reference block
  */
 static inline void ufshcd_copy_sense_data(struct ufshcd_lrb *lrbp)
 {
@@ -1760,7 +1760,7 @@ static inline void ufshcd_copy_sense_data(struct 
ufshcd_lrb *lrbp)
  * ufshcd_copy_query_response() - Copy the Query Response and the data
  * descriptor
  * @hba: per adapter instance
- * @lrb - pointer to local reference block
+ * @lrbp: pointer to local reference block
  */
 static
 int 

Re: [PATCH] sd_zbc: Fix potential memory leak

2018-03-01 Thread Bart Van Assche

On 03/01/18 14:19, Damien Le Moal wrote:

Rework sd_zbc_check_zone_size() to avoid a memory leak due to an early
return if sd_zbc_report_zones() fails.


Reviewed-by: Bart Van Assche 





[PATCH] sd_zbc: Fix potential memory leak

2018-03-01 Thread Damien Le Moal
Rework sd_zbc_check_zone_size() to avoid a memory leak due to an early
return if sd_zbc_report_zones() fails.

Reported-by: David.butterfield 
Signed-off-by: Damien Le Moal 
Cc: sta...@vger.kernel.org
---
 drivers/scsi/sd_zbc.c | 35 +++
 1 file changed, 15 insertions(+), 20 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 14174f26af98..ec55a255d1a2 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -366,7 +366,7 @@ static int sd_zbc_check_capacity(struct scsi_disk *sdkp, 
unsigned char *buf)
  */
 static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 {
-   u64 zone_blocks;
+   u64 zone_blocks = 0;
sector_t block = 0;
unsigned char *buf;
unsigned char *rec;
@@ -384,10 +384,8 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 
/* Do a report zone to get the same field */
ret = sd_zbc_report_zones(sdkp, buf, SD_ZBC_BUF_SIZE, 0);
-   if (ret) {
-   zone_blocks = 0;
-   goto out;
-   }
+   if (ret)
+   goto out_free;
 
same = buf[4] & 0x0f;
if (same > 0) {
@@ -427,7 +425,7 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
ret = sd_zbc_report_zones(sdkp, buf,
  SD_ZBC_BUF_SIZE, block);
if (ret)
-   return ret;
+   goto out_free;
}
 
} while (block < sdkp->capacity);
@@ -435,35 +433,32 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
zone_blocks = sdkp->zone_blocks;
 
 out:
-   kfree(buf);
-
if (!zone_blocks) {
if (sdkp->first_scan)
sd_printk(KERN_NOTICE, sdkp,
  "Devices with non constant zone "
  "size are not supported\n");
-   return -ENODEV;
-   }
-
-   if (!is_power_of_2(zone_blocks)) {
+   ret = -ENODEV;
+   } else if (!is_power_of_2(zone_blocks)) {
if (sdkp->first_scan)
sd_printk(KERN_NOTICE, sdkp,
  "Devices with non power of 2 zone "
  "size are not supported\n");
-   return -ENODEV;
-   }
-
-   if (logical_to_sectors(sdkp->device, zone_blocks) > UINT_MAX) {
+   ret = -ENODEV;
+   } else if (logical_to_sectors(sdkp->device, zone_blocks) > UINT_MAX) {
if (sdkp->first_scan)
sd_printk(KERN_NOTICE, sdkp,
  "Zone size too large\n");
-   return -ENODEV;
+   ret = -ENODEV;
+   } else {
+   sdkp->zone_blocks = zone_blocks;
+   sdkp->zone_shift = ilog2(zone_blocks);
}
 
-   sdkp->zone_blocks = zone_blocks;
-   sdkp->zone_shift = ilog2(zone_blocks);
+out_free:
+   kfree(buf);
 
-   return 0;
+   return ret;
 }
 
 /**
-- 
2.14.3



Re: [PATCH] scsi: ata: don't reset three times if device is offline for SAS host

2018-03-01 Thread Tejun Heo
Hello,

On Wed, Feb 28, 2018 at 03:18:39PM +0800, chenxiang (M) wrote:
> If we can introduce a port flags such as ATA_LFLAG_DISABLED or
> ATA_EHI_NO_RECOVERY before ata_eh_recover,
> it will skip recovery. But we only get device status NODEV from
> ata_do_reset which is after ata_eh_recover, i don't
> think it is used to introduce a port flags at that time.
> From function ata_eh_reset, it seems there are two situations that
> end the recovery logic in current code:
> 1. Retry 3 times;
> 2. Reset function return value -ERESTART, but this return value
> seems be specal situation for ATA;

So, we have -ERESTART for restarting, -EPIPE for speeding down.  We
can just add another special return code - say, -ENOENT - to indicate
that there's nothing and it shouldn't keep trying.

Thanks.

-- 
tejun


Re: [PATCH v2] ata: do not schedule hot plug if it is a sas host

2018-03-01 Thread Tejun Heo
On Wed, Feb 28, 2018 at 09:11:10AM +0800, Jason Yan wrote:
> We've got a kernel panic when using sata disk with sas controller:
> 
> [115946.152283] Unable to handle kernel NULL pointer dereference at virtual 
> address 07d8
> [115946.223963] CPU: 0 PID: 22175 Comm: kworker/0:1 Tainted: G   W OEL  
> 4.14.0 #1
> [115946.232925] Workqueue: events ata_scsi_hotplug
> [115946.237938] task: 8021ee50b180 task.stack: 0d5d
> [115946.244717] PC is at sas_find_dev_by_rphy+0x44/0x114
> [115946.250224] LR is at sas_find_dev_by_rphy+0x3c/0x114
> ..
> [115946.355701] Process kworker/0:1 (pid: 22175, stack limit = 
> 0x0d5d)
> [115946.363369] Call trace:
> [115946.456356] [] sas_find_dev_by_rphy+0x44/0x114
> [115946.462908] [] sas_target_alloc+0x20/0x5c
> [115946.469408] [] scsi_alloc_target+0x250/0x308
> [115946.475781] [] __scsi_add_device+0xb0/0x154
> [115946.481991] [] ata_scsi_scan_host+0x180/0x218
> [115946.488367] [] ata_scsi_hotplug+0xb0/0xcc
> [115946.494801] [] process_one_work+0x144/0x390
> [115946.501115] [] worker_thread+0x144/0x418
> [115946.507093] [] kthread+0x10c/0x138
> [115946.512792] [] ret_from_fork+0x10/0x18
> 
> We found that Ding Xiang has reported a similar bug before:
> https://patchwork.kernel.org/patch/9179817/
> 
> And this bug still exists in mainline. Since libsas handles hotplug and
> device adding/removing itself, do not need to schedule ata hot plug task
> here if it is a sas host.
> 
> Signed-off-by: Jason Yan 
> CC: Ding Xiang 

Applied to libata/for-4.16-fixes w/ stable cc'd.

Thanks.

-- 
tejun


Re: [PATCH V3 0/8] blk-mq & scsi: fix reply queue selection and improve host wide tagset

2018-03-01 Thread Laurence Oberman
On Tue, 2018-02-27 at 18:07 +0800, Ming Lei wrote:
> Hi All,
> 
> The 1st two patches fixes reply queue selection, and this issue has
> been
> reported and can cause IO hang during booting, please consider the
> two
> for V4.16.
> 
> The following 6 patches try to improve hostwide tagset on hpsa and
> megaraid_sas by making hw queue per NUMA node.
> 
> I don't have high-performance hpsa and megaraid_sas device at hand.
> 
> Don Brace, could you test this patchset on concurrent IOs over you
> hpsa
> SSD and see if this approach is well?
> 
> Kashyap, could you test this patchset on your megaraid_sas SSDs?
> 
>   gitweb: https://github.com/ming1/linux/tree/v4.16-rc-host-tags-
> v3.2
> 
> thanks,
> Ming
> 
> Hannes Reinecke (1):
>   scsi: Add template flag 'host_tagset'
> 
> Ming Lei (7):
>   scsi: hpsa: fix selection of reply queue
>   scsi: megaraid_sas: fix selection of reply queue
>   blk-mq: introduce 'start_tag' field to 'struct blk_mq_tags'
>   blk-mq: introduce BLK_MQ_F_HOST_TAGS
>   block: null_blk: introduce module parameter of 'g_host_tags'
>   scsi: hpsa: improve scsi_mq performance via .host_tagset
>   scsi: megaraid: improve scsi_mq performance via .host_tagset
> 
>  block/blk-mq-debugfs.c  |  2 +
>  block/blk-mq-sched.c|  2 +-
>  block/blk-mq-tag.c  | 13 +++--
>  block/blk-mq-tag.h  | 11 ++--
>  block/blk-mq.c  | 50 +++---
>  block/blk-mq.h  |  3 +-
>  drivers/block/null_blk.c|  6 +++
>  drivers/scsi/hpsa.c | 79
> ++---
>  drivers/scsi/hpsa.h |  1 +
>  drivers/scsi/megaraid/megaraid_sas.h|  2 +-
>  drivers/scsi/megaraid/megaraid_sas_base.c   | 40 ++-
>  drivers/scsi/megaraid/megaraid_sas_fusion.c | 12 ++---
>  drivers/scsi/scsi_lib.c |  2 +
>  include/linux/blk-mq.h  |  2 +
>  include/scsi/scsi_host.h|  3 ++
>  15 files changed, 182 insertions(+), 46 deletions(-)
> 

For the patchset above
All functional I/O tests and boot tests passed with multiple concurrent
fio runs.

Original HPSA booting issue is also resolved and its important or we
will have to revert original genirq commit
commit 84676c1f21e8ff54befe985f4f14dc1edc10046b

Tested-by: Laurence Oberman 

Thanks
Laurence


Re: [PATCH 1/8] dt-bindings: scsi: hisi_sas: add an property of signal attenuation

2018-03-01 Thread Rob Herring
On Tue, Feb 20, 2018 at 03:13:24AM +0800, John Garry wrote:
> From: Xiaofei Tan 
> 
> For some new boards with hip07 chipset we are required to
> set PHY config registers differently. The hw property which
> determines how to set these registers is in the PHY signal
> attenuation readings.
> 
> This patch add an devicetree property, signal-attenuation, which
> is used to describe the signal attenuation of an board.
> 
> Cc: Rob Herring 
> Cc: Mark Rutland 
> Signed-off-by: Xiaofei Tan 
> Signed-off-by: John Garry 
> ---
>  Documentation/devicetree/bindings/scsi/hisilicon-sas.txt | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/scsi/hisilicon-sas.txt 
> b/Documentation/devicetree/bindings/scsi/hisilicon-sas.txt
> index df3bef7..bd32ecd 100644
> --- a/Documentation/devicetree/bindings/scsi/hisilicon-sas.txt
> +++ b/Documentation/devicetree/bindings/scsi/hisilicon-sas.txt
> @@ -53,6 +53,13 @@ Main node required properties:
>  Optional main node properties:
>   - hip06-sas-v2-quirk-amt : when set, indicates that the v2 controller has 
> the
>   "am-max-transmissions" limitation.
> + - signal-attenuation : array of 3 32-bit values, containing de-emphasis,

Needs a vendor prefix.

> + preshoot, and boost attenuation readings for the board. They
> + are used to describe the signal attenuation of the board. These
> + values' range is 7600 to 12400, and used to represent -24dB to
> + 24dB.
> + The formula is "y = (x-1)/1". For example, 10478
> + means 4.78dB.
>  
>  Example:
>   sas0: sas@c100 {
> -- 
> 1.9.1
> 


Re: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue

2018-03-01 Thread Laurence Oberman
On Thu, 2018-03-01 at 14:01 -0500, Laurence Oberman wrote:
> On Thu, 2018-03-01 at 16:18 +, Don Brace wrote:
> > > -Original Message-
> > > From: Ming Lei [mailto:ming@redhat.com]
> > > Sent: Tuesday, February 27, 2018 4:08 AM
> > > To: Jens Axboe ; linux-bl...@vger.kernel.org;
> > > Christoph
> > > Hellwig ; Mike Snitzer 
> > > Cc: linux-scsi@vger.kernel.org; Hannes Reinecke ;
> > > Arun Easi
> > > ; Omar Sandoval ; Martin K
> > > .
> > > Petersen ; James Bottomley
> > > ; Christoph Hellwig  > > st
> > > .de>;
> > > Don Brace ; Kashyap Desai
> > > ; Peter Rivera  > > .c
> > > om>;
> > > Laurence Oberman ; Ming Lei
> > > ; Meelis Roos 
> > > Subject: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue
> > > 
> > > EXTERNAL EMAIL
> > > 
> > > 
> > > From 84676c1f21 (genirq/affinity: assign vectors to all possible
> > > CPUs),
> > > one msix vector can be created without any online CPU mapped,
> > > then
> > > one
> > > command's completion may not be notified.
> > > 
> > > This patch setups mapping between cpu and reply queue according
> > > to
> > > irq
> > > affinity info retrived by pci_irq_get_affinity(), and uses this
> > > mapping
> > > table to choose reply queue for queuing one command.
> > > 
> > > Then the chosen reply queue has to be active, and fixes IO hang
> > > caused
> > > by using inactive reply queue which doesn't have any online CPU
> > > mapped.
> > > 
> > > Cc: Hannes Reinecke 
> > > Cc: Arun Easi 
> > > Cc: "Martin K. Petersen" ,
> > > Cc: James Bottomley ,
> > > Cc: Christoph Hellwig ,
> > > Cc: Don Brace 
> > > Cc: Kashyap Desai 
> > > Cc: Peter Rivera 
> > > Cc: Laurence Oberman 
> > > Cc: Meelis Roos 
> > > Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all
> > > possible CPUs")
> > > Signed-off-by: Ming Lei 
> > 
> > I am getting some issues that need to be tracked down:
> > 
> > [ 1636.032984] hpsa :87:00.0: Acknowledging event: 0xc032
> > (HP
> > SSD Smart Path configuration change)
> > [ 1638.510656] hpsa :87:00.0: scsi 3:0:8:0: updated Direct-
> > Access HP   MO0400JDVEU  PHYS DRV SSDSmartPathCap- En-
> > Exp=0
> > [ 1653.967695] hpsa :87:00.0: Acknowledging event: 0x8020
> > (HP
> > SSD Smart Path configuration change)
> > [ 1656.770377] hpsa :87:00.0: scsi 3:0:8:0: updated Direct-
> > Access HP   MO0400JDVEU  PHYS DRV SSDSmartPathCap- En-
> > Exp=0
> > [ 2839.762267] hpsa :87:00.0: Acknowledging event: 0x8020
> > (HP
> > SSD Smart Path configuration change)
> > [ 2840.841290] hpsa :87:00.0: scsi 3:0:8:0: updated Direct-
> > Access HP   MO0400JDVEU  PHYS DRV SSDSmartPathCap- En-
> > Exp=0
> > [ 2917.582653] hpsa :87:00.0: Acknowledging event: 0xc020
> > (HP
> > SSD Smart Path configuration change)
> > [ 2919.087191] hpsa :87:00.0: scsi 3:1:0:1: updated Direct-
> > Access HP   LOGICAL VOLUME   RAID-5 SSDSmartPathCap+ En+
> > Exp=1
> > [ 2919.142527] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> > [3:1:0:2] A phys disk component of LV is missing, turning off
> > offload_enabled for LV.
> > [ 2919.203915] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> > [3:1:0:2] A phys disk component of LV is missing, turning off
> > offload_enabled for LV.
> > [ 2919.266921] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> > [3:1:0:2] A phys disk component of LV is missing, turning off
> > offload_enabled for LV.
> > [ 2934.999629] hpsa :87:00.0: Acknowledging event: 0x4000
> > (HP
> > SSD Smart Path state change)
> > [ 2936.937333] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> > [3:1:0:2] A phys disk component of LV is missing, turning off
> > offload_enabled for LV.
> > [ 2936.998707] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> > [3:1:0:2] A phys disk component of LV is missing, turning off
> > offload_enabled for LV.
> > [ 2937.060101] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> > [3:1:0:2] A phys disk component of LV is missing, turning off
> > offload_enabled for LV.
> > [ 3619.711122] sd 3:1:0:3: [sde] tag#436 FAILED Result:
> > hostbyte=DID_OK driverbyte=DRIVER_SENSE
> > [ 3619.751150] sd 3:1:0:3: [sde] tag#436 Sense Key : Aborted
> > Command
> > [current] 
> > [ 3619.784375] sd 3:1:0:3: [sde] tag#436 Add. Sense: Internal
> > target
> > failure
> > [ 3619.816530] sd 3:1:0:3: [sde] tag#436 CDB: Read(10) 28 00 01 1b
> > ad
> > af 00 00 01 00
> > [ 3619.852295] print_req_error: I/O 

[PATCH 2/6] libata: Use scsi_zbc_noretry_cmd() for ZAC devices

2018-03-01 Thread Damien Le Moal
Improve decisions regarding command retry worthiness by calling
the funtion scsi_zbc_noretry_cmd() in ata_eh_worth_retry() if the
command target is a ZAC device.

Signed-off-by: Damien Le Moal 
---
 drivers/ata/libata-eh.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 11c3137d7b0a..504272b18e75 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -2139,6 +2139,10 @@ static unsigned int ata_eh_speed_down(struct ata_device 
*dev,
  */
 static inline int ata_eh_worth_retry(struct ata_queued_cmd *qc)
 {
+   if (qc->dev->flags & ATA_DFLAG_ZAC &&
+   qc->flags & ATA_QCFLAG_SENSE_VALID &&
+   scsi_zbc_noretry_cmd(qc->scsicmd))
+   return 0;   /* retrying will fail again */
if (qc->err_mask & AC_ERR_MEDIA)
return 0;   /* don't retry media errors */
if (qc->flags & ATA_QCFLAG_IO)
-- 
2.14.3



[PATCH 1/6] scsi: Introduce scsi_zbc_noretry_cmd()

2018-03-01 Thread Damien Le Moal
For ZBC/ZAC devices, retrying a command with a condition known to lead
to a failure is useless. One example is an unaligned write past the
write pointer of a sequential zone. Retrying the same command will
result in an error again.

Currently, these iknown error condition cases are handled in sd_zbc.c
using the sd_zbc_complete() function which is called from sd_done() when
a command completes. However, these known error conditions are not
handled in libata, nor is scsi_noretry_cmd() considering them.

Fix this by introducing the function scsi_zbc_noretry_cmd() and use this
function in scsi_noretry_cmd(). This allows simplifying
sd_zbc_complete() which now only has to deal with report zones command
reply.

scsi_zbc_noretry_cmd() is also exported so that it can be used from
libata.

Signed-off-by: Damien Le Moal 
---
 drivers/scsi/scsi_error.c | 66 +++
 drivers/scsi/sd.c |  2 +-
 drivers/scsi/sd.h |  8 +++---
 drivers/scsi/sd_zbc.c | 47 -
 include/scsi/scsi_eh.h|  1 +
 5 files changed, 77 insertions(+), 47 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index ca53a5f785ee..abb33d250176 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1671,6 +1671,66 @@ static void scsi_eh_offline_sdevs(struct list_head 
*work_q,
return;
 }
 
+/**
+ * scsi_zbc_noretry_cmd - Determine if ZBC device command can be retried
+ * @scmd:   Failed cmd to check
+ *
+ * Test the error condition of a failed ZBC device command to determine cases
+ * that are known to be not worth retrying.
+ * If the specified command is not intended for a ZBC device, do nothing.
+ */
+bool scsi_zbc_noretry_cmd(struct scsi_cmnd *scmd)
+{
+   struct request *rq = scmd->request;
+   struct scsi_sense_hdr sshdr;
+
+   /*
+* The request queue zone model may not be set when this is called
+* during device probe/revalidation. In that case, just fall back to
+* default behavior and let the caller decide what to do with failures.
+*/
+   if (!blk_queue_is_zoned(rq->q))
+   return false;
+
+   if (!scsi_command_normalize_sense(scmd, ))
+   /* no valid sense data, don't know, so maybe retry */
+   return false;
+
+   if (sshdr.sense_key != ILLEGAL_REQUEST)
+   return false;
+
+   switch (req_op(rq)) {
+   case REQ_OP_ZONE_RESET:
+
+   if (sshdr.asc == 0x24) {
+   /*
+* INVALID FIELD IN CDB error: reset of a conventional
+* zone was attempted. Nothing to worry about, so be
+* quiet about the error.
+*/
+   if (!blk_rq_is_passthrough(rq))
+   rq->rq_flags |= RQF_QUIET;
+   return true;
+   }
+   return false;
+
+   case REQ_OP_WRITE:
+   case REQ_OP_WRITE_ZEROES:
+   case REQ_OP_WRITE_SAME:
+
+   /*
+* INVALID ADDRESS FOR WRITE error: It is unlikely that
+* retrying write requests failed with any kind of
+* alignement error will result in success. So don't.
+*/
+   return sshdr.asc == 0x21;
+
+   default:
+   return false;
+   }
+}
+EXPORT_SYMBOL_GPL(scsi_zbc_noretry_cmd);
+
 /**
  * scsi_noretry_cmd - determine if command should be failed fast
  * @scmd:  SCSI cmd to examine.
@@ -1699,6 +1759,12 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)
return 0;
 
 check_type:
+   /*
+* For ZBC, do not retry conditions that will only fail again.
+*/
+   if (scmd->device->type == TYPE_ZBC &&
+   scsi_zbc_noretry_cmd(scmd))
+   return 1;
/*
 * assume caller has checked sense and determined
 * the check condition was retryable.
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index bff21e636ddd..93c6baa7d677 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2041,7 +2041,7 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 
  out:
if (sd_is_zoned(sdkp))
-   sd_zbc_complete(SCpnt, good_bytes, );
+   sd_zbc_complete(SCpnt, good_bytes);
 
SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, SCpnt,
   "sd_done: completed %d of %d 
bytes\n",
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 0d663b5e45bb..b777ffecf386 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -284,8 +284,7 @@ extern void sd_zbc_remove(struct scsi_disk *sdkp);
 extern void sd_zbc_print_zones(struct scsi_disk *sdkp);
 extern int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd);
 extern int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd);
-extern void sd_zbc_complete(struct scsi_cmnd *cmd, unsigned 

[PATCH 4/6] libata: Fix ata_err_string()

2018-03-01 Thread Damien Le Moal
Add proper error string output for ATA_ERR_NCQ and ATA_ERR_NODEV_HINT
instead of returning "unknown error".

Signed-off-by: Damien Le Moal 
---
 drivers/ata/libata-eh.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index d6264235c3d7..006ea1507dcf 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -1482,6 +1482,10 @@ static const char *ata_err_string(unsigned int err_mask)
return "invalid argument";
if (err_mask & AC_ERR_DEV)
return "device error";
+   if (err_mask & AC_ERR_NCQ)
+   return "NCQ error";
+   if (err_mask & AC_ERR_NODEV_HINT)
+   return "Polling detection error";
return "unknown error";
 }
 
-- 
2.14.3



[PATCH 3/6] libata: Fix comment typo in ata_eh_analyze_tf()

2018-03-01 Thread Damien Le Moal
Signed-off-by: Damien Le Moal 
---
 drivers/ata/libata-eh.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 504272b18e75..d6264235c3d7 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -1865,10 +1865,10 @@ static unsigned int ata_eh_analyze_tf(struct 
ata_queued_cmd *qc,
if (qc->flags & ATA_QCFLAG_SENSE_VALID) {
int ret = scsi_check_sense(qc->scsicmd);
/*
-* SUCCESS here means that the sense code could
+* SUCCESS here means that the sense code could be
 * evaluated and should be passed to the upper layers
 * for correct evaluation.
-* FAILED means the sense code could not interpreted
+* FAILED means the sense code could not be interpreted
 * and the device would need to be reset.
 * NEEDS_RETRY and ADD_TO_MLQUEUE means that the
 * command would need to be retried.
-- 
2.14.3



[PATCH 6/6] libata: Be quiet when asked to

2018-03-01 Thread Damien Le Moal
For a successful setting of the device transfer speed mode in
ata_dev_set_mode(), do not print the message
"ataX.XX: configured for xxx" if the EH context has the quiet flag set.

Signed-off-by: Damien Le Moal 
---
 drivers/ata/libata-core.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 3c09122bf038..258afc2e8efd 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -3573,9 +3573,10 @@ static int ata_dev_set_mode(struct ata_device *dev)
DPRINTK("xfer_shift=%u, xfer_mode=0x%x\n",
dev->xfer_shift, (int)dev->xfer_mode);
 
-   ata_dev_info(dev, "configured for %s%s\n",
-ata_mode_string(ata_xfer_mode2mask(dev->xfer_mode)),
-dev_err_whine);
+   if (!(ehc->i.flags & ATA_EHI_QUIET))
+   ata_dev_info(dev, "configured for %s%s\n",
+   ata_mode_string(ata_xfer_mode2mask(dev->xfer_mode)),
+   dev_err_whine);
 
return 0;
 
-- 
2.14.3



[PATCH 5/6] libata: Honor RQF_QUIET flag

2018-03-01 Thread Damien Le Moal
Currently, libata ignores requests RQF_QUIET flag and print error
messages for failed commands, regardless if this flag is set in the
command request. Fix this by introducing the ata_eh_quiet() function and
using this function in ata_eh_link_autopsy() to determine if the EH
context should be quiet. This works by counting the number of failed
commands and the number of commands with the quiet flag set. If both
numbers are equal, the the EH context can be set to quiet and all error
messages suppressed. Otherwise, only the error messages for the failed
commands are suppressed and the link Emask and irq_stat messages printed.

Signed-off-by: Damien Le Moal 
---
 drivers/ata/libata-eh.c   | 26 +-
 drivers/ata/libata-scsi.c |  3 +++
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 006ea1507dcf..c9f0c8660a7b 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -2156,6 +2156,21 @@ static inline int ata_eh_worth_retry(struct 
ata_queued_cmd *qc)
return qc->err_mask != AC_ERR_DEV;  /* retry if not dev error */
 }
 
+/**
+ *  ata_eh_quiet - check if we need to be quiet about a command error
+ *  @qc: qc to check
+ *
+ *  Look at the qc flags anbd its scsi command request flags to determine
+ *  if we need to be quiet about the command failure.
+ */
+static inline bool ata_eh_quiet(struct ata_queued_cmd *qc)
+{
+   if (qc->scsicmd &&
+   qc->scsicmd->request->rq_flags & RQF_QUIET)
+   qc->flags |= ATA_QCFLAG_QUIET;
+   return qc->flags & ATA_QCFLAG_QUIET;
+}
+
 /**
  * ata_eh_link_autopsy - analyze error and determine recovery action
  * @link: host link to perform autopsy on
@@ -2173,7 +2188,7 @@ static void ata_eh_link_autopsy(struct ata_link *link)
struct ata_eh_context *ehc = >eh_context;
struct ata_device *dev;
unsigned int all_err_mask = 0, eflags = 0;
-   int tag;
+   int tag, nr_failed = 0, nr_quiet = 0;
u32 serror;
int rc;
 
@@ -2239,8 +2254,17 @@ static void ata_eh_link_autopsy(struct ata_link *link)
if (qc->flags & ATA_QCFLAG_IO)
eflags |= ATA_EFLAG_IS_IO;
trace_ata_eh_link_autopsy_qc(qc);
+
+   /* Count quiet errors */
+   if (ata_eh_quiet(qc))
+   nr_quiet++;
+   nr_failed++;
}
 
+   /* If all failed commands requested silence, then be quiet */
+   if (nr_quiet == nr_failed)
+   ehc->i.flags |= ATA_EHI_QUIET;
+
/* enforce default EH actions */
if (ap->pflags & ATA_PFLAG_FROZEN ||
all_err_mask & (AC_ERR_HSM | AC_ERR_TIMEOUT))
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 66be961c93a4..84b6fc8906a2 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -872,6 +872,9 @@ static struct ata_queued_cmd *ata_scsi_qc_new(struct 
ata_device *dev,
 
qc->sg = scsi_sglist(cmd);
qc->n_elem = scsi_sg_count(cmd);
+
+   if (cmd->request->rq_flags & RQF_QUIET)
+   qc->flags |= ATA_QCFLAG_QUIET;
} else {
cmd->result = (DID_OK << 16) | (QUEUE_FULL << 1);
cmd->scsi_done(cmd);
-- 
2.14.3



[PATCH] Improve ZBC/ZAC error handling

2018-03-01 Thread Damien Le Moal
This series introduces changes to scsi and libata error handling for ZBC and ZAC
devices.

The first patch moves ZBC specific error handling in sd_zbc_complete() to a
generic scsi error function that can be used also in libata (second patch). The
goal of this change is to limit retries for commands that are identify as not
worth retrying (know failure condition) for both scsi/ZBC and libata/ZAC.
Without these two patches, only ZBC behaves nicely (commands that are known to
fail are retried in the ZAC case).

The following 2 snmall patches are simple fixes.

The last 2 pacthes are improvements in libata error handling verbosity.

Damien Le Moal (6):
  scsi: Introduce scsi_zbc_noretry_cmd()
  libata: Use scsi_zbc_noretry_cmd() for ZAC devices
  libata: Fix comment typo in ata_eh_analyze_tf()
  libata: Fix ata_err_string()
  libata: Honor RQF_QUIET flag
  libata: Be quiet when asked to

 drivers/ata/libata-core.c |  7 ++---
 drivers/ata/libata-eh.c   | 38 ---
 drivers/ata/libata-scsi.c |  3 +++
 drivers/scsi/scsi_error.c | 66 +++
 drivers/scsi/sd.c |  2 +-
 drivers/scsi/sd.h |  8 +++---
 drivers/scsi/sd_zbc.c | 47 -
 include/scsi/scsi_eh.h|  1 +
 8 files changed, 119 insertions(+), 53 deletions(-)

-- 
2.14.3



Re: [PATCH] qla2xxx: Fix NULL pointer crash due to active timer for ABTS

2018-03-01 Thread James Bottomley
On Mon, 2018-02-12 at 10:28 -0800, Himanshu Madhani wrote:
[...]
> Cc:  #4.4+

This is the wrong stable tag, which would lead to stable not picking up
the patch automatically.  The correct stable address is

Cc:  #4.4+
                 ^^

Please be more careful in future.

Martin, can you fix this up with a rebase and I'll resync my tree?

Thanks,

James



Re: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue

2018-03-01 Thread Laurence Oberman
On Thu, 2018-03-01 at 16:18 +, Don Brace wrote:
> > -Original Message-
> > From: Ming Lei [mailto:ming@redhat.com]
> > Sent: Tuesday, February 27, 2018 4:08 AM
> > To: Jens Axboe ; linux-bl...@vger.kernel.org;
> > Christoph
> > Hellwig ; Mike Snitzer 
> > Cc: linux-scsi@vger.kernel.org; Hannes Reinecke ;
> > Arun Easi
> > ; Omar Sandoval ; Martin K .
> > Petersen ; James Bottomley
> > ; Christoph Hellwig  > .de>;
> > Don Brace ; Kashyap Desai
> > ; Peter Rivera  > om>;
> > Laurence Oberman ; Ming Lei
> > ; Meelis Roos 
> > Subject: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue
> > 
> > EXTERNAL EMAIL
> > 
> > 
> > From 84676c1f21 (genirq/affinity: assign vectors to all possible
> > CPUs),
> > one msix vector can be created without any online CPU mapped, then
> > one
> > command's completion may not be notified.
> > 
> > This patch setups mapping between cpu and reply queue according to
> > irq
> > affinity info retrived by pci_irq_get_affinity(), and uses this
> > mapping
> > table to choose reply queue for queuing one command.
> > 
> > Then the chosen reply queue has to be active, and fixes IO hang
> > caused
> > by using inactive reply queue which doesn't have any online CPU
> > mapped.
> > 
> > Cc: Hannes Reinecke 
> > Cc: Arun Easi 
> > Cc: "Martin K. Petersen" ,
> > Cc: James Bottomley ,
> > Cc: Christoph Hellwig ,
> > Cc: Don Brace 
> > Cc: Kashyap Desai 
> > Cc: Peter Rivera 
> > Cc: Laurence Oberman 
> > Cc: Meelis Roos 
> > Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all
> > possible CPUs")
> > Signed-off-by: Ming Lei 
> 
> I am getting some issues that need to be tracked down:
> 
> [ 1636.032984] hpsa :87:00.0: Acknowledging event: 0xc032 (HP
> SSD Smart Path configuration change)
> [ 1638.510656] hpsa :87:00.0: scsi 3:0:8:0: updated Direct-
> Access HP   MO0400JDVEU  PHYS DRV SSDSmartPathCap- En-
> Exp=0
> [ 1653.967695] hpsa :87:00.0: Acknowledging event: 0x8020 (HP
> SSD Smart Path configuration change)
> [ 1656.770377] hpsa :87:00.0: scsi 3:0:8:0: updated Direct-
> Access HP   MO0400JDVEU  PHYS DRV SSDSmartPathCap- En-
> Exp=0
> [ 2839.762267] hpsa :87:00.0: Acknowledging event: 0x8020 (HP
> SSD Smart Path configuration change)
> [ 2840.841290] hpsa :87:00.0: scsi 3:0:8:0: updated Direct-
> Access HP   MO0400JDVEU  PHYS DRV SSDSmartPathCap- En-
> Exp=0
> [ 2917.582653] hpsa :87:00.0: Acknowledging event: 0xc020 (HP
> SSD Smart Path configuration change)
> [ 2919.087191] hpsa :87:00.0: scsi 3:1:0:1: updated Direct-
> Access HP   LOGICAL VOLUME   RAID-5 SSDSmartPathCap+ En+
> Exp=1
> [ 2919.142527] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> [3:1:0:2] A phys disk component of LV is missing, turning off
> offload_enabled for LV.
> [ 2919.203915] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> [3:1:0:2] A phys disk component of LV is missing, turning off
> offload_enabled for LV.
> [ 2919.266921] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> [3:1:0:2] A phys disk component of LV is missing, turning off
> offload_enabled for LV.
> [ 2934.999629] hpsa :87:00.0: Acknowledging event: 0x4000 (HP
> SSD Smart Path state change)
> [ 2936.937333] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> [3:1:0:2] A phys disk component of LV is missing, turning off
> offload_enabled for LV.
> [ 2936.998707] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> [3:1:0:2] A phys disk component of LV is missing, turning off
> offload_enabled for LV.
> [ 2937.060101] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs:
> [3:1:0:2] A phys disk component of LV is missing, turning off
> offload_enabled for LV.
> [ 3619.711122] sd 3:1:0:3: [sde] tag#436 FAILED Result:
> hostbyte=DID_OK driverbyte=DRIVER_SENSE
> [ 3619.751150] sd 3:1:0:3: [sde] tag#436 Sense Key : Aborted Command
> [current] 
> [ 3619.784375] sd 3:1:0:3: [sde] tag#436 Add. Sense: Internal target
> failure
> [ 3619.816530] sd 3:1:0:3: [sde] tag#436 CDB: Read(10) 28 00 01 1b ad
> af 00 00 01 00
> [ 3619.852295] print_req_error: I/O error, dev sde, sector 18591151
> [ 3619.880850] sd 3:1:0:3: [sde] tag#461 FAILED Result:
> hostbyte=DID_OK driverbyte=DRIVER_SENSE
> [ 3619.920981] sd 3:1:0:3: [sde] tag#461 Sense Key : Aborted Command
> [current] 
> [ 3619.955081] sd 3:1:0:3: [sde] tag#461 Add. Sense: Internal target
> failure
> [ 3619.987054] sd 3:1:0:3: [sde] 

Re: [PATCH V2] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-03-01 Thread Bart Van Assche
On Thu, 2018-03-01 at 09:57 +0800, jianchao.wang wrote:
> On 03/01/2018 01:52 AM, Bart Van Assche wrote:
> > On Wed, 2018-02-28 at 16:55 +0800, Jianchao Wang wrote:
> > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > > index a86df9c..6fa7b0c 100644
> > > --- a/drivers/scsi/scsi_lib.c
> > > +++ b/drivers/scsi/scsi_lib.c
> > > @@ -191,7 +191,8 @@ static void __scsi_queue_insert(struct scsi_cmnd 
> > > *cmd, int reason, bool unbusy)
> > >*/
> > >   cmd->result = 0;
> > >   if (q->mq_ops) {
> > > - scsi_mq_requeue_cmd(cmd);
> > > + blk_mq_requeue_request(cmd->request, true);
> > > + put_device(>sdev_gendev);
> > >   return;
> > >   }
> > >   spin_lock_irqsave(q->queue_lock, flags);
> > 
> > Anyone who sees the put_device() call that follows the 
> > blk_mq_requeue_request()
> > call will wonder why that call occurs there. So I think we need a comment 
> > above
> > that call that explains where the matching get_device() call is.
> 
> Yes, I will add this.
> 
> > For the legacy code path, there is a get_device() call in scsi_prep_fn() 
> > but no
> > put_device() call in scsi_unprep_fn() - the matching put_device() calls 
> > occur in
> > scsi_end_request() and after blk_requeue_request().
> > 
> > For scsi-mq however there is a get_device() call in scsi_mq_get_budget() 
> > and a
> > put_device() call in scsi_mq_put_budget(). So why do we need the 
> > put_device()
> > calls after blk_mq_requeue_request() and in the mq path for 
> > scsi_end_request()?
> > 
> 
> From the source code, we know the scsi_mq_get_budget will be invoked every 
> time
> when we issue a request. But scsi_mq_put_budget is just in the fail path.
> 
> scsi_queue_rq // if any error
>   -> scsi_mq_put_budget
> 
> blk_mq_dispatch_rq_list // if no driver tags
>   -> blk_mq_put_dispatch_budget
> -> scsi_mq_put_budget
> blk_mq_do_dispatch_sched/blk_mq_do_dispatch_ctx // if no requests
>   -> blk_mq_put_dispatch_budget
> -> scsi_mq_put_budget
> 
> So we have to add put_device after blk_mq_requeue_request() and in
> scsi_end_request() to match the scsi_mq_get_budget.

Hello Jianchao,

Yes, the block layer core guarantees that scsi_mq_get_budget() will be called
before scsi_queue_rq(). I think the full picture is as follows:
* Before scsi_queue_rq() calls .queuecommand(), get_device() is called for the
  SCSI device and the device, target and host busy counters are incremented.
* If the SCSI core decides to requeue a command, scsi_queue_insert() causes
  __scsi_queue_insert() to call scsi_device_unbusy(). That last function
  decreases the device, target and host busy counters but not put_device(sdev).
  Hence the need for a separate put_device() call after requeuing.

It's unfortunate that the SCSI core became so asymmetric. Anyway, since I am
now convinced that this patch is correct, feel free to add:

Reviewed-by: Bart Van Assche 




RE: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue

2018-03-01 Thread Don Brace
> -Original Message-
> From: Ming Lei [mailto:ming@redhat.com]
> Sent: Tuesday, February 27, 2018 4:08 AM
> To: Jens Axboe ; linux-bl...@vger.kernel.org; Christoph
> Hellwig ; Mike Snitzer 
> Cc: linux-scsi@vger.kernel.org; Hannes Reinecke ; Arun Easi
> ; Omar Sandoval ; Martin K .
> Petersen ; James Bottomley
> ; Christoph Hellwig ;
> Don Brace ; Kashyap Desai
> ; Peter Rivera ;
> Laurence Oberman ; Ming Lei
> ; Meelis Roos 
> Subject: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue
> 
> EXTERNAL EMAIL
> 
> 
> From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs),
> one msix vector can be created without any online CPU mapped, then one
> command's completion may not be notified.
> 
> This patch setups mapping between cpu and reply queue according to irq
> affinity info retrived by pci_irq_get_affinity(), and uses this mapping
> table to choose reply queue for queuing one command.
> 
> Then the chosen reply queue has to be active, and fixes IO hang caused
> by using inactive reply queue which doesn't have any online CPU mapped.
> 
> Cc: Hannes Reinecke 
> Cc: Arun Easi 
> Cc: "Martin K. Petersen" ,
> Cc: James Bottomley ,
> Cc: Christoph Hellwig ,
> Cc: Don Brace 
> Cc: Kashyap Desai 
> Cc: Peter Rivera 
> Cc: Laurence Oberman 
> Cc: Meelis Roos 
> Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs")
> Signed-off-by: Ming Lei 

I am getting some issues that need to be tracked down:

[ 1636.032984] hpsa :87:00.0: Acknowledging event: 0xc032 (HP SSD Smart 
Path configuration change)
[ 1638.510656] hpsa :87:00.0: scsi 3:0:8:0: updated Direct-Access HP
   MO0400JDVEU  PHYS DRV SSDSmartPathCap- En- Exp=0
[ 1653.967695] hpsa :87:00.0: Acknowledging event: 0x8020 (HP SSD Smart 
Path configuration change)
[ 1656.770377] hpsa :87:00.0: scsi 3:0:8:0: updated Direct-Access HP
   MO0400JDVEU  PHYS DRV SSDSmartPathCap- En- Exp=0
[ 2839.762267] hpsa :87:00.0: Acknowledging event: 0x8020 (HP SSD Smart 
Path configuration change)
[ 2840.841290] hpsa :87:00.0: scsi 3:0:8:0: updated Direct-Access HP
   MO0400JDVEU  PHYS DRV SSDSmartPathCap- En- Exp=0
[ 2917.582653] hpsa :87:00.0: Acknowledging event: 0xc020 (HP SSD Smart 
Path configuration change)
[ 2919.087191] hpsa :87:00.0: scsi 3:1:0:1: updated Direct-Access HP
   LOGICAL VOLUME   RAID-5 SSDSmartPathCap+ En+ Exp=1
[ 2919.142527] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs: [3:1:0:2] A phys 
disk component of LV is missing, turning off offload_enabled for LV.
[ 2919.203915] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs: [3:1:0:2] A phys 
disk component of LV is missing, turning off offload_enabled for LV.
[ 2919.266921] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs: [3:1:0:2] A phys 
disk component of LV is missing, turning off offload_enabled for LV.
[ 2934.999629] hpsa :87:00.0: Acknowledging event: 0x4000 (HP SSD Smart 
Path state change)
[ 2936.937333] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs: [3:1:0:2] A phys 
disk component of LV is missing, turning off offload_enabled for LV.
[ 2936.998707] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs: [3:1:0:2] A phys 
disk component of LV is missing, turning off offload_enabled for LV.
[ 2937.060101] hpsa :87:00.0: hpsa_figure_phys_disk_ptrs: [3:1:0:2] A phys 
disk component of LV is missing, turning off offload_enabled for LV.
[ 3619.711122] sd 3:1:0:3: [sde] tag#436 FAILED Result: hostbyte=DID_OK 
driverbyte=DRIVER_SENSE
[ 3619.751150] sd 3:1:0:3: [sde] tag#436 Sense Key : Aborted Command [current] 
[ 3619.784375] sd 3:1:0:3: [sde] tag#436 Add. Sense: Internal target failure
[ 3619.816530] sd 3:1:0:3: [sde] tag#436 CDB: Read(10) 28 00 01 1b ad af 00 00 
01 00
[ 3619.852295] print_req_error: I/O error, dev sde, sector 18591151
[ 3619.880850] sd 3:1:0:3: [sde] tag#461 FAILED Result: hostbyte=DID_OK 
driverbyte=DRIVER_SENSE
[ 3619.920981] sd 3:1:0:3: [sde] tag#461 Sense Key : Aborted Command [current] 
[ 3619.955081] sd 3:1:0:3: [sde] tag#461 Add. Sense: Internal target failure
[ 3619.987054] sd 3:1:0:3: [sde] tag#461 CDB: Read(10) 28 00 02 15 31 40 00 00 
01 00
[ 3620.022569] print_req_error: I/O error, dev sde, sector 34943296
[ 3620.050873] sd 3:1:0:3: [sde] tag#157 FAILED Result: hostbyte=DID_OK 
driverbyte=DRIVER_SENSE
[ 3620.091124] sd 3:1:0:3: [sde] tag#157 Sense Key : Aborted 

Re: [PATCH] scsi: sd: Keep disk read-only when re-reading partition

2018-03-01 Thread Andy Shevchenko
On Thu, Mar 1, 2018 at 4:08 PM, Jeremy Cline  wrote:
> If the read-only flag is true on a SCSI disk, re-reading the partition
> table sets the flag back to false.
>
> To observe this bug, you can run:
>
> 1. blockdev --setro /dev/sda
> 2. blockdev --rereadpt /dev/sda
> 3. blockdev --getro /dev/sda
>
> This commit reads the disk's old state and combines it with the device
> disk-reported state rather than unconditionally marking it as RW.

> -   sdkp->write_prot = ((data.device_specific & 0x80) != 0);
> +   sdkp->write_prot = ((data.device_specific & 0x80) != 0) ||
> +   disk_ro;

Perhaps
   sdkp->write_prot = (data.device_specific & 0x80) || disk_ro;

will save a line.

-- 
With Best Regards,
Andy Shevchenko


[PATCH] scsi: sd: Keep disk read-only when re-reading partition

2018-03-01 Thread Jeremy Cline
If the read-only flag is true on a SCSI disk, re-reading the partition
table sets the flag back to false.

To observe this bug, you can run:

1. blockdev --setro /dev/sda
2. blockdev --rereadpt /dev/sda
3. blockdev --getro /dev/sda

This commit reads the disk's old state and combines it with the device
disk-reported state rather than unconditionally marking it as RW.

Reported-by: Li Ning 
Signed-off-by: Jeremy Cline 
---
 drivers/scsi/sd.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index bff21e636ddd..7a3a66a7890f 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2595,6 +2595,7 @@ sd_read_write_protect_flag(struct scsi_disk *sdkp, 
unsigned char *buffer)
int res;
struct scsi_device *sdp = sdkp->device;
struct scsi_mode_data data;
+   int disk_ro = get_disk_ro(sdkp->disk);
int old_wp = sdkp->write_prot;
 
set_disk_ro(sdkp->disk, 0);
@@ -2634,7 +2635,8 @@ sd_read_write_protect_flag(struct scsi_disk *sdkp, 
unsigned char *buffer)
sd_first_printk(KERN_WARNING, sdkp,
  "Test WP failed, assume Write Enabled\n");
} else {
-   sdkp->write_prot = ((data.device_specific & 0x80) != 0);
+   sdkp->write_prot = ((data.device_specific & 0x80) != 0) ||
+   disk_ro;
set_disk_ro(sdkp->disk, sdkp->write_prot);
if (sdkp->first_scan || old_wp != sdkp->write_prot) {
sd_printk(KERN_NOTICE, sdkp, "Write Protect is %s\n",
-- 
2.16.2



[PATCH] scsi: mpt3sas: clarify mmio pointer types

2018-03-01 Thread Arnd Bergmann
The newly added code mixes up phys_addr_t/resource_size_t with dma_addr_t
and void pointers, as seen from these compiler warning:

drivers/scsi/mpt3sas/mpt3sas_base.c: In function '_base_get_chain_phys':
drivers/scsi/mpt3sas/mpt3sas_base.c:235:21: error: cast to pointer from integer 
of different size [-Werror=int-to-pointer-cast]
  base_chain_phys  = (void *)ioc->chip_phys + MPI_FRAME_START_OFFSET +
 ^
drivers/scsi/mpt3sas/mpt3sas_base.c: In function '_clone_sg_entries':
drivers/scsi/mpt3sas/mpt3sas_base.c:427:20: error: cast from pointer to integer 
of different size [-Werror=pointer-to-int-cast]
sgel->Address = (dma_addr_t)dst_addr_phys;
^
drivers/scsi/mpt3sas/mpt3sas_base.c:438:7: error: cast from pointer to integer 
of different size [-Werror=pointer-to-int-cast]
   (dma_addr_t)buff_ptr_phys;
   ^
drivers/scsi/mpt3sas/mpt3sas_base.c:444:10: error: cast from pointer to integer 
of different size [-Werror=pointer-to-int-cast]
  (dma_addr_t)buff_ptr_phys;

Both dma_addr_t and phys_addr_t may be wider than a pointer, so we must
avoid the conversion to pointer types. This also helps readability.

A second problem is treating MMIO addresses from a 'struct resource'
as addresses that can be used for DMA on that device. In almost all
cases, those are the same, but on some of the more obscure architectures,
PCI memory address 0 is mapped into the CPU address space at a nonzero
offset. I don't have a good fix for that, so I'm adding a comment here,
plus a WARN_ON() that triggers whenever the phys_addr_t number is
outside of the low 32-bit address space and causes a straight overflow
when assigned to the 32-bit sgel->Address.

Fixes: 182ac784b41f ("scsi: mpt3sas: Introduce Base function for cloning.")
Signed-off-by: Arnd Bergmann 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c | 42 -
 drivers/scsi/mpt3sas/mpt3sas_base.h |  2 +-
 2 files changed, 24 insertions(+), 20 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 9f2434e59b40..61f93a134956 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -225,14 +225,14 @@ _base_get_chain(struct MPT3SAS_ADAPTER *ioc, u16 smid,
  *
  * @Return - Physical chain address.
  */
-static inline void *
+static inline phys_addr_t
 _base_get_chain_phys(struct MPT3SAS_ADAPTER *ioc, u16 smid,
u8 sge_chain_count)
 {
-   void *base_chain_phys, *chain_phys;
+   phys_addr_t base_chain_phys, chain_phys;
u16 cmd_credit = ioc->facts.RequestCredit + 1;
 
-   base_chain_phys  = (void *)ioc->chip_phys + MPI_FRAME_START_OFFSET +
+   base_chain_phys  = ioc->chip_phys + MPI_FRAME_START_OFFSET +
(cmd_credit * ioc->request_sz) +
REPLY_FREE_POOL_SIZE;
chain_phys = base_chain_phys + (smid * ioc->facts.MaxChainDepth *
@@ -272,11 +272,11 @@ _base_get_buffer_bar0(struct MPT3SAS_ADAPTER *ioc, u16 
smid)
  *
  * @Returns - Pointer to buffer location in BAR0.
  */
-static void *
+static phys_addr_t
 _base_get_buffer_phys_bar0(struct MPT3SAS_ADAPTER *ioc, u16 smid)
 {
u16 cmd_credit = ioc->facts.RequestCredit + 1;
-   void *chain_end_phys = _base_get_chain_phys(ioc,
+   phys_addr_t chain_end_phys = _base_get_chain_phys(ioc,
cmd_credit + 1,
ioc->facts.MaxChainDepth);
return chain_end_phys + (smid * 64 * 1024);
@@ -330,11 +330,12 @@ static void _clone_sg_entries(struct MPT3SAS_ADAPTER *ioc,
bool is_write = 0;
u16 i = 0;
void __iomem *buffer_iomem;
-   void  *buffer_iomem_phys;
+   phys_addr_t buffer_iomem_phys;
void __iomem *buff_ptr;
-   void *buff_ptr_phys;
+   phys_addr_t buff_ptr_phys;
void __iomem *dst_chain_addr[MCPU_MAX_CHAINS_PER_IO];
-   void *src_chain_addr[MCPU_MAX_CHAINS_PER_IO], *dst_addr_phys;
+   void *src_chain_addr[MCPU_MAX_CHAINS_PER_IO];
+   phys_addr_t dst_addr_phys;
MPI2RequestHeader_t *request_hdr;
struct scsi_cmnd *scmd;
struct scatterlist *sg_scmd = NULL;
@@ -391,6 +392,7 @@ static void _clone_sg_entries(struct MPT3SAS_ADAPTER *ioc,
 
buff_ptr = buffer_iomem;
buff_ptr_phys = buffer_iomem_phys;
+   WARN_ON(buff_ptr_phys > U32_MAX);
 
if (sgel->FlagsLength &
(MPI2_SGE_FLAGS_HOST_TO_IOC << MPI2_SGE_FLAGS_SHIFT))
@@ -421,10 +423,10 @@ static void _clone_sg_entries(struct MPT3SAS_ADAPTER *ioc,
smid, sge_chain_count);
src_chain_addr[sge_chain_count] =
(void *) sgel_next;
-   dst_addr_phys =
-   _base_get_chain_phys(ioc,
+   dst_addr_phys = _base_get_chain_phys(ioc,
smid, 

[PATCH] scsi: ufs: sysfs: reworking of the rpm_lvl and spm_lvl entries

2018-03-01 Thread Stanislav Nijnikov
The functions related to the rpm_lvl and spm_lvl sysfs entries were moved to
a separate file as part of UFS device management patchset. During the review
phase I was pointed that these entries implementaion could be improved.
The each file output was split to three parts:
rmp_lvl - rpm_lvl, rpm_target_dev_state, rpm_target_link_state
smp_lvl - spm_lvl, spm_target_dev_state, spm_target_link_state.
Each new entry has a single value output and doesn't reqire additional 
parsing.
The information about new entries and possible power management levels was
added to the corresponding ABI file.
The on-write behaviour of these entries wasn't changed.

Stanislav Nijnikov (1):
  scsi: ufs: sysfs: reworking of the rpm_lvl and spm_lvl entries

 Documentation/ABI/testing/sysfs-driver-ufs | 67 ++
 drivers/scsi/ufs/ufs-sysfs.c   | 92 +++---
 2 files changed, 114 insertions(+), 45 deletions(-)

-- 
2.7.4



[PATCH] scsi: ufs: sysfs: reworking of the rpm_lvl and spm_lvl entries

2018-03-01 Thread Stanislav Nijnikov
Read from these files will return the integed value of the chosen power
management level now. Separate entries were added to show the target
UFS device and UIC link states. The description of the possible power
managements levels was added to the ABI file. The on-write behaviour of
these entries wasn't changed.

Signed-off-by: Stanislav Nijnikov 
---
 Documentation/ABI/testing/sysfs-driver-ufs | 67 ++
 drivers/scsi/ufs/ufs-sysfs.c   | 92 +++---
 2 files changed, 114 insertions(+), 45 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-ufs 
b/Documentation/ABI/testing/sysfs-driver-ufs
index 07f1c2f..83735f7 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -802,3 +802,70 @@ Description:   This file shows the The amount of 
physical memory needed
the particular logical unit. The full information about
the attribute could be found at UFS specifications 2.1.
The file is read only.
+
+
+What:  /sys/bus/platform/drivers/ufshcd/*/rpm_lvl
+Date:  September 2014
+Contact:   Subhash Jadavani 
+Description:   This entry could be used to set or show the UFS device
+   runtime power management level. The current driver
+   implementation supports 6 levels with next target states:
+   0 - an UFS device will stay active, an UIC link will
+   stay active
+   1 - an UFS device will stay active, an UIC link will
+   hibernate
+   2 - an UFS device will moved to sleep, an UIC link will
+   stay active
+   3 - an UFS device will moved to sleep, an UIC link will
+   hibernate
+   4 - an UFS device will be powered off, an UIC link will
+   hibernate
+   5 - an UFS device will be powered off, an UIC link will
+   be powered off
+
+What:  /sys/bus/platform/drivers/ufshcd/*/rpm_target_dev_state
+Date:  February 2018
+Contact:   Subhash Jadavani 
+Description:   This entry shows the target power mode of an UFS device
+   for the chosen runtime power management level.
+   The file is read only.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/rpm_target_link_state
+Date:  February 2018
+Contact:   Subhash Jadavani 
+Description:   This entry shows the target state of an UFS UIC link
+   for the chosen runtime power management level.
+   The file is read only.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/spm_lvl
+Date:  September 2014
+Contact:   Subhash Jadavani 
+Description:   This entry could be used to set or show the UFS device
+   system power management level. The current driver
+   implementation supports 6 levels with next target states:
+   0 - an UFS device will stay active, an UIC link will
+   stay active
+   1 - an UFS device will stay active, an UIC link will
+   hibernate
+   2 - an UFS device will moved to sleep, an UIC link will
+   stay active
+   3 - an UFS device will moved to sleep, an UIC link will
+   hibernate
+   4 - an UFS device will be powered off, an UIC link will
+   hibernate
+   5 - an UFS device will be powered off, an UIC link will
+   be powered off
+
+What:  /sys/bus/platform/drivers/ufshcd/*/spm_target_dev_state
+Date:  February 2018
+Contact:   Subhash Jadavani 
+Description:   This entry shows the target power mode of an UFS device
+   for the chosen system power management level.
+   The file is read only.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/spm_target_link_state
+Date:  February 2018
+Contact:   Subhash Jadavani 
+Description:   This entry shows the target state of an UFS UIC link
+   for the chosen system power management level.
+   The file is read only.
diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c
index cd7174d..4ff9e0b 100644
--- a/drivers/scsi/ufs/ufs-sysfs.c
+++ b/drivers/scsi/ufs/ufs-sysfs.c
@@ -57,29 +57,8 @@ static ssize_t rpm_lvl_show(struct device *dev,
struct device_attribute *attr, char *buf)
 {
struct ufs_hba *hba = dev_get_drvdata(dev);
-   int curr_len;
-   u8 lvl;
-
-   curr_len = snprintf(buf, PAGE_SIZE,
-   "\nCurrent Runtime PM level [%d] => dev_state [%s] 
link_state [%s]\n",
-   hba->rpm_lvl,
-   ufschd_ufs_dev_pwr_mode_to_string(
-