Re: [PATCH v3] m68k/amiga - Amiga Zorro NCR53C9x boards: new zorro_esp.c

2018-03-13 Thread Michael Schmitz
Hi Finn,


Am 12.03.2018 um 22:04 schrieb Finn Thain:
>> +if (addr == esp->command_block_dma)
>> +addr = (u32) esp->command_block;
> 
> Since you've removed the alternative branch and phys_to_virt(), I suggest 
> you do this at function invocation... (see below)

Keeps it together with the phase and address tests, so easier to follow.


>> +static int zorro_esp_probe(struct zorro_dev *z,
>> +   const struct zorro_device_id *ent)
>> +{
>> +struct scsi_host_template *tpnt = _esp_template;
>> +struct Scsi_Host *host;
>> +struct esp *esp;
>> +struct zorro_driver_data *zdd;
>> +struct zorro_esp_priv *zep;
>> +unsigned long board, ioaddr, dmaaddr;
>> +int err;
>> +
>> +board = zorro_resource_start(z);
>> +zdd = (struct zorro_driver_data *)ent->driver_data;
>> +
>> +pr_info("%s found at address 0x%lx.\n", zdd->name, board);
>> +
>> +zep = kmalloc(sizeof(*zep), GFP_KERNEL);
>> +if (!zep) {
>> +pr_err("Can't allocate device private data!\n");
>> +return -ENOMEM;
>> +}
>> +
>> +/* let's figure out whether we have a Zorro II or Zorro III board */
>> +if ((z->rom.er_Type & ERT_TYPEMASK) == ERT_ZORROIII) {
>> +/* note this is a Zorro III board */
>> +if (board > 0xff)
>> +zep->zorro3 = 1;
> 
> The comment here seems to be redundant (?)

Not the only one.

> More importantly, I think you have to initialize zep->zorro3 = 0 in the 
> other branch, or else use kzalloc() instead of kmalloc() above.

Using kzalloc() saves the zep->error=0 initialization later on so it's a
win.

>> +case 
>> ZORRO_PROD_PHASE5_BLIZZARD_1230_II_FASTLANE_Z3_CYBERSCSI_CYBERSTORM060:
>> +if (zep->zorro3) {
>> +/* Fastlane Zorro III board */
>> +/* map address space up to ESP address for DMA */
> 
> Same here?

I've left the comment explaining what the board low address space is
used for.

>> +zep->board_base = ioremap_nocache(board,
>> +FASTLANE_ESP_ADDR-1);
>> +if (!zep->board_base) {
>> +pr_err("Cannot allocate board address space\n");
>> +err = -ENOMEM;
>> +goto fail_free_host;
>> +}
>> +/* initialize DMA control shadow register */
>> +zep->ctrl_data = (FASTLANE_DMA_FCODE |
>> +  FASTLANE_DMA_EDI | FASTLANE_DMA_ESI);
>> +zorro_esp_ops.send_dma_cmd =
>> +zorro_esp_send_fastlane_dma_cmd;
>> +zorro_esp_ops.irq_pending  = fastlane_esp_irq_pending;
>> +zorro_esp_ops.dma_invalidate =
>> +fastlane_esp_dma_invalidate;
>> +} else {
>> +/* Blizzard 1230 II Zorro II board */

and that one - the board might be a Cyberstorm060 which would require
replacement of driver data, and replacement of the ESP register mapping.
Might need to clarify that here again...

>> +zorro_esp_ops.send_dma_cmd =
>> +zorro_esp_send_blz1230II_dma_cmd;

>> +//zorro_set_drvdata(z, host);
>> +
> 
> Can that be deleted?

Leftover from when private data were static, so can go now.


>> +static void zorro_esp_remove(struct zorro_dev *z)
>> +{
>> +/* equivalent to dev_get_drvdata(z->dev) */
> 
> If zorro_get_drvdata(z) needs a comment to explain it then why not just 
> write dev_get_drvdata(z->dev) instead?

dev_get_drvdata(>dev) (struct zorro device incorporates the whole
struct device, not a pointer), but yes.

>> +struct zorro_esp_priv *zep = zorro_get_drvdata(z);
>> +struct esp *esp = zep->esp;
>> +struct Scsi_Host *host = esp->host;
>> +
>> +scsi_esp_unregister(esp);
>> +
>> +/* Disable interrupts. Perhaps use disable_irq instead ... */
>> +
> 
> Can you clarify? free_irq() calls irq_shutdown()...

Which calls disable_irq() eventually if it's the last of shared
interrupts, true. disable_irq() would clearly be wrong here.

> 
> Thanks!
> 

My pleasure. I forgot to add your Reviewed-by tag - will do that for the
next version, OK?

Cheers,

Michael


答复: [PATCH v8 2/5] dt-bindings: scsi: ufs: add document for hisi-ufs

2018-03-13 Thread liwei (CM)
Hi,Arnd

Sorry to bother you again, please take the time to review the patch. Are there 
any other suggestions?
Looking forward to your reply.

Thanks!



-邮件原件-
发件人: liwei (CM) 
发送时间: 2018年2月23日 16:36
收件人: 'Arnd Bergmann'
抄送: Rob Herring; Mark Rutland; xuwei (O); Catalin Marinas; Will Deacon; Vinayak 
Holikatti; James E.J. Bottomley; Martin K. Petersen; Kevin Hilman; Gregory 
CLEMENT; Thomas Petazzoni; Masahiro Yamada; Riku Voipio; Thierry Reding; 
Krzysztof Kozlowski; DTML; Linux Kernel Mailing List; Linux ARM; linux-scsi; 
zangleigang; Gengjianfeng; Guodong Xu; Zhangfei Gao; Fengbaopeng (kevin, Kirin 
Solution Dept)
主题: 答复: [PATCH v8 2/5] dt-bindings: scsi: ufs: add document for hisi-ufs

Hi, Arnd

Sorry late for you.
The following two suggestions we have really discussed
https://lkml.org/lkml/2017/11/30/1077

-邮件原件-
发件人: arndbergm...@gmail.com [mailto:arndbergm...@gmail.com] 代表 Arnd Bergmann
发送时间: 2018年2月19日 17:58
收件人: liwei (CM)
抄送: Rob Herring; Mark Rutland; xuwei (O); Catalin Marinas; Will Deacon; Vinayak 
Holikatti; James E.J. Bottomley; Martin K. Petersen; Kevin Hilman; Gregory 
CLEMENT; Thomas Petazzoni; Masahiro Yamada; Riku Voipio; Thierry Reding; 
Krzysztof Kozlowski; Eric Anholt; DTML; Linux Kernel Mailing List; Linux ARM; 
linux-scsi; zangleigang; Gengjianfeng; Guodong Xu; Zhangfei Gao; Fengbaopeng 
(kevin, Kirin Solution Dept)
主题: Re: [PATCH v8 2/5] dt-bindings: scsi: ufs: add document for hisi-ufs

On Tue, Feb 13, 2018 at 11:14 AM, Li Wei  wrote:
> add ufs node document for Hisilicon.
>
> Signed-off-by: Li Wei 
> ---
>  Documentation/devicetree/bindings/ufs/ufs-hisi.txt | 37 
> ++
>  1 file changed, 37 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/ufs/ufs-hisi.txt


I'm pretty sure we've discussed it before, but can you make this so that the 
generic properties are part of the ufshcd binding, and you refer to it from 
here, only describing in what ways the hisi ufs binding differs from the 
standard?

> diff --git a/Documentation/devicetree/bindings/ufs/ufs-hisi.txt 
> b/Documentation/devicetree/bindings/ufs/ufs-hisi.txt
> new file mode 100644
> index ..0d21b57496cf
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/ufs/ufs-hisi.txt
> @@ -0,0 +1,37 @@
> +* Hisilicon Universal Flash Storage (UFS) Host Controller
> +
> +UFS nodes are defined to describe on-chip UFS hardware macro.
> +Each UFS Host Controller should have its own node.
> +
> +Required properties:
> +- compatible: compatible list, contains one of the following -
> +   "hisilicon,hi3660-ufs", 
> "jedec,ufs-1.1" for hisi ufs
> +   host controller present on Hi36xx 
> chipset.
> +- reg   : should contain UFS register address space & UFS SYS 
> CTRL register address,
> +- interrupt-parent  : interrupt device
> +- interrupts: interrupt number
> +- clocks   : List of phandle and clock specifier pairs
> +- clock-names   : List of clock input name strings sorted in the same
> +   order as the clocks property. 
> +"ref_clk", "phy_clk" is optional

The clock names sound generic enough, should we have both in the generic 
binding?

"ref_clk" is in the ufshcd-pltfrm binding, but "phy_clk" is not; what do you 
mean is that these two don't need to document here?

> +- resets: reset node register, one reset the clk and the other 
> reset the controller
> +- reset-names   : describe reset node register

This looks incomplete. What is the name of the reset line supposed to be?
I'd also suggest you document it in the ufshcd binding instead.

As discussed in https://lkml.org/lkml/2017/11/30/1077;
If document it in the ufshcd binding, I think it needs some codes to parse them 
in ufshcd.c/ufshcd-pltfrm.c, but I think these codes may not be applicable to 
other SOC manufacturers.

  Arnd


Re: mpt3sas: sleeping function called from invalid context

2018-03-13 Thread Bart Van Assche
(+Jaco)

On Tue, 2018-03-13 at 16:18 +0530, Suganath Prabu Subramani wrote:
> We have root caused the issue and it is same as you mentioned.
> "_scsih_get_enclosure_logicalid_chassis_slot()" is called with interrupts
> disabled and this function
> "_scsih_get_enclosure_logicalid_chassis_slot" again calls
> _config_request(), with mutex_lock().
> 
> We have patch ready along with few other change and we ll be posting
> it by tomorrow after covering BST.




Re: [PATCH] scsi: eata: drop VLA in reorder()

2018-03-13 Thread Salvatore Mesoraca
2018-03-13 10:05 GMT+01:00 Christoph Hellwig :
> On Mon, Mar 12, 2018 at 10:35:36PM -0400, Martin K. Petersen wrote:
>> No objections to Salvatore's patch but I have a slight affinity for
>> retiring unused code over patching it. So unless there are objections...
>
> Lets kill it.  And the not DMA capable eata_pio driver with it for
> good riddance.

Good, I'll send a patch to remove eata & friends.
Thank you for your time,

Salvatore


[PATCH] scsi: eata: eata-pio: Deprecate legacy EATA drivers

2018-03-13 Thread Martin K. Petersen
These two drivers do not appear to be in active use. Deprecate them.

Signed-off-by: Martin K. Petersen 
---
 MAINTAINERS |6 -
 drivers/scsi/Kconfig|   62 --
 drivers/scsi/Makefile   |2 -
 drivers/scsi/eata.c | 2571 ---
 drivers/scsi/eata_generic.h |  401 ---
 drivers/scsi/eata_pio.c |  966 
 drivers/scsi/eata_pio.h |   54 -
 7 files changed, 4062 deletions(-)
 delete mode 100644 drivers/scsi/eata.c
 delete mode 100644 drivers/scsi/eata_generic.h
 delete mode 100644 drivers/scsi/eata_pio.c
 delete mode 100644 drivers/scsi/eata_pio.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 3bdc260e36b7..36eb73238cf5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5003,12 +5003,6 @@ T:   git git://linuxtv.org/anttip/media_tree.git
 S: Maintained
 F: drivers/media/tuners/e4000*
 
-EATA ISA/EISA/PCI SCSI DRIVER
-M: Dario Ballabio 
-L: linux-scsi@vger.kernel.org
-S: Maintained
-F: drivers/scsi/eata.c
-
 EC100 MEDIA DRIVER
 M: Antti Palosaari 
 L: linux-me...@vger.kernel.org
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 8647ca9199b3..b18dfa1c14a1 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -640,68 +640,6 @@ config SCSI_DMX3191D
  To compile this driver as a module, choose M here: the
  module will be called dmx3191d.
 
-config SCSI_EATA
-   tristate "EATA ISA/EISA/PCI (DPT and generic EATA/DMA-compliant boards) 
support"
-   depends on (ISA || EISA || PCI) && SCSI && ISA_DMA_API
-   ---help---
- This driver supports all EATA/DMA-compliant SCSI host adapters.  DPT
- ISA and all EISA I/O addresses are probed looking for the "EATA"
- signature. The addresses of all the PCI SCSI controllers reported
-  by the PCI subsystem are probed as well.
-
- You want to read the start of  and the
- SCSI-HOWTO, available from
- .
-
- To compile this driver as a module, choose M here: the
- module will be called eata.
-
-config SCSI_EATA_TAGGED_QUEUE
-   bool "enable tagged command queueing"
-   depends on SCSI_EATA
-   help
- This is a feature of SCSI-2 which improves performance: the host
- adapter can send several SCSI commands to a device's queue even if
- previous commands haven't finished yet.
- This is equivalent to the "eata=tc:y" boot option.
-
-config SCSI_EATA_LINKED_COMMANDS
-   bool "enable elevator sorting"
-   depends on SCSI_EATA
-   help
- This option enables elevator sorting for all probed SCSI disks and
- CD-ROMs. It definitely reduces the average seek distance when doing
- random seeks, but this does not necessarily result in a noticeable
- performance improvement: your mileage may vary...
- This is equivalent to the "eata=lc:y" boot option.
-
-config SCSI_EATA_MAX_TAGS
-   int "maximum number of queued commands"
-   depends on SCSI_EATA
-   default "16"
-   help
- This specifies how many SCSI commands can be maximally queued for
- each probed SCSI device. You should reduce the default value of 16
- only if you have disks with buggy or limited tagged command support.
- Minimum is 2 and maximum is 62. This value is also the window size
- used by the elevator sorting option above. The effective value used
- by the driver for each probed SCSI device is reported at boot time.
- This is equivalent to the "eata=mq:8" boot option.
-
-config SCSI_EATA_PIO
-   tristate "EATA-PIO (old DPT PM2001, PM2012A) support"
-   depends on (ISA || EISA || PCI) && SCSI && BROKEN
-   ---help---
- This driver supports all EATA-PIO protocol compliant SCSI Host
- Adapters like the DPT PM2001 and the PM2012A.  EATA-DMA compliant
- host adapters could also use this driver but are discouraged from
- doing so, since this driver only supports hard disks and lacks
- numerous features.  You might want to have a look at the SCSI-HOWTO,
- available from .
-
- To compile this driver as a module, choose M here: the
- module will be called eata_pio.
-
 config SCSI_FUTURE_DOMAIN
tristate "Future Domain 16xx SCSI/AHA-2920A support"
depends on (ISA || PCI) && SCSI
diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index fcfd28d2884c..46f1c1033f0e 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -93,8 +93,6 @@ obj-$(CONFIG_SCSI_HPSA)   += hpsa.o
 obj-$(CONFIG_SCSI_SMARTPQI)+= smartpqi/
 obj-$(CONFIG_SCSI_SYM53C8XX_2) += sym53c8xx_2/
 obj-$(CONFIG_SCSI_ZALON)   += zalon7xx.o
-obj-$(CONFIG_SCSI_EATA_PIO)+= eata_pio.o
-obj-$(CONFIG_SCSI_EATA)

Re: bsg-lib interface cleanup V2

2018-03-13 Thread Jens Axboe
On 3/13/18 9:28 AM, Christoph Hellwig wrote:
> Hi all,
> 
> this series cleans up various abuses of the bsg interfaces, and then
> splits bsg for SCSI passthrough from bsg for arbitrary transport
> passthrough.  This removes the scsi_request abuse in bsg-lib that is
> very confusing, and also makes sure we can sanity check the requests
> we get.  The current code will happily execute scsi commands against
> bsg-lib queues, and transport pass through against scsi nodes, without
> any indication to the driver that we are doing the wrong thing.

Applied for 4.17, thanks.

-- 
Jens Axboe



RE: [PATCH V5 2/5] scsi: megaraid_sas: fix selection of reply queue

2018-03-13 Thread Kashyap Desai
> -Original Message-
> From: Ming Lei [mailto:ming@redhat.com]
> Sent: Tuesday, March 13, 2018 3:13 PM
> To: James Bottomley; Jens Axboe; Martin K . Petersen
> Cc: Christoph Hellwig; linux-scsi@vger.kernel.org; linux-
> bl...@vger.kernel.org; Meelis Roos; Don Brace; Kashyap Desai; Laurence
> Oberman; Mike Snitzer; Paolo Bonzini; Ming Lei; Hannes Reinecke; James
> Bottomley; Artem Bityutskiy
> Subject: [PATCH V5 2/5] scsi: megaraid_sas: fix selection of reply queue
>
> From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs),
one
> msix vector can be created without any online CPU mapped, then command
> may be queued, and won't be notified after its completion.
>
> This patch setups mapping between cpu and reply queue according to irq
> affinity info retrived by pci_irq_get_affinity(), and uses this info to
choose
> reply queue for queuing one command.
>
> Then the chosen reply queue has to be active, and fixes IO hang caused
by
> using inactive reply queue which doesn't have any online CPU mapped.
>
> Cc: Hannes Reinecke 
> Cc: "Martin K. Petersen" ,
> Cc: James Bottomley ,
> Cc: Christoph Hellwig ,
> Cc: Don Brace 
> Cc: Kashyap Desai 
> Cc: Laurence Oberman 
> Cc: Mike Snitzer 
> Cc: Meelis Roos 
> Cc: Artem Bityutskiy 
> Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible
CPUs")
> Signed-off-by: Ming Lei 


Side note - For a max performance, your below proposed patch/series is
required.  Without below patch, performance is going to be dropped due to
fewer reply queues are getting utilized one this particular patch is
included.
genirq/affinity: irq vector spread  among online CPUs as far as possible

Acked-by: Kashyap Desai 
Tested-by: Kashyap Desai 


Re: [PATCH][next] scsi: lpfc: make several unions static, fix non-ANSI prototype

2018-03-13 Thread James Smart

On 3/13/2018 5:08 AM, Colin King wrote:

From: Colin Ian King 

There are several unions that are local to the source and do not need
to be in global scope, so make them static. Also add in a missing void
parameter to functions lpfc_nvme_cmd_template and
lpfc_nvmet_cmd_template to clean up non-ANSI warning.

Cleans up sparse warnings:
drivers/scsi/lpfc/lpfc_nvme.c:68:19: warning: symbol
'lpfc_iread_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvme.c:69:19: warning: symbol
'lpfc_iwrite_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvme.c:70:19: warning: symbol
'lpfc_icmnd_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvme.c:74:24: warning: non-ANSI function
'lpfc_tsend_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvmet.c:78:19: warning: symbol
'lpfc_treceive_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvmet.c:79:19: warning: symbol
'lpfc_trsp_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvmet.c:83:25: warning: non-ANSI function
declaration of function 'lpfc_nvmet_cmd_template'

Signed-off-by: Colin Ian King 
---
  drivers/scsi/lpfc/lpfc_nvme.c  | 8 
  drivers/scsi/lpfc/lpfc_nvmet.c | 8 
  2 files changed, 8 insertions(+), 8 deletions(-)




Signed-off-by:   James Smart 




[PATCH 2/3] bsg-lib: remove bsg_job.req

2018-03-13 Thread Christoph Hellwig
Users of the bsg-lib interface should only use the bsg_job data structure
and not know about implementation details of it.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Benjamin Block 
Reviewed-by: Hannes Reinecke 
Reviewed-by: Johannes Thumshirn 
---
 block/bsg-lib.c | 14 ++
 include/linux/bsg-lib.h |  1 -
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index fb509779a090..f2c2d54a61b4 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -35,7 +35,7 @@
 static void bsg_teardown_job(struct kref *kref)
 {
struct bsg_job *job = container_of(kref, struct bsg_job, kref);
-   struct request *rq = job->req;
+   struct request *rq = blk_mq_rq_from_pdu(job);
 
put_device(job->dev);   /* release reference for the request */
 
@@ -68,19 +68,18 @@ EXPORT_SYMBOL_GPL(bsg_job_get);
 void bsg_job_done(struct bsg_job *job, int result,
  unsigned int reply_payload_rcv_len)
 {
-   struct request *req = job->req;
+   struct request *req = blk_mq_rq_from_pdu(job);
struct request *rsp = req->next_rq;
-   struct scsi_request *rq = scsi_req(req);
int err;
 
-   err = scsi_req(job->req)->result = result;
+   err = job->sreq.result = result;
if (err < 0)
/* we're only returning the result field in the reply */
-   rq->sense_len = sizeof(u32);
+   job->sreq.sense_len = sizeof(u32);
else
-   rq->sense_len = job->reply_len;
+   job->sreq.sense_len = job->reply_len;
/* we assume all request payload was transferred, residual == 0 */
-   rq->resid_len = 0;
+   job->sreq.resid_len = 0;
 
if (rsp) {
WARN_ON(reply_payload_rcv_len > scsi_req(rsp)->resid_len);
@@ -232,7 +231,6 @@ static void bsg_initialize_rq(struct request *req)
sreq->sense = sense;
sreq->sense_len = SCSI_SENSE_BUFFERSIZE;
 
-   job->req = req;
job->reply = sense;
job->reply_len = sreq->sense_len;
job->dd_data = job + 1;
diff --git a/include/linux/bsg-lib.h b/include/linux/bsg-lib.h
index 402223c95ce1..08762d297cbd 100644
--- a/include/linux/bsg-lib.h
+++ b/include/linux/bsg-lib.h
@@ -40,7 +40,6 @@ struct bsg_buffer {
 struct bsg_job {
struct scsi_request sreq;
struct device *dev;
-   struct request *req;
 
struct kref kref;
 
-- 
2.14.2



[PATCH 3/3] bsg: split handling of SCSI CDBs vs transport requeues

2018-03-13 Thread Christoph Hellwig
The current BSG design tries to shoe-horn the transport-specific
passthrough commands into the overall framework for SCSI passthrough
requests.  This has a couple problems:

 - each passthrough queue has to set the QUEUE_FLAG_SCSI_PASSTHROUGH flag
   despite not dealing with SCSI commands at all.  Because of that these
   queues could also incorrectly accept SCSI commands from in-kernel
   users or through the legacy SCSI_IOCTL_SEND_COMMAND ioctl.
 - the real SCSI bsg queues also incorrectly accept bsg requests of the
   BSG_SUB_PROTOCOL_SCSI_TRANSPORT type
 - the bsg transport code is almost unredable because it tries to reuse
   different SCSI concepts for its own purpose.

This patch instead adds a new bsg_ops structure to handle the two cases
differently, and thus solves all of the above problems.  Another side
effect is that the bsg-lib queues also don't need to embedd a
struct scsi_request anymore.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
Reviewed-by: Johannes Thumshirn 
---
 block/bsg-lib.c   | 158 +++
 block/bsg.c   | 262 +-
 drivers/scsi/scsi_lib.c   |   4 +-
 drivers/scsi/scsi_sysfs.c |   3 +-
 drivers/scsi/scsi_transport_sas.c |   1 -
 include/linux/bsg-lib.h   |   4 +-
 include/linux/bsg.h   |  35 +++--
 7 files changed, 250 insertions(+), 217 deletions(-)

diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index f2c2d54a61b4..fc2e5ff2c4b9 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -27,6 +27,94 @@
 #include 
 #include 
 #include 
+#include 
+
+#define uptr64(val) ((void __user *)(uintptr_t)(val))
+
+static int bsg_transport_check_proto(struct sg_io_v4 *hdr)
+{
+   if (hdr->protocol != BSG_PROTOCOL_SCSI  ||
+   hdr->subprotocol != BSG_SUB_PROTOCOL_SCSI_TRANSPORT)
+   return -EINVAL;
+   if (!capable(CAP_SYS_RAWIO))
+   return -EPERM;
+   return 0;
+}
+
+static int bsg_transport_fill_hdr(struct request *rq, struct sg_io_v4 *hdr,
+   fmode_t mode)
+{
+   struct bsg_job *job = blk_mq_rq_to_pdu(rq);
+
+   job->request_len = hdr->request_len;
+   job->request = memdup_user(uptr64(hdr->request), hdr->request_len);
+   if (IS_ERR(job->request))
+   return PTR_ERR(job->request);
+   return 0;
+}
+
+static int bsg_transport_complete_rq(struct request *rq, struct sg_io_v4 *hdr)
+{
+   struct bsg_job *job = blk_mq_rq_to_pdu(rq);
+   int ret = 0;
+
+   /*
+* The assignments below don't make much sense, but are kept for
+* bug by bug backwards compatibility:
+*/
+   hdr->device_status = job->result & 0xff;
+   hdr->transport_status = host_byte(job->result);
+   hdr->driver_status = driver_byte(job->result);
+   hdr->info = 0;
+   if (hdr->device_status || hdr->transport_status || hdr->driver_status)
+   hdr->info |= SG_INFO_CHECK;
+   hdr->response_len = 0;
+
+   if (job->result < 0) {
+   /* we're only returning the result field in the reply */
+   job->reply_len = sizeof(u32);
+   ret = job->result;
+   }
+
+   if (job->reply_len && hdr->response) {
+   int len = min(hdr->max_response_len, job->reply_len);
+
+   if (copy_to_user(uptr64(hdr->response), job->reply, len))
+   ret = -EFAULT;
+   else
+   hdr->response_len = len;
+   }
+
+   /* we assume all request payload was transferred, residual == 0 */
+   hdr->dout_resid = 0;
+
+   if (rq->next_rq) {
+   unsigned int rsp_len = job->reply_payload.payload_len;
+
+   if (WARN_ON(job->reply_payload_rcv_len > rsp_len))
+   hdr->din_resid = 0;
+   else
+   hdr->din_resid = rsp_len - job->reply_payload_rcv_len;
+   } else {
+   hdr->din_resid = 0;
+   }
+
+   return ret;
+}
+
+static void bsg_transport_free_rq(struct request *rq)
+{
+   struct bsg_job *job = blk_mq_rq_to_pdu(rq);
+
+   kfree(job->request);
+}
+
+static const struct bsg_ops bsg_transport_ops = {
+   .check_proto= bsg_transport_check_proto,
+   .fill_hdr   = bsg_transport_fill_hdr,
+   .complete_rq= bsg_transport_complete_rq,
+   .free_rq= bsg_transport_free_rq,
+};
 
 /**
  * bsg_teardown_job - routine to teardown a bsg job
@@ -68,27 +156,9 @@ EXPORT_SYMBOL_GPL(bsg_job_get);
 void bsg_job_done(struct bsg_job *job, int result,
  unsigned int reply_payload_rcv_len)
 {
-   struct request *req = blk_mq_rq_from_pdu(job);
-   struct request *rsp = req->next_rq;
-   int err;
-
-   err = job->sreq.result = result;
-   if (err < 0)
-   /* we're only returning the result field in the 

[PATCH 1/3] bsg-lib: introduce a timeout field in struct bsg_job

2018-03-13 Thread Christoph Hellwig
The zfcp driver wants to know the timeout for a bsg job, so add a field
to struct bsg_job for it in preparation of not exposing the request
to the bsg-lib users.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Benjamin Block 
Reviewed-by: Hannes Reinecke 
Reviewed-by: Johannes Thumshirn 
---
 block/bsg-lib.c | 1 +
 drivers/s390/scsi/zfcp_fc.c | 4 ++--
 include/linux/bsg-lib.h | 2 ++
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index b4fe1a48f111..fb509779a090 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -132,6 +132,7 @@ static int bsg_prepare_job(struct device *dev, struct 
request *req)
struct bsg_job *job = blk_mq_rq_to_pdu(req);
int ret;
 
+   job->timeout = req->timeout;
job->request = rq->cmd;
job->request_len = rq->cmd_len;
 
diff --git a/drivers/s390/scsi/zfcp_fc.c b/drivers/s390/scsi/zfcp_fc.c
index ca218c82321f..6162cf57a20a 100644
--- a/drivers/s390/scsi/zfcp_fc.c
+++ b/drivers/s390/scsi/zfcp_fc.c
@@ -961,7 +961,7 @@ static int zfcp_fc_exec_els_job(struct bsg_job *job,
d_id = ntoh24(bsg_request->rqst_data.h_els.port_id);
 
els->handler = zfcp_fc_ct_els_job_handler;
-   return zfcp_fsf_send_els(adapter, d_id, els, job->req->timeout / HZ);
+   return zfcp_fsf_send_els(adapter, d_id, els, job->timeout / HZ);
 }
 
 static int zfcp_fc_exec_ct_job(struct bsg_job *job,
@@ -980,7 +980,7 @@ static int zfcp_fc_exec_ct_job(struct bsg_job *job,
return ret;
 
ct->handler = zfcp_fc_ct_job_handler;
-   ret = zfcp_fsf_send_ct(wka_port, ct, NULL, job->req->timeout / HZ);
+   ret = zfcp_fsf_send_ct(wka_port, ct, NULL, job->timeout / HZ);
if (ret)
zfcp_fc_wka_port_put(wka_port);
 
diff --git a/include/linux/bsg-lib.h b/include/linux/bsg-lib.h
index b1be0233ce35..402223c95ce1 100644
--- a/include/linux/bsg-lib.h
+++ b/include/linux/bsg-lib.h
@@ -44,6 +44,8 @@ struct bsg_job {
 
struct kref kref;
 
+   unsigned int timeout;
+
/* Transport/driver specific request/reply structs */
void *request;
void *reply;
-- 
2.14.2



bsg-lib interface cleanup V2

2018-03-13 Thread Christoph Hellwig
Hi all,

this series cleans up various abuses of the bsg interfaces, and then
splits bsg for SCSI passthrough from bsg for arbitrary transport
passthrough.  This removes the scsi_request abuse in bsg-lib that is
very confusing, and also makes sure we can sanity check the requests
we get.  The current code will happily execute scsi commands against
bsg-lib queues, and transport pass through against scsi nodes, without
any indication to the driver that we are doing the wrong thing.

Changes since V1:
 - dropped various patches merged already
 - use job to get the payload length in bsg_transport_complete_rq
 - renamed the ptr64 macro to uptr64
 - make bsg_scsi_complete_rq and bsg_transport_complete_rq look more
   similar
 - trivial cleanup in bsg_map_hdr


Re: [Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and could not connect to lvmetad at some boot attempts

2018-03-13 Thread Bart Van Assche
On Tue, 2018-03-13 at 22:32 +0800, Ming Lei wrote:
> On Tue, Mar 13, 2018 at 02:08:23PM +0100, Martin Steigerwald wrote:
> > Ming and Bart, I added you to cc, cause I had to do with you about another 
> > blk-mq report, please feel free to adapt.
> 
> Looks RIP points to scsi_times_out+0x17/0x1d0, maybe a SCSI regression?

I think that it's much more likely that this is a block layer regression. See
e.g. "[PATCH v2] blk-mq: Fix race between resetting the timer and completion
handling" 
(https://www.mail-archive.com/linux-block@vger.kernel.org/msg18338.html).

Bart.

Re: [Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and could not connect to lvmetad at some boot attempts

2018-03-13 Thread Ming Lei
On Tue, Mar 13, 2018 at 02:08:23PM +0100, Martin Steigerwald wrote:
> Hans de Goede - 11.03.18, 15:37:
> > Hi Martin,
> > 
> > On 11-03-18 09:20, Martin Steigerwald wrote:
> > > Hello.
> > > 
> > > Since 4.16-rc4 (upgraded from 4.15.2 which worked) I have an issue
> > > with SMART checks occassionally failing like this:
> > > 
> > > smartd[28017]: Device: /dev/sdb [SAT], is in SLEEP mode, suspending checks
> > > udisksd[24408]: Error performing housekeeping for drive
> > > /org/freedesktop/UDisks2/drives/INTEL_SSDSA2CW300G3_[…]: Error updating
> > > SMART data: Error sending ATA command CHECK POWER MODE: Unexpected sense
> > > data returned:#012: 0e 09 0c 00  00 00 ff 00  00 00 00 00  00 00 50
> > > 00..P.#0120010: 00 00 00 00  00 00 00 00  00 00 00 00  00
> > > 00 00 00#012 (g-io-error-quark, 0) merkaba
> > > udisksd[24408]: Error performing housekeeping for drive
> > > /org/freedesktop/UDisks2/drives/Crucial_CT480M500SSD3_[…]: Error updating
> > > SMART dat a: Error sending ATA command CHECK POWER MODE: Unexpected sense
> > > data returned:#012: 01 00 1d 00  00 00 0e 09  0c 00 00 00  ff 00 00
> > > 00#0120010: 00 0 0 00 00  50 00 00 00  00 00 00 00 
> > > 00 00 00 00P...#012 (g-io-error-quark, 0)
> > > 
> > > (Intel SSD is connected via SATA, Crucial via mSATA in a ThinkPad T520)
> > > 
> > > However when I then check manually with smartctl -a | -x | -H the device
> > > reports SMART data just fine.
> > > 
> > > As smartd correctly detects that device is in sleep mode, this may be an
> > > userspace issue in udisksd.
> > > 
> > > Also at some boot attempts the boot hangs with a message like "could not
> > > connect to lvmetad, scanning manually for devices". I use BTRFS RAID 1
> > > on to LVs (each on one of the SSDs). A configuration that requires a
> > > manual
> > > adaption to InitRAMFS in order to boot (basically vgchange -ay before
> > > btrfs device scan).
> > > 
> > > I wonder whether that has to do with the new SATA LPM policy stuff, but as
> > > I had issues with
> > > 
> > >   3 => Medium power with Device Initiated PM enabled
> > > 
> > > (machine did not boot, which could also have been caused by me
> > > accidentally
> > > removing all TCP/IP network support in the kernel with that setting)
> > > 
> > > I set it back to
> > > 
> > > CONFIG_SATA_MOBILE_LPM_POLICY=0
> > > 
> > > (firmware settings)
> > 
> > Right, so at that settings the LPM policy changes are effectively
> > disabled and cannot explain your SMART issues.
> 
> Yes, I now good a photo of one of those boot failures I mentioned, at it 
> seems 
> to be related to blk-mq, as the backtrace contains "blk_mq_terminate_expired".
> 
> I add the screenshot to my bug report.
> 
> [Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and 
> boot failures with blk_mq_terminate_expired in backtrace
> https://bugzilla.kernel.org/show_bug.cgi?id=199077
> 
> Hans, I will test your LPM policy horkage for Crucial m500 patch at a later 
> time. I first wanted to add the photo of the boot failure to the bug report.
> 
> Ming and Bart, I added you to cc, cause I had to do with you about another 
> blk-mq report, please feel free to adapt.

Looks RIP points to scsi_times_out+0x17/0x1d0, maybe a SCSI regression?

Thanks,
Ming


Immediate Reply.

2018-03-13 Thread Isa Zongo
Dear  Friend,

   I know that, this letter will come to you as surprise, I got your contact 
address while I was searching for foreign partner to champion this golden 
appoint unity that is present in our favor, My name is Mr. Isa Zongo, I am the 
Bill and Exchange (assistant) Manager CORIS BANK INTERNATIONAL Ouagadougou 
Burkina Faso. I'm proposing to lift in your name (US$10.5 Million Dollars) that 
belong to our later customer, Mr Kurt Kuhle from Alexandra Egypt who  died 
along with his family in Siber airline that crashed into sea  at Isreal on 4th 
October 2001.

I want to present you to my bank here as the beneficiary to the fund, Waiting 
for your response for more details, As you are willing to execute this business 
appoint unity with me.

Thanks,
Yours Sincerely,
Mr. Isa Zongo.


[PATCH][next] scsi: lpfc: make several unions static, fix non-ANSI prototype

2018-03-13 Thread Colin King
From: Colin Ian King 

There are several unions that are local to the source and do not need
to be in global scope, so make them static. Also add in a missing void
parameter to functions lpfc_nvme_cmd_template and
lpfc_nvmet_cmd_template to clean up non-ANSI warning.

Cleans up sparse warnings:
drivers/scsi/lpfc/lpfc_nvme.c:68:19: warning: symbol
'lpfc_iread_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvme.c:69:19: warning: symbol
'lpfc_iwrite_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvme.c:70:19: warning: symbol
'lpfc_icmnd_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvme.c:74:24: warning: non-ANSI function
'lpfc_tsend_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvmet.c:78:19: warning: symbol
'lpfc_treceive_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvmet.c:79:19: warning: symbol
'lpfc_trsp_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvmet.c:83:25: warning: non-ANSI function
declaration of function 'lpfc_nvmet_cmd_template'

Signed-off-by: Colin Ian King 
---
 drivers/scsi/lpfc/lpfc_nvme.c  | 8 
 drivers/scsi/lpfc/lpfc_nvmet.c | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index 52dd9479b538..378dca40ca20 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -65,13 +65,13 @@ lpfc_release_nvme_buf(struct lpfc_hba *, struct 
lpfc_nvme_buf *);
 
 static struct nvme_fc_port_template lpfc_nvme_template;
 
-union lpfc_wqe128 lpfc_iread_cmd_template;
-union lpfc_wqe128 lpfc_iwrite_cmd_template;
-union lpfc_wqe128 lpfc_icmnd_cmd_template;
+static union lpfc_wqe128 lpfc_iread_cmd_template;
+static union lpfc_wqe128 lpfc_iwrite_cmd_template;
+static union lpfc_wqe128 lpfc_icmnd_cmd_template;
 
 /* Setup WQE templates for NVME IOs */
 void
-lpfc_nvme_cmd_template()
+lpfc_nvme_cmd_template(void)
 {
union lpfc_wqe128 *wqe;
 
diff --git a/drivers/scsi/lpfc/lpfc_nvmet.c b/drivers/scsi/lpfc/lpfc_nvmet.c
index 07f89524c320..7271c9d885dd 100644
--- a/drivers/scsi/lpfc/lpfc_nvmet.c
+++ b/drivers/scsi/lpfc/lpfc_nvmet.c
@@ -74,13 +74,13 @@ static int lpfc_nvmet_unsol_ls_issue_abort(struct lpfc_hba 
*,
 static void lpfc_nvmet_wqfull_flush(struct lpfc_hba *, struct lpfc_queue *,
struct lpfc_nvmet_rcv_ctx *);
 
-union lpfc_wqe128 lpfc_tsend_cmd_template;
-union lpfc_wqe128 lpfc_treceive_cmd_template;
-union lpfc_wqe128 lpfc_trsp_cmd_template;
+static union lpfc_wqe128 lpfc_tsend_cmd_template;
+static union lpfc_wqe128 lpfc_treceive_cmd_template;
+static union lpfc_wqe128 lpfc_trsp_cmd_template;
 
 /* Setup WQE templates for NVME IOs */
 void
-lpfc_nvmet_cmd_template()
+lpfc_nvmet_cmd_template(void)
 {
union lpfc_wqe128 *wqe;
 
-- 
2.15.1



Re: Debugging SCSI 'UNMAP' ("Logical Block Provisioning") failure on an SSD

2018-03-13 Thread Kashyap Chamarthy
On Tue, Mar 13, 2018 at 12:42:47PM +0100, Kashyap Chamarthy wrote:
> [Cross-posted to 'linux-scsi' and 'linux-usb' lists; please keep me
> Cced, as I'm not subscribed to either of them.)

[...]

> I was suggested to return this SSD.  But the Interweb[%] claims Samsung
> T5 *does* have 'TRIM' (i.e. SCSI 'UNMAP') support via UASP.

Earlier, I didn't know which list to email, so I wrote to Martin K.
Petersen (who pointed me to the lists when I asked where I can post
publicly), and he made this observation on the above quoted text:

"Linux can run USB block devices in either legacy or UAS mode.
Chances are this device is being driven in usb-storage mode. We
usually have quirks based on the model strings that decide whether
to use one or the other."


[...]

-- 
/kashyap


Debugging SCSI 'UNMAP' ("Logical Block Provisioning") failure on an SSD

2018-03-13 Thread Kashyap Chamarthy
[Cross-posted to 'linux-scsi' and 'linux-usb' lists; please keep me
Cced, as I'm not subscribed to either of them.)

I unboxed a brand new Samsung T5 SSD[*].  Then formatted it with EXT4
filesystem.  I wanted to clean the disk of any existing software, so I
was reminded by a colleague to use `blkdiscard`.  Giving it a try ... it
fails right off the bat:

$> sudo blkdiscard /dev/sdc
blkdiscard: /dev/sdc1: BLKDISCARD ioctl failed: Operation not supported

Hmm.  Let's run `lsblk -D` to see if it shows discarding support (SCSI
'UNMAP' or 'TRIM' in ATA parlance) for the SSD ('sdc').  It doesn't
(from what I learn, the value of "DISC-GRAN" should be non-zero):

$> lsblk -D
NAME  DISC-ALN DISC-GRAN DISC-MAX 
DISC-ZERO
sdc  00B   0B   
  0
[...]

Next, Paolo Bonzini (KVM / QEMU maintainer) told me on IRC to run a
couple of SCSI commands[+].  The key one being the `sg_readcap`, and
here's where I learnt the bad news that the SCSI "Logical block
provisioning" ('lbpme') bit seemed to be turned off at hardware level:

$> sudo sg_readcap --16 /dev/sdc
Read Capacity results:
   Protection: prot_en=0, p_type=0, p_i_exponent=0
   Logical block provisioning: lbpme=0, lbprz=0
   Last logical block address=976773167 (0x3a38602f), Number of logical 
blocks=976773168
   Logical block length=512 bytes
   Logical blocks per physical block exponent=0
   Lowest aligned logical block address=0
Hence:
   Device size: 500107862016 bytes, 476940.0 MiB, 500.11 GB

As another attempt, Paolo suggested to try UNMAP the following way.  But
it fails with not much actionable error:

$> num=$(sg_readcap -b /dev/sdc | awk '{print $1}')
$> sg_unmap --lba=0 --num=$num /dev/sdc -v
unmap cdb: 42 00 00 00 00 00 00 00 18 00
unmap: transport: Host_status=0x01 [DID_NO_CONNECT]
Driver_status=0x00 [DRIVER_OK]

UNMAP failed (use '-v' to get more information)

* * *

Apparently there are two potential causes:

(1) Seems 'lbpme' is diasbled at the hardware-level (lbpme=0); or
(2) The firmware itself is buggy

I was suggested to return this SSD.  But the Interweb[%] claims Samsung
T5 *does* have 'TRIM' (i.e. SCSI 'UNMAP') support via UASP.

My main purpose of this external hard disk is to store live VM disk
iamges.  At some future point I may want to test "real" discard.  I
learn, since the drive doesn't support it, the host kernel's drivers
won't be running the full SCSI discard code path.  So the said SSD won't
be suitable for that.

Wonder if you have any suggestions / remarks before I return this disk?
(Assuming there's no way to enable UNMAP for this disk, and the only
recourse is to return it.)

Thanks!


[+] The outputs of `sg_vpd` & `sg_sat_identify` commands here:

https://kashyapc.fedorapeople.org/temp/sg_vpd.txt
https://kashyapc.fedorapeople.org/temp/sg_sat_identify.txt

[*] http://www.samsung.com/semiconductor/minisite/ssd/product/portable/t5/

[%] http://www.tomshardware.com/reviews/samsung-t5-portable-ssd,5163-3.html


-- 
/kashyap


RE: [PATCH] scsi: resolve COMMAND_SIZE at compile time

2018-03-13 Thread David Laight
From: Stephen Kitt
> Sent: 09 March 2018 22:34
> 
> COMMAND_SIZE currently uses an array of values in block/scsi_ioctl.c.
> A number of device_handler functions use this to initialise arrays,
> and this is flagged by -Wvla.
> 
> This patch replaces COMMAND_SIZE with a variant using a formula which
> can be resolved at compile time in cases where the opcode is fixed,
> resolving the array size and avoiding the VLA. The downside is that
> the code is less maintainable and that some call sites end up having
> more complex generated code.
> 
> Since scsi_command_size_tbl is dropped, we can remove the dependency
> on BLK_SCSI_REQUEST from drivers/target/Kconfig.
> 
> This was prompted by https://lkml.org/lkml/2018/3/7/621
> 
> Signed-off-by: Stephen Kitt 
> ---
>  block/scsi_ioctl.c |  8 
>  drivers/target/Kconfig |  1 -
>  include/scsi/scsi_common.h | 13 +++--
>  3 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
> index 60b471f8621b..b9e176e537be 100644
> --- a/block/scsi_ioctl.c
> +++ b/block/scsi_ioctl.c
> @@ -41,14 +41,6 @@ struct blk_cmd_filter {
> 
>  static struct blk_cmd_filter blk_default_cmd_filter;
> 
> -/* Command group 3 is reserved and should never be used.  */
> -const unsigned char scsi_command_size_tbl[8] =
> -{
> - 6, 10, 10, 12,
> - 16, 12, 10, 10
> -};
> -EXPORT_SYMBOL(scsi_command_size_tbl);
> -
>  #include 
> 
>  static int sg_get_version(int __user *p)
> diff --git a/drivers/target/Kconfig b/drivers/target/Kconfig
> index 4c44d7bed01a..b5f05b60cf06 100644
> --- a/drivers/target/Kconfig
> +++ b/drivers/target/Kconfig
> @@ -4,7 +4,6 @@ menuconfig TARGET_CORE
>   depends on SCSI && BLOCK
>   select CONFIGFS_FS
>   select CRC_T10DIF
> - select BLK_SCSI_REQUEST # only for scsi_command_size_tbl..
>   select SGL_ALLOC
>   default n
>   help
> diff --git a/include/scsi/scsi_common.h b/include/scsi/scsi_common.h
> index 731ac09ed231..48d950666376 100644
> --- a/include/scsi/scsi_common.h
> +++ b/include/scsi/scsi_common.h
> @@ -15,8 +15,17 @@ scsi_varlen_cdb_length(const void *hdr)
>   return ((struct scsi_varlen_cdb_hdr *)hdr)->additional_cdb_length + 8;
>  }
> 
> -extern const unsigned char scsi_command_size_tbl[8];
> -#define COMMAND_SIZE(opcode) scsi_command_size_tbl[((opcode) >> 5) & 7]
> +/*
> + * SCSI command sizes are as follows, in bytes, for fixed size commands, per
> + * group: 6, 10, 10, 12, 16, 12, 10, 10. The top three bits of an opcode
> + * determine its group.
> + * The size table is encoded into a 32-bit value by subtracting each value
> + * from 16, resulting in a value of 1715488362
> + * (6 << 28 + 6 << 24 + 4 << 20 + 0 << 16 + 4 << 12 + 6 << 8 + 6 << 4 + 10).
> + * Command group 3 is reserved and should never be used.
> + */
> +#define COMMAND_SIZE(opcode) \
> + (16 - (15 & (1715488362 >> (4 * (((opcode) >> 5) & 7)

Why not:
(6 + (15 & (0x446a6440u 

Specifying the constant in hex makes it more obvious.
It is a shame about the 16, but an offset is easier than the negate.

David



Re: mpt3sas: sleeping function called from invalid context

2018-03-13 Thread Suganath Prabu Subramani
Hi Bart,

We have root caused the issue and it is same as you mentioned.
"_scsih_get_enclosure_logicalid_chassis_slot()" is called with interrupts
disabled and this function
"_scsih_get_enclosure_logicalid_chassis_slot" again calls
_config_request(), with mutex_lock().

We have patch ready along with few other change and we ll be posting
it by tomorrow after covering BST.

Thanks,
Suganath Prabu S

On Mon, Mar 12, 2018 at 11:53 PM, Bart Van Assche
 wrote:
> Hello,
>
> For the first I/O request after boot that is sent to a disk attached to an
> mpt3sas adapter I see the below complaint appearing in the kernel log. This
> occurs at least with kernels v4.16-rc4 and v4.16-rc5.
>
> What I see in the mpt3sas source code is that
> _scsih_get_enclosure_logicalid_chassis_slot() is called with interrupts
> disabled and also that a function called by that function, namely
> _config_request(), calls mutex_lock().
>
> Can someone who is more familiar than I with the mpt3sas adapter have a look
> at this and propose a fix?
>
> Thanks,
>
> Bart.
>
> BUG: sleeping function called from invalid context at 
> kernel/locking/mutex.c:747
> in_atomic(): 1, irqs_disabled(): 1, pid: 2389, name: kworker/u64:1
> INFO: lockdep is turned off.
> irq event stamp: 278
> hardirqs last  enabled at (277): [<32c577ec>] 
> _raw_spin_unlock_irq+0x24/0x50
> hardirqs last disabled at (278): [<6082e2fa>] __schedule+0x120/0x1010
> softirqs last  enabled at (0): [<8c2eb285>] 
> copy_process.part.45+0x930/0x3470
> softirqs last disabled at (0): [<  (null)>]   (null)
> Preemption disabled at:
> [<>]   (null)
> CPU: 3 PID: 2389 Comm: kworker/u64:1 Tainted: GW
> 4.16.0-rc5-dbg+ #1
> Workqueue: poll_mpt3sas0_statu _base_fault_reset_work [mpt3sas]
> Call Trace:
> dump_stack+0x67/0x90
> ___might_sleep+0x1da/0x2c0
> __mutex_lock+0xb9/0xbb0
> _config_request.constprop.5+0xa3/0xe70 [mpt3sas]
> mpt3sas_config_get_enclosure_pg0+0xb3/0x110 [mpt3sas]
> _scsih_get_enclosure_logicalid_chassis_slot+0xf8/0x160 [mpt3sas]
> mpt3sas_scsih_reset_handler+0x3f6/0xb30 [mpt3sas]
> mpt3sas_base_hard_reset_handler+0x49a/0x7c0 [mpt3sas]
> _base_fault_reset_work+0x1bb/0x260 [mpt3sas]
> process_one_work+0x441/0xa50
> worker_thread+0x76/0x6c0
> kthread+0x1b2/0x1d0
> ret_from_fork+0x24/0x30
>


[PATCH V5 5/5] scsi: virtio_scsi: unify scsi_host_template

2018-03-13 Thread Ming Lei
Now we switch to use_blk_mq always, and both single queue and multi
queue cases can be handled in one .queuecommand callback, not necessary
to use two scsi_host_template.

Suggested-by: Christoph Hellwig ,
Cc: Omar Sandoval ,
Cc: "Martin K. Petersen" ,
Cc: James Bottomley ,
Cc: Christoph Hellwig ,
Cc: Paolo Bonzini 
Cc: Mike Snitzer 
Cc: Laurence Oberman 
Cc: Hannes Reinecke 
Signed-off-by: Ming Lei 
---
 drivers/scsi/virtio_scsi.c | 74 ++
 1 file changed, 15 insertions(+), 59 deletions(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 54e3a0f6844c..45d04631888a 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -522,11 +522,20 @@ static void virtio_scsi_init_hdr_pi(struct virtio_device 
*vdev,
 }
 #endif
 
-static int virtscsi_queuecommand(struct virtio_scsi *vscsi,
-struct virtio_scsi_vq *req_vq,
+static struct virtio_scsi_vq *virtscsi_pick_vq_mq(struct virtio_scsi *vscsi,
+ struct scsi_cmnd *sc)
+{
+   u32 tag = blk_mq_unique_tag(sc->request);
+   u16 hwq = blk_mq_unique_tag_to_hwq(tag);
+
+   return >req_vqs[hwq];
+}
+
+static int virtscsi_queuecommand(struct Scsi_Host *shost,
 struct scsi_cmnd *sc)
 {
-   struct Scsi_Host *shost = virtio_scsi_host(vscsi->vdev);
+   struct virtio_scsi *vscsi = shost_priv(shost);
+   struct virtio_scsi_vq *req_vq = virtscsi_pick_vq_mq(vscsi, sc);
struct virtio_scsi_cmd *cmd = scsi_cmd_priv(sc);
unsigned long flags;
int req_size;
@@ -569,32 +578,6 @@ static int virtscsi_queuecommand(struct virtio_scsi *vscsi,
return 0;
 }
 
-static int virtscsi_queuecommand_single(struct Scsi_Host *sh,
-   struct scsi_cmnd *sc)
-{
-   struct virtio_scsi *vscsi = shost_priv(sh);
-
-   return virtscsi_queuecommand(vscsi, >req_vqs[0], sc);
-}
-
-static struct virtio_scsi_vq *virtscsi_pick_vq_mq(struct virtio_scsi *vscsi,
- struct scsi_cmnd *sc)
-{
-   u32 tag = blk_mq_unique_tag(sc->request);
-   u16 hwq = blk_mq_unique_tag_to_hwq(tag);
-
-   return >req_vqs[hwq];
-}
-
-static int virtscsi_queuecommand_multi(struct Scsi_Host *sh,
-  struct scsi_cmnd *sc)
-{
-   struct virtio_scsi *vscsi = shost_priv(sh);
-   struct virtio_scsi_vq *req_vq = virtscsi_pick_vq_mq(vscsi, sc);
-
-   return virtscsi_queuecommand(vscsi, req_vq, sc);
-}
-
 static int virtscsi_tmf(struct virtio_scsi *vscsi, struct virtio_scsi_cmd *cmd)
 {
DECLARE_COMPLETION_ONSTACK(comp);
@@ -750,34 +733,13 @@ static enum blk_eh_timer_return 
virtscsi_eh_timed_out(struct scsi_cmnd *scmnd)
return BLK_EH_RESET_TIMER;
 }
 
-static struct scsi_host_template virtscsi_host_template_single = {
-   .module = THIS_MODULE,
-   .name = "Virtio SCSI HBA",
-   .proc_name = "virtio_scsi",
-   .this_id = -1,
-   .cmd_size = sizeof(struct virtio_scsi_cmd),
-   .queuecommand = virtscsi_queuecommand_single,
-   .change_queue_depth = virtscsi_change_queue_depth,
-   .eh_abort_handler = virtscsi_abort,
-   .eh_device_reset_handler = virtscsi_device_reset,
-   .eh_timed_out = virtscsi_eh_timed_out,
-   .slave_alloc = virtscsi_device_alloc,
-
-   .dma_boundary = UINT_MAX,
-   .use_clustering = ENABLE_CLUSTERING,
-   .target_alloc = virtscsi_target_alloc,
-   .target_destroy = virtscsi_target_destroy,
-   .track_queue_depth = 1,
-   .force_blk_mq = 1,
-};
-
-static struct scsi_host_template virtscsi_host_template_multi = {
+static struct scsi_host_template virtscsi_host_template = {
.module = THIS_MODULE,
.name = "Virtio SCSI HBA",
.proc_name = "virtio_scsi",
.this_id = -1,
.cmd_size = sizeof(struct virtio_scsi_cmd),
-   .queuecommand = virtscsi_queuecommand_multi,
+   .queuecommand = virtscsi_queuecommand,
.change_queue_depth = virtscsi_change_queue_depth,
.eh_abort_handler = virtscsi_abort,
.eh_device_reset_handler = virtscsi_device_reset,
@@ -883,7 +845,6 @@ static int virtscsi_probe(struct virtio_device *vdev)
u32 sg_elems, num_targets;
u32 cmd_per_lun;
u32 num_queues;
-   struct scsi_host_template *hostt;
 
if (!vdev->config->get) {
dev_err(>dev, "%s failure: config access disabled\n",
@@ -896,12 +857,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
 
num_targets = virtscsi_config_get(vdev, max_target) + 1;
 
-   if (num_queues == 1)
-   hostt = _host_template_single;
-   else
-   

[PATCH V5 3/5] scsi: introduce force_blk_mq

2018-03-13 Thread Ming Lei
>From scsi driver view, it is a bit troublesome to support both blk-mq
and non-blk-mq at the same time, especially when drivers need to support
multi hw-queue.

This patch introduces 'force_blk_mq' to scsi_host_template so that drivers
can provide blk-mq only support, so driver code can avoid the trouble
for supporting both.

Cc: Omar Sandoval ,
Cc: "Martin K. Petersen" ,
Cc: James Bottomley ,
Cc: Christoph Hellwig ,
Cc: Don Brace 
Cc: Kashyap Desai 
Cc: Mike Snitzer 
Cc: Laurence Oberman 
Reviewed-by: Hannes Reinecke 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Ming Lei 
---
 drivers/scsi/hosts.c | 1 +
 include/scsi/scsi_host.h | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 57bf43e34863..cbbc32df7595 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -477,6 +477,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template 
*sht, int privsize)
shost->dma_boundary = 0x;
 
shost->use_blk_mq = scsi_use_blk_mq;
+   shost->use_blk_mq = scsi_use_blk_mq || shost->hostt->force_blk_mq;
 
device_initialize(>shost_gendev);
dev_set_name(>shost_gendev, "host%d", shost->host_no);
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 1a1df0d21ee3..6c6366f0bd15 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -452,6 +452,9 @@ struct scsi_host_template {
/* True if the controller does not support WRITE SAME */
unsigned no_write_same:1;
 
+   /* True if the low-level driver supports blk-mq only */
+   unsigned force_blk_mq:1;
+
/*
 * Countdown for host blocking with no commands outstanding.
 */
-- 
2.9.5



[PATCH V5 4/5] scsi: virtio_scsi: fix IO hang caused by irq vector automatic affinity

2018-03-13 Thread Ming Lei
Now 84676c1f21e8ff5(genirq/affinity: assign vectors to all possible CPUs)
has been merged to V4.16-rc, and it is easy to allocate all offline CPUs
for some irq vectors, this can't be avoided even though the allocation
is improved.

For example, on a 8cores VM, 4~7 are not-present/offline, 4 queues of
virtio-scsi, the irq affinity assigned can become the following shape:

irq 36, cpu list 0-7
irq 37, cpu list 0-7
irq 38, cpu list 0-7
irq 39, cpu list 0-1
irq 40, cpu list 4,6
irq 41, cpu list 2-3
irq 42, cpu list 5,7

Then IO hang is triggered in case of non-SCSI_MQ.

Given storage IO is always C/S model, there isn't such issue with 
SCSI_MQ(blk-mq),
because no IO can be submitted to one hw queue if the hw queue isn't
mapped to online CPUs.

Fix this issue by forcing to use blk_mq.

BTW, I have been used virtio-scsi(scsi_mq) for several years, and it has
been quite stable, so it shouldn't cause extra risk.

Cc: Omar Sandoval ,
Cc: "Martin K. Petersen" ,
Cc: James Bottomley ,
Cc: Christoph Hellwig ,
Cc: Don Brace 
Cc: Kashyap Desai 
Cc: Paolo Bonzini 
Cc: Mike Snitzer 
Cc: Laurence Oberman 
Reviewed-by: Hannes Reinecke 
Acked-by: Paolo Bonzini 
Reviewed-by: Christoph Hellwig 
Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs")
Signed-off-by: Ming Lei 
---
 drivers/scsi/virtio_scsi.c | 59 +++---
 1 file changed, 3 insertions(+), 56 deletions(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 7c28e8d4955a..54e3a0f6844c 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -91,9 +91,6 @@ struct virtio_scsi_vq {
 struct virtio_scsi_target_state {
seqcount_t tgt_seq;
 
-   /* Count of outstanding requests. */
-   atomic_t reqs;
-
/* Currently active virtqueue for requests sent to this target. */
struct virtio_scsi_vq *req_vq;
 };
@@ -152,8 +149,6 @@ static void virtscsi_complete_cmd(struct virtio_scsi 
*vscsi, void *buf)
struct virtio_scsi_cmd *cmd = buf;
struct scsi_cmnd *sc = cmd->sc;
struct virtio_scsi_cmd_resp *resp = >resp.cmd;
-   struct virtio_scsi_target_state *tgt =
-   scsi_target(sc->device)->hostdata;
 
dev_dbg(>device->sdev_gendev,
"cmd %p response %u status %#02x sense_len %u\n",
@@ -210,8 +205,6 @@ static void virtscsi_complete_cmd(struct virtio_scsi 
*vscsi, void *buf)
}
 
sc->scsi_done(sc);
-
-   atomic_dec(>reqs);
 }
 
 static void virtscsi_vq_done(struct virtio_scsi *vscsi,
@@ -580,10 +573,7 @@ static int virtscsi_queuecommand_single(struct Scsi_Host 
*sh,
struct scsi_cmnd *sc)
 {
struct virtio_scsi *vscsi = shost_priv(sh);
-   struct virtio_scsi_target_state *tgt =
-   scsi_target(sc->device)->hostdata;
 
-   atomic_inc(>reqs);
return virtscsi_queuecommand(vscsi, >req_vqs[0], sc);
 }
 
@@ -596,55 +586,11 @@ static struct virtio_scsi_vq *virtscsi_pick_vq_mq(struct 
virtio_scsi *vscsi,
return >req_vqs[hwq];
 }
 
-static struct virtio_scsi_vq *virtscsi_pick_vq(struct virtio_scsi *vscsi,
-  struct virtio_scsi_target_state 
*tgt)
-{
-   struct virtio_scsi_vq *vq;
-   unsigned long flags;
-   u32 queue_num;
-
-   local_irq_save(flags);
-   if (atomic_inc_return(>reqs) > 1) {
-   unsigned long seq;
-
-   do {
-   seq = read_seqcount_begin(>tgt_seq);
-   vq = tgt->req_vq;
-   } while (read_seqcount_retry(>tgt_seq, seq));
-   } else {
-   /* no writes can be concurrent because of atomic_t */
-   write_seqcount_begin(>tgt_seq);
-
-   /* keep previous req_vq if a reader just arrived */
-   if (unlikely(atomic_read(>reqs) > 1)) {
-   vq = tgt->req_vq;
-   goto unlock;
-   }
-
-   queue_num = smp_processor_id();
-   while (unlikely(queue_num >= vscsi->num_queues))
-   queue_num -= vscsi->num_queues;
-   tgt->req_vq = vq = >req_vqs[queue_num];
- unlock:
-   write_seqcount_end(>tgt_seq);
-   }
-   local_irq_restore(flags);
-
-   return vq;
-}
-
 static int virtscsi_queuecommand_multi(struct Scsi_Host *sh,
   struct scsi_cmnd *sc)
 {
struct virtio_scsi *vscsi = shost_priv(sh);
-   struct virtio_scsi_target_state *tgt =
-   

[PATCH V5 2/5] scsi: megaraid_sas: fix selection of reply queue

2018-03-13 Thread Ming Lei
>From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs),
one msix vector can be created without any online CPU mapped, then
command may be queued, and won't be notified after its completion.

This patch setups mapping between cpu and reply queue according to irq
affinity info retrived by pci_irq_get_affinity(), and uses this info
to choose reply queue for queuing one command.

Then the chosen reply queue has to be active, and fixes IO hang caused
by using inactive reply queue which doesn't have any online CPU mapped.

Cc: Hannes Reinecke 
Cc: "Martin K. Petersen" ,
Cc: James Bottomley ,
Cc: Christoph Hellwig ,
Cc: Don Brace 
Cc: Kashyap Desai 
Cc: Laurence Oberman 
Cc: Mike Snitzer 
Cc: Meelis Roos 
Cc: Artem Bityutskiy 
Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs")
Signed-off-by: Ming Lei 
---
 drivers/scsi/megaraid/megaraid_sas.h|  1 +
 drivers/scsi/megaraid/megaraid_sas_base.c   | 39 ++---
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 12 +++--
 3 files changed, 41 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas.h 
b/drivers/scsi/megaraid/megaraid_sas.h
index ba6503f37756..27fab8235ea5 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -2128,6 +2128,7 @@ enum MR_PD_TYPE {
 
 struct megasas_instance {
 
+   unsigned int *reply_map;
__le32 *producer;
dma_addr_t producer_h;
__le32 *consumer;
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c 
b/drivers/scsi/megaraid/megaraid_sas_base.c
index a71ee67df084..dde0798b8a91 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -5165,6 +5165,26 @@ megasas_setup_jbod_map(struct megasas_instance *instance)
instance->use_seqnum_jbod_fp = false;
 }
 
+static void megasas_setup_reply_map(struct megasas_instance *instance)
+{
+   const struct cpumask *mask;
+   unsigned int queue, cpu;
+
+   for (queue = 0; queue < instance->msix_vectors; queue++) {
+   mask = pci_irq_get_affinity(instance->pdev, queue);
+   if (!mask)
+   goto fallback;
+
+   for_each_cpu(cpu, mask)
+   instance->reply_map[cpu] = queue;
+   }
+   return;
+
+fallback:
+   for_each_possible_cpu(cpu)
+   instance->reply_map[cpu] = cpu % instance->msix_vectors;
+}
+
 /**
  * megasas_init_fw -   Initializes the FW
  * @instance:  Adapter soft state
@@ -5343,6 +5363,8 @@ static int megasas_init_fw(struct megasas_instance 
*instance)
goto fail_setup_irqs;
}
 
+   megasas_setup_reply_map(instance);
+
dev_info(>pdev->dev,
"firmware supports msix\t: (%d)", fw_msix_count);
dev_info(>pdev->dev,
@@ -6123,20 +6145,29 @@ static inline int megasas_alloc_mfi_ctrl_mem(struct 
megasas_instance *instance)
  */
 static int megasas_alloc_ctrl_mem(struct megasas_instance *instance)
 {
+   instance->reply_map = kzalloc(sizeof(unsigned int) * nr_cpu_ids,
+ GFP_KERNEL);
+   if (!instance->reply_map)
+   return -ENOMEM;
+
switch (instance->adapter_type) {
case MFI_SERIES:
if (megasas_alloc_mfi_ctrl_mem(instance))
-   return -ENOMEM;
+   goto fail;
break;
case VENTURA_SERIES:
case THUNDERBOLT_SERIES:
case INVADER_SERIES:
if (megasas_alloc_fusion_context(instance))
-   return -ENOMEM;
+   goto fail;
break;
}
 
return 0;
+ fail:
+   kfree(instance->reply_map);
+   instance->reply_map = NULL;
+   return -ENOMEM;
 }
 
 /*
@@ -6148,6 +6179,7 @@ static int megasas_alloc_ctrl_mem(struct megasas_instance 
*instance)
  */
 static inline void megasas_free_ctrl_mem(struct megasas_instance *instance)
 {
+   kfree(instance->reply_map);
if (instance->adapter_type == MFI_SERIES) {
if (instance->producer)
pci_free_consistent(instance->pdev, sizeof(u32),
@@ -6540,7 +6572,6 @@ static int megasas_probe_one(struct pci_dev *pdev,
pci_free_irq_vectors(instance->pdev);
 fail_init_mfi:
scsi_host_put(host);
-
 fail_alloc_instance:
pci_disable_device(pdev);
 
@@ -6746,6 +6777,8 @@ megasas_resume(struct pci_dev *pdev)
if (rval < 0)
goto fail_reenable_msix;
 
+   megasas_setup_reply_map(instance);
+
if (instance->adapter_type != MFI_SERIES) {

[PATCH V5 1/5] scsi: hpsa: fix selection of reply queue

2018-03-13 Thread Ming Lei
>From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs),
one msix vector can be created without any online CPU mapped, then one
command's completion may not be notified.

This patch setups mapping between cpu and reply queue according to irq
affinity info retrived by pci_irq_get_affinity(), and uses this mapping
table to choose reply queue for queuing one command.

Then the chosen reply queue has to be active, and fixes IO hang caused
by using inactive reply queue which doesn't have any online CPU mapped.

Cc: Hannes Reinecke 
Cc: "Martin K. Petersen" ,
Cc: James Bottomley ,
Cc: Christoph Hellwig ,
Cc: Don Brace 
Cc: Kashyap Desai 
Cc: Laurence Oberman 
Cc: Meelis Roos 
Cc: Artem Bityutskiy 
Cc: Mike Snitzer 
Tested-by: Laurence Oberman 
Tested-by: Don Brace 
Tested-by: Artem Bityutskiy 
Acked-by: Don Brace 
Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs")
Signed-off-by: Ming Lei 
---
 drivers/scsi/hpsa.c | 73 +++--
 drivers/scsi/hpsa.h |  1 +
 2 files changed, 55 insertions(+), 19 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 5293e6827ce5..3a9eca163db8 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -1045,11 +1045,7 @@ static void set_performant_mode(struct ctlr_info *h, 
struct CommandList *c,
c->busaddr |= 1 | (h->blockFetchTable[c->Header.SGList] << 1);
if (unlikely(!h->msix_vectors))
return;
-   if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
-   c->Header.ReplyQueue =
-   raw_smp_processor_id() % h->nreply_queues;
-   else
-   c->Header.ReplyQueue = reply_queue % h->nreply_queues;
+   c->Header.ReplyQueue = reply_queue;
}
 }
 
@@ -1063,10 +1059,7 @@ static void set_ioaccel1_performant_mode(struct 
ctlr_info *h,
 * Tell the controller to post the reply to the queue for this
 * processor.  This seems to give the best I/O throughput.
 */
-   if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
-   cp->ReplyQueue = smp_processor_id() % h->nreply_queues;
-   else
-   cp->ReplyQueue = reply_queue % h->nreply_queues;
+   cp->ReplyQueue = reply_queue;
/*
 * Set the bits in the address sent down to include:
 *  - performant mode bit (bit 0)
@@ -1087,10 +1080,7 @@ static void set_ioaccel2_tmf_performant_mode(struct 
ctlr_info *h,
/* Tell the controller to post the reply to the queue for this
 * processor.  This seems to give the best I/O throughput.
 */
-   if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
-   cp->reply_queue = smp_processor_id() % h->nreply_queues;
-   else
-   cp->reply_queue = reply_queue % h->nreply_queues;
+   cp->reply_queue = reply_queue;
/* Set the bits in the address sent down to include:
 *  - performant mode bit not used in ioaccel mode 2
 *  - pull count (bits 0-3)
@@ -1109,10 +1099,7 @@ static void set_ioaccel2_performant_mode(struct 
ctlr_info *h,
 * Tell the controller to post the reply to the queue for this
 * processor.  This seems to give the best I/O throughput.
 */
-   if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
-   cp->reply_queue = smp_processor_id() % h->nreply_queues;
-   else
-   cp->reply_queue = reply_queue % h->nreply_queues;
+   cp->reply_queue = reply_queue;
/*
 * Set the bits in the address sent down to include:
 *  - performant mode bit not used in ioaccel mode 2
@@ -1157,6 +1144,8 @@ static void __enqueue_cmd_and_start_io(struct ctlr_info 
*h,
 {
dial_down_lockup_detection_during_fw_flash(h, c);
atomic_inc(>commands_outstanding);
+
+   reply_queue = h->reply_map[raw_smp_processor_id()];
switch (c->cmd_type) {
case CMD_IOACCEL1:
set_ioaccel1_performant_mode(h, c, reply_queue);
@@ -7376,6 +7365,26 @@ static void hpsa_disable_interrupt_mode(struct ctlr_info 
*h)
h->msix_vectors = 0;
 }
 
+static void hpsa_setup_reply_map(struct ctlr_info *h)
+{
+   const struct cpumask *mask;
+   unsigned int queue, cpu;
+
+   for (queue = 0; queue < h->msix_vectors; queue++) {
+   mask = pci_irq_get_affinity(h->pdev, queue);
+   if (!mask)
+   goto fallback;
+
+   for_each_cpu(cpu, mask)
+   h->reply_map[cpu] = queue;
+   }
+   return;
+

[PATCH V5 0/5] SCSI: fix selection of reply(hw) queue

2018-03-13 Thread Ming Lei
Hi All,

The patches fixes reply queue(virt-queue on virtio-scsi) selection on hpsa,
megaraid_sa and virtio-scsi, and IO hang can be caused easily by this issue.

This issue is triggered by 84676c1f21e8 ("genirq/affinity: assign vectors
to all possible CPUs"). After 84676c1f21e8, it is easy to see one msix
vector mapped to all offline CPUs. If the reply queue is seleteced from
all allocated msix vectors(reply queues) in round-roin way, the selected
replay queue may not have any online CPU mapped, IO hang is caused.

Both hpsa and megaraid_sas uses host-wide tagset, we can't convert the
reply queue to blk_mq hw queue directly, otherwise IO performance is degraded
much, according to Kashyap's test, so this patchset sets up one mapping talbe
for selecting reply queue, and this approach has been used by mpt3sas already.

For virtio-scsi, the virt-queue is really hw queue wrt. blk-mq view, so
we introduce 'force_blk_mq' for fix this issue because: 1) virtio-blk
has been used for years in blk-mq mode; 2) we have discussed recently
that scsi_mq will be enabled at default soon.


gitweb:
https://github.com/ming1/linux/tree/v4.16-rc-select-reply-queue-fix-V5

V5:
- cover legacy vector for megaraid_sas(2/5)
- patch style change (4/5)
- add one virtio-scsi cleanup patch(5/5)

V4:
- splitted from previous patchset
- handle virtio-scsi by force_blk_mq

Ming Lei (5):
  scsi: hpsa: fix selection of reply queue
  scsi: megaraid_sas: fix selection of reply queue
  scsi: introduce force_blk_mq
  scsi: virtio_scsi: fix IO hang caused by irq vector automatic affinity
  scsi: virtio_scsi: unify scsi_host_template

 drivers/scsi/hosts.c|   1 +
 drivers/scsi/hpsa.c |  73 
 drivers/scsi/hpsa.h |   1 +
 drivers/scsi/megaraid/megaraid_sas.h|   1 +
 drivers/scsi/megaraid/megaraid_sas_base.c   |  39 -
 drivers/scsi/megaraid/megaraid_sas_fusion.c |  12 +--
 drivers/scsi/virtio_scsi.c  | 129 
 include/scsi/scsi_host.h|   3 +
 8 files changed, 116 insertions(+), 143 deletions(-)

-- 
2.9.5



Re: [PATCH] scsi: eata: drop VLA in reorder()

2018-03-13 Thread Christoph Hellwig
On Mon, Mar 12, 2018 at 10:35:36PM -0400, Martin K. Petersen wrote:
> No objections to Salvatore's patch but I have a slight affinity for
> retiring unused code over patching it. So unless there are objections...

Lets kill it.  And the not DMA capable eata_pio driver with it for
good riddance.

> 
> -- 
> Martin K. PetersenOracle Linux Engineering
---end quoted text---