Re: [PATCH] xen/scsifront: use offset_in_page() macro

2017-04-24 Thread Juergen Gross
On 22/04/17 03:21, Geliang Tang wrote:
> Use offset_in_page() macro instead of open-coding.
> 
> Signed-off-by: Geliang Tang 

Pushed to xen/tip for-linus-4.12


Thanks,

Juergen



Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-24 Thread Knut Omang
On Mon, 2017-04-24 at 10:14 -0600, Logan Gunthorpe wrote:
> 
> On 24/04/17 01:36 AM, Knut Omang wrote:
> > My first reflex when reading this thread was to think that this whole domain
> > lends it self excellently to testing via Qemu. Could it be that doing this 
> > in 
> > the opposite direction might be a safer approach in the long run even 
> > though 
> > (significant) more work up-front?
> 
> That's an interesting idea. We did do some very limited testing on qemu
> with one iteration of our work. However, it's difficult because there is
> no support for any RDMA devices which are a part of our primary use
> case. 

Yes, that's why I used 'significant'. One good thing is that given resources 
it can easily be done in parallel with other development, and will give 
additional
insight of some form.

> I also imagine it would be quite difficult to develop those models
> given the array of hardware that needs to be supported and the deep
> functional knowledge required to figure out appropriate restrictions.

>From my naive perspective it seems it need not even be a full model to get 
>some benefits,
just low level functionality tests with some instances of a
device that offers some MMIO space 'playground'.

Or maybe you can leverage some of the already implemented emulated devices in 
Qemu?

Knut

> 
> Logan


Re: [PATCH] xen/scsifront: use offset_in_page() macro

2017-04-24 Thread Juergen Gross
On 25/04/17 00:15, Martin K. Petersen wrote:
> 
> Juergen,
> 
>> On 22/04/17 03:21, Geliang Tang wrote:
>>> Use offset_in_page() macro instead of open-coding.
>>>
>>> Signed-off-by: Geliang Tang 
>>
>> Reviewed-by: Juergen Gross 
> 
> Taking this through the Xen tree or should I queue it?

I can take it through the Xen tree.


Thanks,

Juergen



Re: [PATCH] nvme/scsi: Consider LBA format in IO splitting calculation

2017-04-24 Thread Damien Le Moal
Jens,

On 4/25/17 13:32, Jens Axboe wrote:
>> I just realized that this macro is defined in linux/device-mapper.h,
>> which does not seem like to best place to have it. Why not blkdev.h ?
>> Any particular reason ? This leads to some strange include dependencies,
>> like many nfs/blocklayout/ files including device-mapper.h just to get
>> that definition.
> 
> I'm fine with moving it and using it everywhere. Right now we don't in
> the block core at all, 9 is always hard coded. While that change would
> be fine, it should be done independently of this patch, which is a real
> bug fix.

Thank you for the clarification. I will try to send something later.

Best regards.

-- 
Damien Le Moal,
Western Digital


Re: [PATCH] nvme/scsi: Consider LBA format in IO splitting calculation

2017-04-24 Thread Jens Axboe
On 04/24/2017 09:27 PM, Damien Le Moal wrote:
> Jon,
> 
> On 4/25/17 13:00, Jens Axboe wrote:
>> On Mon, Apr 24 2017, Jon Derrick wrote:
>>> The current command submission code uses a sector-based value when
>>> considering the maximum number of blocks per command. With a
>>> 4k-formatted namespace and a command exceeding max hardware limits, this
>>> calculation doesn't split IOs which should be split and fails in the
>>> nvme layer. This patch fixes that calculation and enables IO splitting
>>> in these circumstances.
>>>
>>> Signed-off-by: Jon Derrick 
>>> ---
>>>  drivers/nvme/host/scsi.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/nvme/host/scsi.c b/drivers/nvme/host/scsi.c
>>> index f49ae27..988da61 100644
>>> --- a/drivers/nvme/host/scsi.c
>>> +++ b/drivers/nvme/host/scsi.c
>>> @@ -1609,7 +1609,7 @@ static int nvme_trans_do_nvme_io(struct nvme_ns *ns, 
>>> struct sg_io_hdr *hdr,
>>> struct nvme_command c;
>>> u8 opcode = (is_write ? nvme_cmd_write : nvme_cmd_read);
>>> u16 control;
>>> -   u32 max_blocks = queue_max_hw_sectors(ns->queue);
>>> +   u32 max_blocks = queue_max_hw_sectors(ns->queue) >> (ns->lba_shift - 9);
>>>  
>>> num_cmds = nvme_trans_io_get_num_cmds(hdr, cdb_info, max_blocks);
>>
>> Patch looks correct to me, as we always consider the hw sectors settings
>> in units of 512b blocks.
>>
>> Reviewed-by: Jens Axboe 
> 
> May be replace 9 with SECTOR_SHIFT ?
> 
> Jens,
> 
> I just realized that this macro is defined in linux/device-mapper.h,
> which does not seem like to best place to have it. Why not blkdev.h ?
> Any particular reason ? This leads to some strange include dependencies,
> like many nfs/blocklayout/ files including device-mapper.h just to get
> that definition.

I'm fine with moving it and using it everywhere. Right now we don't in
the block core at all, 9 is always hard coded. While that change would
be fine, it should be done independently of this patch, which is a real
bug fix.

-- 
Jens Axboe



Re: [PATCH] nvme/scsi: Consider LBA format in IO splitting calculation

2017-04-24 Thread Damien Le Moal
Jon,

On 4/25/17 13:00, Jens Axboe wrote:
> On Mon, Apr 24 2017, Jon Derrick wrote:
>> The current command submission code uses a sector-based value when
>> considering the maximum number of blocks per command. With a
>> 4k-formatted namespace and a command exceeding max hardware limits, this
>> calculation doesn't split IOs which should be split and fails in the
>> nvme layer. This patch fixes that calculation and enables IO splitting
>> in these circumstances.
>>
>> Signed-off-by: Jon Derrick 
>> ---
>>  drivers/nvme/host/scsi.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/nvme/host/scsi.c b/drivers/nvme/host/scsi.c
>> index f49ae27..988da61 100644
>> --- a/drivers/nvme/host/scsi.c
>> +++ b/drivers/nvme/host/scsi.c
>> @@ -1609,7 +1609,7 @@ static int nvme_trans_do_nvme_io(struct nvme_ns *ns, 
>> struct sg_io_hdr *hdr,
>>  struct nvme_command c;
>>  u8 opcode = (is_write ? nvme_cmd_write : nvme_cmd_read);
>>  u16 control;
>> -u32 max_blocks = queue_max_hw_sectors(ns->queue);
>> +u32 max_blocks = queue_max_hw_sectors(ns->queue) >> (ns->lba_shift - 9);
>>  
>>  num_cmds = nvme_trans_io_get_num_cmds(hdr, cdb_info, max_blocks);
> 
> Patch looks correct to me, as we always consider the hw sectors settings
> in units of 512b blocks.
> 
> Reviewed-by: Jens Axboe 

May be replace 9 with SECTOR_SHIFT ?

Jens,

I just realized that this macro is defined in linux/device-mapper.h,
which does not seem like to best place to have it. Why not blkdev.h ?
Any particular reason ? This leads to some strange include dependencies,
like many nfs/blocklayout/ files including device-mapper.h just to get
that definition.

Best regards.

-- 
Damien Le Moal, Ph.D.
Sr. Manager, System Software Research Group,
Western Digital Corporation
damien.lem...@wdc.com
(+81) 0466-98-3593 (ext. 513593)
1 kirihara-cho, Fujisawa,
Kanagawa, 252-0888 Japan
www.wdc.com, www.hgst.com


Re: [PATCH] nvme/scsi: Consider LBA format in IO splitting calculation

2017-04-24 Thread Jens Axboe
On Mon, Apr 24 2017, Jon Derrick wrote:
> The current command submission code uses a sector-based value when
> considering the maximum number of blocks per command. With a
> 4k-formatted namespace and a command exceeding max hardware limits, this
> calculation doesn't split IOs which should be split and fails in the
> nvme layer. This patch fixes that calculation and enables IO splitting
> in these circumstances.
> 
> Signed-off-by: Jon Derrick 
> ---
>  drivers/nvme/host/scsi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/scsi.c b/drivers/nvme/host/scsi.c
> index f49ae27..988da61 100644
> --- a/drivers/nvme/host/scsi.c
> +++ b/drivers/nvme/host/scsi.c
> @@ -1609,7 +1609,7 @@ static int nvme_trans_do_nvme_io(struct nvme_ns *ns, 
> struct sg_io_hdr *hdr,
>   struct nvme_command c;
>   u8 opcode = (is_write ? nvme_cmd_write : nvme_cmd_read);
>   u16 control;
> - u32 max_blocks = queue_max_hw_sectors(ns->queue);
> + u32 max_blocks = queue_max_hw_sectors(ns->queue) >> (ns->lba_shift - 9);
>  
>   num_cmds = nvme_trans_io_get_num_cmds(hdr, cdb_info, max_blocks);

Patch looks correct to me, as we always consider the hw sectors settings
in units of 512b blocks.

Reviewed-by: Jens Axboe 

-- 
Jens Axboe



[PATCH -next] [SCSI] mac_esp: fix to pass correct device identity to free_irq()

2017-04-24 Thread Wei Yongjun
From: Wei Yongjun 

free_irq() expects the same device identity that was passed to
corresponding request_irq(), otherwise the IRQ is not freed.

Signed-off-by: Wei Yongjun 
---
 drivers/scsi/mac_esp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/mac_esp.c b/drivers/scsi/mac_esp.c
index bb567d3..dda4da2 100644
--- a/drivers/scsi/mac_esp.c
+++ b/drivers/scsi/mac_esp.c
@@ -580,7 +580,7 @@ static int esp_mac_probe(struct platform_device *dev)
 
 fail_free_irq:
if (esp_chips[!dev->id] == NULL)
-   free_irq(host->irq, esp);
+   free_irq(host->irq, NULL);
 fail_free_priv:
kfree(mep);
 fail_free_command_block:





Re: [PATCH] scsi/mac_esp: Replace bogus memory barrier with spinlock

2017-04-24 Thread Michael Schmitz
Martin,

looks good to me, so:

Reviewed-By: Michael Schmitz 


Am 25.04.2017 um 10:29 schrieb Martin K. Petersen:
> 
> Finn,
> 
>> Commit da244654c66e ("[SCSI] mac_esp: fix for quadras with two esp chips")
>> added mac_scsi_esp_intr() to handle the IRQ lines from a pair of on-board
>> ESP chips (a normal shared IRQ did not work).
>>
>> Proper mutual exclusion was missing from that patch. This patch fixes
>> race conditions between comparison and assignment of esp_chips[]
>> pointers.
> 
> Ondrej/Michael: Mind reviewing this change?
> 


Re: [PATCH v2 3/7] sd: Cleanup sd_done sense data handling

2017-04-24 Thread Damien Le Moal
Martin,

On 4/25/17 08:11, Martin K. Petersen wrote:
> 
> Christoph,
> 
>> Use a switch for the sense key, and remove two pointless variables
>> that are only used once.
> 
>> -if (unmap)
> 
> The rationale behind the unmap variable was clarity and avoiding magic
> values.
> 
> I'm OK with this, however:
> 
>> +if (SCpnt->cmnd[1] & 8) { /* UNMAP */
> 
> So I committed with that change.

Thank you for the commit.

Just one remark:
You left the "good_bytes = 0;" in the else statement of the
ILLEGAL_REQUEST && asc == 0x24 && command == WRITE_SAME case. Is it
really necessary ?

good_bytes is already set to 0 at the beginning of sd_done() when result
!= 0 and again within the initial switch-case for REQ_OP_DISCARD and
REQ_OP_WRITE_SAME cases, which are the only commands that can lead to
hitting that "else" part in the sense data processing.
I may be missing something, but I think that that assignment is redundant.

Best regards.

-- 
Damien Le Moal, Ph.D.
Sr. Manager, System Software Research Group,
Western Digital Corporation
damien.lem...@wdc.com
(+81) 0466-98-3593 (ext. 513593)
1 kirihara-cho, Fujisawa,
Kanagawa, 252-0888 Japan
www.wdc.com, www.hgst.com


[PATCH] nvme/scsi: Consider LBA format in IO splitting calculation

2017-04-24 Thread Jon Derrick
The current command submission code uses a sector-based value when
considering the maximum number of blocks per command. With a
4k-formatted namespace and a command exceeding max hardware limits, this
calculation doesn't split IOs which should be split and fails in the
nvme layer. This patch fixes that calculation and enables IO splitting
in these circumstances.

Signed-off-by: Jon Derrick 
---
 drivers/nvme/host/scsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/scsi.c b/drivers/nvme/host/scsi.c
index f49ae27..988da61 100644
--- a/drivers/nvme/host/scsi.c
+++ b/drivers/nvme/host/scsi.c
@@ -1609,7 +1609,7 @@ static int nvme_trans_do_nvme_io(struct nvme_ns *ns, 
struct sg_io_hdr *hdr,
struct nvme_command c;
u8 opcode = (is_write ? nvme_cmd_write : nvme_cmd_read);
u16 control;
-   u32 max_blocks = queue_max_hw_sectors(ns->queue);
+   u32 max_blocks = queue_max_hw_sectors(ns->queue) >> (ns->lba_shift - 9);
 
num_cmds = nvme_trans_io_get_num_cmds(hdr, cdb_info, max_blocks);
 
-- 
2.9.3



Re: [PATCH v4 10/10] scsi: Implement blk_mq_ops.show_rq()

2017-04-24 Thread Bart Van Assche
On Mon, 2017-04-24 at 19:33 -0400, Martin K. Petersen wrote:
> > The debugfs infrastructure is already there, and it already supports
> > showing requests. Bart is just exposing the SCSI information.
> 
> That's fine.
> 
> I was merely objecting to the fact that we already have umpteen existing
> interfaces for displaying SCSI command information.

Hello Martin,

Do you perhaps want me to change the for-loop into a call to
__scsi_format_command()?

Thanks,

Bart.

Re: [PATCH v4 10/10] scsi: Implement blk_mq_ops.show_rq()

2017-04-24 Thread Martin K. Petersen

Omar,

> The debugfs infrastructure is already there, and it already supports
> showing requests. Bart is just exposing the SCSI information.

That's fine.

I was merely objecting to the fact that we already have umpteen existing
interfaces for displaying SCSI command information.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi/mac_esp: Replace bogus memory barrier with spinlock

2017-04-24 Thread Martin K. Petersen

Michael,

> I must have missed that one - where was it posted to?

Usual bat channel:

https://patchwork.kernel.org/patch/9658369/

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi/mac_esp: Replace bogus memory barrier with spinlock

2017-04-24 Thread Michael Schmitz
Hi Martin,

I must have missed that one - where was it posted to?

Cheers,

Michael


Am 25.04.2017 um 10:29 schrieb Martin K. Petersen:
> 
> Finn,
> 
>> Commit da244654c66e ("[SCSI] mac_esp: fix for quadras with two esp chips")
>> added mac_scsi_esp_intr() to handle the IRQ lines from a pair of on-board
>> ESP chips (a normal shared IRQ did not work).
>>
>> Proper mutual exclusion was missing from that patch. This patch fixes
>> race conditions between comparison and assignment of esp_chips[]
>> pointers.
> 
> Ondrej/Michael: Mind reviewing this change?
> 


Re: [PATCH v4 10/10] scsi: Implement blk_mq_ops.show_rq()

2017-04-24 Thread Omar Sandoval
On Mon, Apr 24, 2017 at 07:19:50PM -0400, Martin K. Petersen wrote:
> 
> Bart,
> 
> > SCSI tracing has to be enabled before a test is started, produces a
> > huge amount of data, and deriving state information from a huge trace
> > is far from easy. The information in debugfs provides an easy to read
> > overview of the current state without having to analyze megabytes of
> > traces, without introducing any slowdown and without having to enable
> > any tracing mechanism from beforehand.
> 
> Fair enough. Just seems like there's an obvious overlap in plumbing.
> Don't know if that can be leveraged instead of introducing something
> completely new?

The debugfs infrastructure is already there, and it already supports
showing requests. Bart is just exposing the SCSI information.


Re: [PATCH v4 10/10] scsi: Implement blk_mq_ops.show_rq()

2017-04-24 Thread Martin K. Petersen

Bart,

> SCSI tracing has to be enabled before a test is started, produces a
> huge amount of data, and deriving state information from a huge trace
> is far from easy. The information in debugfs provides an easy to read
> overview of the current state without having to analyze megabytes of
> traces, without introducing any slowdown and without having to enable
> any tracing mechanism from beforehand.

Fair enough. Just seems like there's an obvious overlap in plumbing.
Don't know if that can be leveraged instead of introducing something
completely new?

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH v2 3/7] sd: Cleanup sd_done sense data handling

2017-04-24 Thread Martin K. Petersen

Christoph,

> Use a switch for the sense key, and remove two pointless variables
> that are only used once.

> - if (unmap)

The rationale behind the unmap variable was clarity and avoiding magic
values.

I'm OK with this, however:

> + if (SCpnt->cmnd[1] & 8) { /* UNMAP */

So I committed with that change.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 1/2] scsi: ufs: use MASK_EE_STATUS

2017-04-24 Thread Martin K. Petersen

> From: Tomohiro Kusumi 
>
> MASK_EE_STATUS added by 66ec6d59 was unused, but it seems to have been
> defined to do this.
>
> Signed-off-by: Tomohiro Kusumi 

Subhash: Please review these two. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH V2] scsi: mpt3sas: remove redundant wmb

2017-04-24 Thread Martin K. Petersen

Sinan,

> Due to relaxed ordering requirements on multiple architectures,
> drivers are required to use wmb/rmb/mb combinations when they need to
> guarantee observability between the memory and the HW.

Applied to 4.12/scsi-queue, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH v2 1/1] aacraid: pci_alloc_consistent() failures on ARM64

2017-04-24 Thread Martin K. Petersen

> There were pci_alloc_consistent() failures on ARM64 platform.
> Use dma_alloc_coherent() with GFP_KERNEL flag DMA memory allocations.

Somebody please review!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi/mac_esp: Replace bogus memory barrier with spinlock

2017-04-24 Thread Martin K. Petersen

Finn,

> Commit da244654c66e ("[SCSI] mac_esp: fix for quadras with two esp chips")
> added mac_scsi_esp_intr() to handle the IRQ lines from a pair of on-board
> ESP chips (a normal shared IRQ did not work).
>
> Proper mutual exclusion was missing from that patch. This patch fixes
> race conditions between comparison and assignment of esp_chips[]
> pointers.

Ondrej/Michael: Mind reviewing this change?

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] mpt: Move scsi_remove_host() out of mptscsih_remove_host()

2017-04-24 Thread Martin K. Petersen

Hannes,

> Commit c5ce0ab ("scsi: sas: move scsi_remove_host call...")
> moved the call to scsi_remove_host() into sas_remove_host(),
> but forgot to modify the mpt drivers.

Applied to 4.12/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] sg: reset 'res_in_use' after unlinking reserved array

2017-04-24 Thread Martin K. Petersen

Hannes,

> Once the reserved page array is unused we can reset the 'res_in_use'
> state; here we can do a lazy update without holding the mutex as we
> only need to check against concurrent access, not concurrent release.

Applied to 4.12/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: mvumi: remove code handling zero scsi_sg_count(scmd) case

2017-04-24 Thread Martin K. Petersen

Alexey,

> As Christoph Hellwig noted, SCSI commands that transfer data
> always have a SG entry. The patch removes dead code in
> mvumi_make_sgl(),  mvumi_complete_cmd() and mvumi_timed_out()
> that handle zero scsi_sg_count(scmd) case.
>
> Also the patch adds pci_unmap_sg() on failure path in mvumi_make_sgl().

Applied to 4.12/scsi-queue, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] xen/scsifront: use offset_in_page() macro

2017-04-24 Thread Martin K. Petersen

Juergen,

> On 22/04/17 03:21, Geliang Tang wrote:
>> Use offset_in_page() macro instead of open-coding.
>> 
>> Signed-off-by: Geliang Tang 
>
> Reviewed-by: Juergen Gross 

Taking this through the Xen tree or should I queue it?

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: pmcraid: use normal copy_from_user

2017-04-24 Thread Martin K. Petersen

Arnd,

> As pointed out by Al Viro for my previous series, the driver has no need
> to call access_ok() and __copy_from_user()/__copy_to_user(). Changing
> it to regular copy_from_user()/copy_to_user() simplifies the code without
> any real downsides, making it less error-prone at best.
>
> This patch by itself also addresses the warning about the access_ok()
> macro on MIPS, but both fixes improve the code, so ideally we apply
> them both.

Applied patches 1, 3, 4 as well as this one to 4.12/scsi-queue. I took
Christoph's version of patch 2.

Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 2/4] scsi: pmcraid: fix lock imbalance in pmcraid_reset_reload()

2017-04-24 Thread Martin K. Petersen

Christoph,

> sparse found a bug that has always been present since the driver was
> merged:
>
> drivers/scsi/pmcraid.c:2353:12: warning: context imbalance in 
> 'pmcraid_reset_reload' - different lock contexts for basic block
>
> Fix this by using a common unlock goto label, and also reduce the
> indentation level in the function.

Applied to 4.12/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH v4 10/10] scsi: Implement blk_mq_ops.show_rq()

2017-04-24 Thread Bart Van Assche
On Mon, 2017-04-24 at 17:35 -0400, Martin K. Petersen wrote:
> > Show the SCSI CDB, .eh_eflags and .result for pending SCSI commands
> > in /sys/kernel/debug/block/*/mq/*/dispatch and */rq_list.
> 
> Why not use SCSI tracing if you are interested in these?

Hello Martin,

SCSI tracing has to be enabled before a test is started, produces a huge amount
of data, and deriving state information from a huge trace is far from easy. The
information in debugfs provides an easy to read overview of the current state
without having to analyze megabytes of traces, without introducing any slowdown
and without having to enable any tracing mechanism from beforehand.

Bart.

Re: [PATCH v3 3/4] sd: Make synchronize cache upon shutdown asynchronous

2017-04-24 Thread Bart Van Assche
On Sun, 2017-04-23 at 12:28 -0500, James Bottomley wrote:
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1250,6 +1250,12 @@ scsi_prep_state_check(struct scsi_device *sdev, struct 
> request *req)
>   break;
>   case SDEV_BLOCK:
>   case SDEV_CREATED_BLOCK:
> + /* q lock is held only in the non-mq case */
> + if (req->q->mq_ops)
> + blk_mq_stop_hw_queues(req->q);
> + else
> + blk_stop_queue(req->q);
> +
>   ret = BLKPREP_DEFER;
>   break;
>   case SDEV_QUIESCE:

Hello James,

This change swaps the order of changing the device state and the block layer
state. Sorry but I don't like this. What will happen if e.g. the disk event
checker decides to check for events just before __scsi_remove_device()
changes the device state? I think that can that cause sd_shutdown() to be
called with the block layer queue stopped and hence that with this approach
it is still possible that sd_shutdown() hangs.

> @@ -2611,7 +2617,6 @@ scsi_device_set_state(struct scsi_device *sdev, enum 
> scsi_device_state state)
>   case SDEV_QUIESCE:
>   case SDEV_OFFLINE:
>   case SDEV_TRANSPORT_OFFLINE:
> - case SDEV_BLOCK:
>   break;
>   default:
>   goto illegal;

A previous patch made two changes to scsi_device_set_state(). Are you sure
that we do no longer have to enable the SDEV_BLOCK to SDEV_DEL transition?

> @@ -2844,10 +2849,12 @@ static int scsi_request_fn_active(struct scsi_device 
> *sdev)
>   */
>  static void scsi_wait_for_queuecommand(struct scsi_device *sdev)
>  {
> - WARN_ON_ONCE(sdev->host->use_blk_mq);
> -
> - while (scsi_request_fn_active(sdev))
> - msleep(20);
> + if (sdev->request_queue->mq_ops) {
> + synchronize_rcu();
> + } else {
> + while (scsi_request_fn_active(sdev))
> + msleep(20);
> + }
>  }

The above code makes an assumption about the block layer internals, namely
that calling synchronize_rcu() is sufficient to wait for outstanding requests
to finish. Please do not embed any assumptions about block layer internals in
SCSI code but keep code that relies on block layer internals in the block
layer. If you have a look at blk_mq_quiesce_queue() then you will see that
calling synchronize_rcu() is not sufficient for hardware queues for which
BLK_MQ_F_BLOCKING has been set. I am aware that today the SCSI core does not
set that flag. However, the dependency of the dependency of the
synchronize_rcu() call on BLK_MQ_F_BLOCKING not being set is nontrivial.

Thanks,

Bart.


Re: [PATCH v4 10/10] scsi: Implement blk_mq_ops.show_rq()

2017-04-24 Thread Martin K. Petersen

Bart,

> Show the SCSI CDB, .eh_eflags and .result for pending SCSI commands
> in /sys/kernel/debug/block/*/mq/*/dispatch and */rq_list.

Why not use SCSI tracing if you are interested in these?

-- 
Martin K. Petersen  Oracle Linux Engineering


EXTREMELY IMPORTANT

2017-04-24 Thread Ms. Katherine
Dear Beloved, Sorry for the inconvenience, I am getting in touch with you 
regarding an extremely important and urgent matter, Please I need your urgent 
assistance and idea to finish up a project (Orphanage Home) Due to my health 
condition, Everything is available to finish up the project, All I need is your 
idea and trust.

Sehr geehrte Geliebte, traurig für die Unannehmlichkeiten, ich bin in Kontakt 
mit Ihnen über eine äußerst wichtige und dringende Angelegenheit, Bitte brauche 
ich Ihre dringende Hilfe und Idee, um ein Projekt zu beenden (Orphanage Home) 
Wegen meines Gesundheitszustandes ist alles verfügbar Beenden Sie das Projekt, 
Alles was ich brauche ist Ihre Idee und Vertrauen.

Please contact me for more details.
Contact Email: kathrynnmi...@gmail.com

Thanks

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus


-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.



[PATCH] scsi: fcoe: make fcoe_e_d_tov and fcoe_r_a_tov static

2017-04-24 Thread Colin King
From: Colin Ian King 

These module parameter variables don't need global scope, make them static

Signed-off-by: Colin Ian King 
---
 drivers/scsi/fcoe/fcoe.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index ab7bc1505e0b..90939f66bc0d 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -63,11 +63,11 @@ unsigned int fcoe_debug_logging;
 module_param_named(debug_logging, fcoe_debug_logging, int, S_IRUGO|S_IWUSR);
 MODULE_PARM_DESC(debug_logging, "a bit mask of logging levels");
 
-unsigned int fcoe_e_d_tov = 2 * 1000;
+static unsigned int fcoe_e_d_tov = 2 * 1000;
 module_param_named(e_d_tov, fcoe_e_d_tov, int, S_IRUGO|S_IWUSR);
 MODULE_PARM_DESC(e_d_tov, "E_D_TOV in ms, default 2000");
 
-unsigned int fcoe_r_a_tov = 2 * 2 * 1000;
+static unsigned int fcoe_r_a_tov = 2 * 2 * 1000;
 module_param_named(r_a_tov, fcoe_r_a_tov, int, S_IRUGO|S_IWUSR);
 MODULE_PARM_DESC(r_a_tov, "R_A_TOV in ms, default 4000");
 
-- 
2.11.0



Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-24 Thread Logan Gunthorpe


On 24/04/17 01:36 AM, Knut Omang wrote:
> My first reflex when reading this thread was to think that this whole domain
> lends it self excellently to testing via Qemu. Could it be that doing this in 
> the opposite direction might be a safer approach in the long run even though 
> (significant) more work up-front?

That's an interesting idea. We did do some very limited testing on qemu
with one iteration of our work. However, it's difficult because there is
no support for any RDMA devices which are a part of our primary use
case. I also imagine it would be quite difficult to develop those models
given the array of hardware that needs to be supported and the deep
functional knowledge required to figure out appropriate restrictions.

Logan


Re: [PATCH] lpfc: Fix panic on BFS configuration.

2017-04-24 Thread James Smart

Please disregard this patch. there is an issue in the fix

-- james


On 4/21/2017 5:24 PM, jsmart2...@gmail.com wrote:

From: James Smart 

To select the appropriate shost template, the driver is issuing
a mailbox command to retrieve the wwn. Turns out the sending of
the command precedes the reset of the function.  On SLI-4 adapters,
this is inconsequential as the mailbox command location is specified
by dma via the BMBX register. However, on SLI-3 adapters, the
location of the mailbox command submission area changes. When the
function is first powered on or reset, the cmd is submitted via PCI
bar memory. Later the driver changes the function config to use
host memory and DMA. The request to start a mailbox command is the
same, a simple doorbell write, regardless of submission area.
So.. if there has not been a boot driver run against the adapter,
the mailbox command works as defaults are ok. But, if the boot
driver has configured the card and, and if no platform pci
function/slot reset occurs as the os starts, the mailbox command
will fail. The SLI-3 device will use the stale boot driver dma
location. This can cause PCI eeh errors.

Fix is to reset the sli-3 function before sending the
mailbox command, thus synchronizing the function/driver on mailbox
location.

This issue was introduced by this patch:
http://www.spinics.net/lists/linux-scsi/msg105908.html
which is in the stable pools with commit id:
96418b5e2c8867da3279d877f5d1ffabfe460c3d

This patch needs to be applied to the stable trees where ever the
introducing patch exists.

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Cc: sta...@vger.kernel.org




Re: [PATCH v2 7/7] sd_zbc: Do not write lock zones for reset

2017-04-24 Thread Christoph Hellwig
Looks fine,

Reviewed-by: Christoph Hellwig 


Re: [PATCH v2 5/7] sd_zbc: Rename sd_zbc_setup_write_cmnd

2017-04-24 Thread Christoph Hellwig
Looks fine,

Reviewed-by: Christoph Hellwig 


Re: [PATCH v2 6/7] sd_zbc: Remove superfluous assignments

2017-04-24 Thread Christoph Hellwig
Looks fine,

Reviewed-by: Christoph Hellwig 


Re: [PATCH v2 4/7] scsi: Improve scsi_get_sense_info_fld

2017-04-24 Thread Christoph Hellwig
Looks fine,

Reviewed-by: Christoph Hellwig 


Re: [PATCH v2 3/7] sd: Cleanup sd_done sense data handling

2017-04-24 Thread Christoph Hellwig
I would just use a switch and kill the op and unmap variables entirely,
e.g. something like the patch below:

---
From: Christoph Hellwig 
Subject: sd: Cleanup sd_done sense data handling

Use a switch for the sense key, and remove two pointless variables
that are only used once.

Signed-off-by: Christoph Hellwig 
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 8cf34a8e3eea..eec695107c99 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1866,8 +1866,6 @@ static int sd_done(struct scsi_cmnd *SCpnt)
struct request *req = SCpnt->request;
int sense_valid = 0;
int sense_deferred = 0;
-   unsigned char op = SCpnt->cmnd[0];
-   unsigned char unmap = SCpnt->cmnd[1] & 8;
 
switch (req_op(req)) {
case REQ_OP_DISCARD:
@@ -1941,19 +1939,21 @@ static int sd_done(struct scsi_cmnd *SCpnt)
good_bytes = sd_completed_bytes(SCpnt);
break;
case ILLEGAL_REQUEST:
-   if (sshdr.asc == 0x10)  /* DIX: Host detected corruption */
+   switch (sshdr.asc) {
+   case 0x10:  /* DIX: Host detected corruption */
good_bytes = sd_completed_bytes(SCpnt);
-   /* INVALID COMMAND OPCODE or INVALID FIELD IN CDB */
-   if (sshdr.asc == 0x20 || sshdr.asc == 0x24) {
-   switch (op) {
+   break;
+   case 0x20:  /* INVALID COMMAND OPCODE */
+   case 0x24:  /* INVALID FIELD IN CDB */
+   switch (SCpnt->cmnd[0]) {
case UNMAP:
sd_config_discard(sdkp, SD_LBP_DISABLE);
break;
case WRITE_SAME_16:
case WRITE_SAME:
-   if (unmap)
+   if (SCpnt->cmnd[1] & 8) {
sd_config_discard(sdkp, SD_LBP_DISABLE);
-   else {
+   } else {
sdkp->device->no_write_same = 1;
sd_config_write_same(sdkp);
 
@@ -1961,6 +1961,7 @@ static int sd_done(struct scsi_cmnd *SCpnt)
req->__data_len = blk_rq_bytes(req);
req->rq_flags |= RQF_QUIET;
}
+   break;
}
}
break;


Re: [PATCH v2 2/7] sd: Improve sd_completed_bytes

2017-04-24 Thread Christoph Hellwig
On Mon, Apr 24, 2017 at 04:51:10PM +0900, damien.lem...@wdc.com wrote:
> From: Damien Le Moal 
> 
> Re-shuffle the code to be more efficient by not initializing
> variables upfront (i.e. do it only when necessary).
> Also replace the do_div calls with calls to sectors_to_logical().
> 
> No functional change is introduced by this patch.
> 
> Signed-off-by: Damien Le Moal 

Looks fine,

Reviewed-by: Christoph Hellwig 


Re: [PATCH v2 1/7] sd: Fix functions description

2017-04-24 Thread Christoph Hellwig
Looks good,

Reviewed-by: Christoph Hellwig 


Re: [PATCH] sg: reset 'res_in_use' after unlinking reserved array

2017-04-24 Thread Christoph Hellwig
Looks fine,

Reviewed-by: Christoph Hellwig 


Re: [PATCH] mpt: Move scsi_remove_host() out of mptscsih_remove_host()

2017-04-24 Thread Christoph Hellwig
Looks good,

Reviewed-by: Christoph Hellwig 


Re: [RFC] scsi: reduce protection of scan_mutex in scsi_remove_device

2017-04-24 Thread Christoph Hellwig
On Fri, Apr 21, 2017 at 02:13:02PM -0700, Song Liu wrote:
> When a device is deleted through sysfs handle "delete", the code
> locks shost->scan_mutex. If multiple devices are deleted at the
> same time, these deletes will be handled in series.
> 
> On the other hand, some devices do long latency IO during deletion,
> for example, sd_shutdown() may do sync cache and/or start_stop.
> It is not necessary for these commands to run in series.
> 
> To reduce latency of parallel "delete" requests, this patch reduces
> the protection of scan_mutex. The only function with Scsi_Host
> called in __scsi_remove_device() is the optional slave_destroy().
> Therefore, the protection of scan_mutex is only necessary for this
> function.

And I don't think it makes sense for slave_destroy either.  Please
do a quick audit of the instances and drop the lock for it, too.

> 
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 82dfe07..e7a9e28 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -610,7 +610,7 @@ static int scsi_sdev_check_buf_bit(const char *buf)
>   return 1;
>   else if (buf[0] == '0')
>   return 0;
> - else 
> + else
>   return -EINVAL;

Also the patch has a few odd whitespace changes like this.  Please
remove those before reposting.


[GIT PULL] SCSI fixes for 4.11-rc8

2017-04-24 Thread James Bottomley
Our final fix before the 4.12 release (hopefully).  It's an error leg
again: the fix to not bug on empty DMA transfers is returning the wrong
code and confusing the block layer. 

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Johannes Thumshirn (1):
  scsi: return correct blkprep status code in case scsi_init_io() fails.

With the diffstat:

 drivers/scsi/scsi_lib.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

And full diff below.

James

---
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 19125d72f322..5558e212368b 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1061,10 +1061,10 @@ int scsi_init_io(struct scsi_cmnd *cmd)
struct scsi_device *sdev = cmd->device;
struct request *rq = cmd->request;
bool is_mq = (rq->mq_ctx != NULL);
-   int error;
+   int error = BLKPREP_KILL;
 
if (WARN_ON_ONCE(!blk_rq_nr_phys_segments(rq)))
-   return -EINVAL;
+   goto err_exit;
 
error = scsi_init_sgtable(rq, &cmd->sdb);
if (error)




Re: [PATCH] scsi: sas: move scsi_remove_host call into sas_remove_host

2017-04-24 Thread Johannes Thumshirn
On Mon, Apr 24, 2017 at 02:38:40PM +0100, John Garry wrote:
> On 24/04/2017 11:09, John Garry wrote:
> >On 21/04/2017 13:11, Johannes Thumshirn wrote:
> >>Move scsi_remove_host call into sas_remove_host and remove it from SAS
> >>HBA
> >>drivers, so we don't mess up the ordering. This solves an issue with
> >>double
> >>deleting sysfs entries that was introduced by the change of sysfs
> >>behaviour
> >>from commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive").
> >>
> >>Signed-off-by: Johannes Thumshirn 
> >>Suggested-by: Christoph Hellwig 
> >>Cc: Hannes Reinecke 
> >>Cc: James Bottomley 
> >>Cc: Jinpu Wang 
> >>Cc: John Garry 
> >
> >For what it's worth:
> >
> >Tested-by: John Garry  # On hisi_sas
> >
> 
> I have actually tested this a little further and now I see an issue. Maybe
> it is related to the internal development kernel I am using or a
> pre-existing issue with our driver, but I doubt it.
> This time I removed the WARN in sysfs_remove_group() [so the console is not
> bombarded] and ran repeated insmod/rmmod, and I see this sometimes:

[...]


> Ring any bells? I don't see this new WARN without the change.

No haven't seen the WARN in my tests but I didn't do a insmod/rmmod stress
test anyways.

I'll have a look.

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


Re: [PATCH] scsi: sas: move scsi_remove_host call into sas_remove_host

2017-04-24 Thread John Garry

On 24/04/2017 11:09, John Garry wrote:

On 21/04/2017 13:11, Johannes Thumshirn wrote:

Move scsi_remove_host call into sas_remove_host and remove it from SAS
HBA
drivers, so we don't mess up the ordering. This solves an issue with
double
deleting sysfs entries that was introduced by the change of sysfs
behaviour
from commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive").

Signed-off-by: Johannes Thumshirn 
Suggested-by: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: James Bottomley 
Cc: Jinpu Wang 
Cc: John Garry 


For what it's worth:

Tested-by: John Garry  # On hisi_sas



I have actually tested this a little further and now I see an issue. 
Maybe it is related to the internal development kernel I am using or a 
pre-existing issue with our driver, but I doubt it.
This time I removed the WARN in sysfs_remove_group() [so the console is 
not bombarded] and ran repeated insmod/rmmod, and I see this sometimes:


root@(none)$
root@(none)$ insmod hisi_sas_v2_hw.ko
[ 2150.478485] scsi host6: hisi_sas
[ 2151.629343] scsi host7: hisi_sas
[ 2152.248410] hisi_sas_v2_hw HISI0162:01: phyup: phy0 link_rate=11
[ 2152.254424] hisi_sas_v2_hw HISI0162:01: phyup: phy1 link_rate=11
[ 2152.260422] hisi_sas_v2_hw HISI0162:01: phyup: phy2 link_rate=11
[ 2152.266421] hisi_sas_v2_hw HISI0162:01: phyup: phy3 link_rate=11
[ 2152.272419] hisi_sas_v2_hw HISI0162:01: phyup: phy4 link_rate=11
[ 2152.278416] hisi_sas_v2_hw HISI0162:01: phyup: phy5 link_rate=11
[ 2152.284413] hisi_sas_v2_hw HISI0162:01: phyup: phy6 link_rate=11
[ 2152.290410] hisi_sas_v2_hw HISI0162:01: phyup: phy7 link_rate=11
[ 2152.507431] ata3.00: ATA-8: HGST HUS724040ALA640, MFAOA8B0, max UDMA/133
[ 2152.514127] ata3.00: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[ 2152.522726] ata3.00: configured for UDMA/133
[ 2152.574540] scsi 7:0:0:0: Direct-Access SanDisk  LT0200MO 
P404 PQ: 0 ANSI: 6
[ 2152.835694] scsi 7:0:1:0: Direct-Access ATA  HGST HUS724040AL 
A8B0 PQ: 0 ANSI: 5
[ 2152.873713] sd 7:0:1:0: [sdb] 7814037168 512-byte logical blocks: 
(4.00 TB/3.64 TiB)

[ 2152.881468] sd 7:0:1:0: [sdb] Write Protect is off
[ 2152.886274] sd 7:0:1:0: [sdb] Write cache: enabled, read cache: 
enabled, doesn't support DPO or FUA
[ 2152.911092] scsi 7:0:2:0: Direct-Access SanDisk  LT0200MO 
P404 PQ: 0 ANSI: 6
[ 2153.050599] sd 7:0:0:0: [sda] 390721968 512-byte logical blocks: (200 
GB/186 GiB)

[ 2153.092625] sd 7:0:0:0: [sda] Write Protect is off
[ 2153.174641] sd 7:0:0:0: [sda] Write cache: disabled, read cache: 
disabled, supports DPO and FUA
[ 2153.209183] scsi 7:0:3:0: Direct-Access SanDisk  LT0200MO 
P404 PQ: 0 ANSI: 6
[ 2153.387054] sd 7:0:2:0: [sdc] 390721968 512-byte logical blocks: (200 
GB/186 GiB)

[ 2153.429080] sd 7:0:2:0: [sdc] Write Protect is off
[ 2153.507273] scsi 7:0:4:0: Direct-Access SanDisk  LT0200MO 
P404 PQ: 0 ANSI: 6
[ 2153.515382] sd 7:0:2:0: [sdc] Write cache: disabled, read cache: 
disabled, supports DPO and FUA
[ 2153.685254] sd 7:0:3:0: [sdd] 390721968 512-byte logical blocks: (200 
GB/186 GiB)

[ 2153.727267] sd 7:0:3:0: [sdd] Write Protect is off
[ 2153.805266] scsi 7:0:5:0: Direct-Access SanDisk  LT0200MO 
P404 PQ: 0 ANSI: 6
[ 2153.813359] sd 7:0:3:0: [sdd] Write cache: disabled, read cache: 
disabled, supports DPO and FUA
[ 2153.983165] sd 7:0:4:0: [sde] 390721968 512-byte logical blocks: (200 
GB/186 GiB)

[ 2154.025158] sd 7:0:4:0: [sde] Write Protect is off
[ 2154.066761] scsi 7:0:6:0: Enclosure 12G SAS  Expander 
RevB PQ: 0 ANSI: 6
[ 2154.107141] sd 7:0:4:0: [sde] Write cache: disabled, read cache: 
disabled, supports DPO and FUA

[ 2154.124866] scsi host8: hisi_sas
[ 2154.281354] sd 7:0:5:0: [sdf] 390721968 512-byte logical blocks: (200 
GB/186 GiB)

[ 2154.323385] sd 7:0:5:0: [sdf] Write Protect is off
[ 2154.405404] sd 7:0:5:0: [sdf] Write cache: disabled, read cache: 
disabled, supports DPO and FUA

[ 2154.409289] sd 7:0:0:0: [sda] Attached SCSI disk
[ 2154.745722] sd 7:0:2:0: [sdc] Attached SCSI disk
[ 2155.044051] sd 7:0:3:0: [sdd] Attached SCSI disk

root@(none)$
root@(none)$ [ 2155.341986] sd 7:0:4:0: [sde] Attached SCSI disk

root@(none)$
root@(none)$ [ 2155.640217] sd 7:0:5:0: [sdf] Attached SCSI disk

root@(none)$
root@(none)$ rmmod hisi_sas_v2_hw.ko
[ 2158.679637] hisi_sas_v2_hw HISI0162:01: found dev[7:1] is gone
[ 2161.587449]  sdb: sdb1 sdb2
[ 2161.590253] kobject_add_internal failed for sdb1 (error: -2 parent: sdb)
[ 2161.596954] [ cut here ]
[ 2161.601562] WARNING: CPU: 0 PID: 2150 at lib/kobject.c:244 
kobject_add_internal+0xdc/0x298
[ 2161.609811] Modules linked in: hisi_sas_v2_hw(-) hisi_sas_main [last 
unloaded: hisi_sas_v2_hw]

[ 2161.618412]
[ 2161.619893] CPU: 0 PID: 2150 Comm: kworker/u129:6 Tainted: GW 
  4.11.0-rc1-13718-gb5b0d57-dirty #1584
[ 2161.630138] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon 
D05 UEFI 16.12 Release 02/27/2017

[ 2161.639345] Workqueue: events_unbound async_run_entry_fn
[ 2161.644644] task:

Re: [PATCHv6 8/8] scsi: inline command aborts

2017-04-24 Thread Hannes Reinecke
On 04/20/2017 04:14 AM, Martin K. Petersen wrote:
> Bart Van Assche  writes:
> 
>> On Thu, 2017-04-06 at 15:36 +0200, Hannes Reinecke wrote:
>>> The block layer always calls the timeout function from a workqueue
>>> context, so there is no need to have yet another workqueue for
>>> running command aborts.
>>>
>>> [ ... ]
>>> @@ -271,10 +266,14 @@ enum blk_eh_timer_return scsi_times_out(struct 
>>> request *req)
>>> rtn = host->hostt->eh_timed_out(scmd);
>>>  
>>> if (rtn == BLK_EH_NOT_HANDLED) {
>>> -   if (scsi_abort_command(scmd) != SUCCESS) {
>>> +   int ret;
>>> +
>>> +   ret = scsi_abort_command(scmd);
>>> +   if (ret == FAILED) {
>>> set_host_byte(scmd, DID_TIME_OUT);
>>> scsi_eh_scmd_add(scmd);
>>> -   }
>>> +   } else if (ret == FAST_IO_FAIL)
>>> +   rtn = BLK_EH_RESET_TIMER;
>>> }
>>
>> Has this patch been tested with the traditional block layer? For the
>> traditional block layer scsi_times_out() is called with the queue lock
>> held. Does this patch cause .eh_abort_handler(), a function that may
>> sleep, to be called with the queue lock held?
> 
> Hannes: Ping!
> 
Looks like Bart's right. Will be updating the patch.

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH] scsi: sas: move scsi_remove_host call into sas_remove_host

2017-04-24 Thread John Garry

On 21/04/2017 13:11, Johannes Thumshirn wrote:

Move scsi_remove_host call into sas_remove_host and remove it from SAS HBA
drivers, so we don't mess up the ordering. This solves an issue with double
deleting sysfs entries that was introduced by the change of sysfs behaviour
from commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive").

Signed-off-by: Johannes Thumshirn 
Suggested-by: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: James Bottomley 
Cc: Jinpu Wang 
Cc: John Garry 


For what it's worth:

Tested-by: John Garry  # On hisi_sas



[PATCH] mpt: Move scsi_remove_host() out of mptscsih_remove_host()

2017-04-24 Thread Hannes Reinecke
Commit c5ce0ab ("scsi: sas: move scsi_remove_host call...")
moved the call to scsi_remove_host() into sas_remove_host(),
but forgot to modify the mpt drivers.

Fixes: c5ce0ab ("scsi: sas: move scsi_remove_host call into sas_remove_host")
Signed-off-by: Hannes Reinecke 
---
 drivers/message/fusion/mptfc.c|  7 ++-
 drivers/message/fusion/mptscsih.c |  2 --
 drivers/message/fusion/mptspi.c   | 10 +-
 3 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/message/fusion/mptfc.c b/drivers/message/fusion/mptfc.c
index 98eafae..d065062 100644
--- a/drivers/message/fusion/mptfc.c
+++ b/drivers/message/fusion/mptfc.c
@@ -1329,7 +1329,7 @@
WQ_MEM_RECLAIM);
if (!ioc->fc_rescan_work_q) {
error = -ENOMEM;
-   goto out_mptfc_probe;
+   goto out_mptfc_host;
}
 
/*
@@ -1351,6 +1351,9 @@
 
return 0;
 
+out_mptfc_host:
+   scsi_remove_host(sh);
+
 out_mptfc_probe:
 
mptscsih_remove(pdev);
@@ -1530,6 +1533,8 @@ static void mptfc_remove(struct pci_dev *pdev)
}
}
 
+   scsi_remove_host(ioc->sh);
+
mptscsih_remove(pdev);
 }
 
diff --git a/drivers/message/fusion/mptscsih.c 
b/drivers/message/fusion/mptscsih.c
index 08a807d..6ba07c7 100644
--- a/drivers/message/fusion/mptscsih.c
+++ b/drivers/message/fusion/mptscsih.c
@@ -1176,8 +1176,6 @@ static int
mptscsih_get_completion_code(MPT_ADAPTER *ioc,
MPT_SCSI_HOST   *hd;
int sz1;
 
-   scsi_remove_host(host);
-
if((hd = shost_priv(host)) == NULL)
return;
 
diff --git a/drivers/message/fusion/mptspi.c b/drivers/message/fusion/mptspi.c
index 031e088..9a336a1 100644
--- a/drivers/message/fusion/mptspi.c
+++ b/drivers/message/fusion/mptspi.c
@@ -1548,11 +1548,19 @@ static void mpt_dv_raid(struct _MPT_SCSI_HOST *hd, int 
disk)
return error;
 }
 
+static void mptspi_remove(struct pci_dev *pdev)
+{
+   MPT_ADAPTER *ioc = pci_get_drvdata(pdev);
+
+   scsi_remove_host(ioc->sh);
+   mptscsih_remove(pdev);
+}
+
 static struct pci_driver mptspi_driver = {
.name   = "mptspi",
.id_table   = mptspi_pci_table,
.probe  = mptspi_probe,
-   .remove = mptscsih_remove,
+   .remove = mptspi_remove,
.shutdown   = mptscsih_shutdown,
 #ifdef CONFIG_PM
.suspend= mptscsih_suspend,
-- 
1.8.5.6



Re: [PATCH] scsi: sas: move scsi_remove_host call into sas_remove_host

2017-04-24 Thread Hannes Reinecke
On 04/21/2017 02:11 PM, Johannes Thumshirn wrote:
> Move scsi_remove_host call into sas_remove_host and remove it from SAS HBA
> drivers, so we don't mess up the ordering. This solves an issue with double
> deleting sysfs entries that was introduced by the change of sysfs behaviour
> from commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive").
> 
> Signed-off-by: Johannes Thumshirn 
> Suggested-by: Christoph Hellwig 
> Cc: Hannes Reinecke 
> Cc: James Bottomley 
> Cc: Jinpu Wang 
> Cc: John Garry 
> ---
>  drivers/scsi/aic94xx/aic94xx_init.c   | 1 -
>  drivers/scsi/hisi_sas/hisi_sas_main.c | 1 -
>  drivers/scsi/isci/init.c  | 1 -
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c  | 1 -
>  drivers/scsi/mvsas/mv_init.c  | 1 -
>  drivers/scsi/pm8001/pm8001_init.c | 1 -
>  drivers/scsi/scsi_transport_sas.c | 8 ++--
>  7 files changed, 6 insertions(+), 8 deletions(-)
> 
Sadly, you've left out mptsas; that's still in use for VMWare installations.

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH] sg: reset 'res_in_use' after unlinking reserved array

2017-04-24 Thread Johannes Thumshirn
On Mon, Apr 24, 2017 at 10:26:36AM +0200, Hannes Reinecke wrote:
> Once the reserved page array is unused we can reset the
> 'res_in_use' state; here we can do a lazy update without
> holding the mutex as we only need to check against
> concurrent access, not concurrent release.
> 
> Fixes: 1bc0eb0 ("scsi: sg: protect accesses to 'reserved' page array")
> Signed-off-by: Hannes Reinecke 
> ---

Reviewed-by: Johannes Thumshirn 
-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


[PATCH] sg: reset 'res_in_use' after unlinking reserved array

2017-04-24 Thread Hannes Reinecke
Once the reserved page array is unused we can reset the
'res_in_use' state; here we can do a lazy update without
holding the mutex as we only need to check against
concurrent access, not concurrent release.

Fixes: 1bc0eb0 ("scsi: sg: protect accesses to 'reserved' page array")
Signed-off-by: Hannes Reinecke 
---
 drivers/scsi/sg.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 8147147..06503c1 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -2056,6 +2056,8 @@ static long sg_compat_ioctl(struct file *filp, unsigned 
int cmd_in, unsigned lon
req_schp->page_order = 0;
req_schp->sglist_len = 0;
srp->res_used = 0;
+   /* Called without mutex lock to avoid deadlock */
+   sfp->res_in_use = 0;
 }
 
 static Sg_request *
-- 
1.8.5.6



[PATCH v2 6/7] sd_zbc: Remove superfluous assignments

2017-04-24 Thread damien . lemoal
From: Bart Van Assche 

A value is assigned to the variable 'capacity' in sd_zbc_read_zones()
but that value is never used. Hence remove the variable 'capacity'.

[Damien: There is no need to initialize to 0 the variable 'ret'
in sd_zbc_read_zones()]

Signed-off-by: Bart Van Assche 
Signed-off-by: Damien Le Moal 
---
 drivers/scsi/sd_zbc.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index a792682..29ba1d7 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -560,8 +560,7 @@ static int sd_zbc_setup(struct scsi_disk *sdkp)
 int sd_zbc_read_zones(struct scsi_disk *sdkp,
  unsigned char *buf)
 {
-   sector_t capacity;
-   int ret = 0;
+   int ret;
 
if (!sd_is_zoned(sdkp))
/*
@@ -593,7 +592,6 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp,
ret = sd_zbc_check_capacity(sdkp, buf);
if (ret)
goto err;
-   capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
 
/*
 * Check zone size: only devices with a constant zone size (except
-- 
2.9.3



[PATCH v2 7/7] sd_zbc: Do not write lock zones for reset

2017-04-24 Thread damien . lemoal
From: Damien Le Moal 

Resetting a zone write pointer is equivalent to discarding sectors:
after a reset, the zone sectors will contain zeros (or the format
pattern). So there is no need for mutual exclusion between a zone reset
and write. Similarly to discard, make it the responsability of the user
to properly synchronize between reset and write (as is done now for
discard and write).

Signed-off-by: Damien Le Moal 
---
 drivers/scsi/sd_zbc.c | 42 --
 1 file changed, 16 insertions(+), 26 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 29ba1d7..fcf0fad 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -237,7 +237,6 @@ int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
sector_t sector = blk_rq_pos(rq);
sector_t block = sectors_to_logical(sdkp->device, sector);
-   unsigned int zno = block >> sdkp->zone_shift;
 
if (!sd_is_zoned(sdkp))
/* Not a zoned device */
@@ -250,11 +249,6 @@ int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
/* Unaligned request */
return BLKPREP_KILL;
 
-   /* Do not allow concurrent reset and writes */
-   if (sdkp->zones_wlock &&
-   test_and_set_bit(zno, sdkp->zones_wlock))
-   return BLKPREP_DEFER;
-
cmd->cmd_len = 16;
memset(cmd->cmnd, 0, cmd->cmd_len);
cmd->cmnd[0] = ZBC_OUT;
@@ -324,38 +318,34 @@ void sd_zbc_complete(struct scsi_cmnd *cmd,
struct request *rq = cmd->request;
 
switch (req_op(rq)) {
+   case REQ_OP_ZONE_RESET:
+
+   if (result &&
+   sshdr->sense_key == ILLEGAL_REQUEST &&
+   sshdr->asc == 0x24)
+   /*
+* INVALID FIELD IN CDB error: reset of a conventional
+* zone was attempted. Nothing to worry about, so be
+* quiet about the error.
+*/
+   rq->rq_flags |= RQF_QUIET;
+   break;
+
case REQ_OP_WRITE:
case REQ_OP_WRITE_SAME:
-   case REQ_OP_ZONE_RESET:
 
/* Unlock the zone */
sd_zbc_write_unlock_zone(cmd);
 
-   if (!result ||
-   sshdr->sense_key != ILLEGAL_REQUEST)
-   break;
-
-   switch (sshdr->asc) {
-   case 0x24:
-   /*
-* INVALID FIELD IN CDB error: For a zone reset,
-* this means that a reset of a conventional
-* zone was attempted. Nothing to worry about in
-* this case, so be quiet about the error.
-*/
-   if (req_op(rq) == REQ_OP_ZONE_RESET)
-   rq->rq_flags |= RQF_QUIET;
-   break;
-   case 0x21:
+   if (result &&
+   sshdr->sense_key == ILLEGAL_REQUEST &&
+   sshdr->asc == 0x21)
/*
 * INVALID ADDRESS FOR WRITE error: It is unlikely that
 * retrying write requests failed with any kind of
 * alignement error will result in success. So don't.
 */
cmd->allowed = 0;
-   break;
-   }
-
break;
 
case REQ_OP_ZONE_REPORT:
-- 
2.9.3



[PATCH v2 5/7] sd_zbc: Rename sd_zbc_setup_write_cmnd

2017-04-24 Thread damien . lemoal
From: Damien Le Moal 

Rename sd_zbc_setup_write_cmnd() to sd_zbc_write_lock_zone() to be
clear about what the function actually does. To be consistent, also
rename sd_zbc_cancel_write_cmnd() to sd_zbc_write_unlock_zone().

No functional change is introduced by this patch.

Signed-off-by: Damien Le Moal 
Reviewed-by: Bart Van Assche 
---
 drivers/scsi/sd.c |  6 +++---
 drivers/scsi/sd.h |  8 
 drivers/scsi/sd_zbc.c | 12 
 3 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 2d1ac5c..25a4ab1 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -846,7 +846,7 @@ static int sd_setup_write_same_cmnd(struct scsi_cmnd *cmd)
BUG_ON(bio_offset(bio) || bio_iovec(bio).bv_len != sdp->sector_size);
 
if (sd_is_zoned(sdkp)) {
-   ret = sd_zbc_setup_write_cmnd(cmd);
+   ret = sd_zbc_write_lock_zone(cmd);
if (ret != BLKPREP_OK)
return ret;
}
@@ -918,7 +918,7 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd *SCpnt)
unsigned char protect;
 
if (zoned_write) {
-   ret = sd_zbc_setup_write_cmnd(SCpnt);
+   ret = sd_zbc_write_lock_zone(SCpnt);
if (ret != BLKPREP_OK)
return ret;
}
@@ -1145,7 +1145,7 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd 
*SCpnt)
ret = BLKPREP_OK;
  out:
if (zoned_write && ret != BLKPREP_OK)
-   sd_zbc_cancel_write_cmnd(SCpnt);
+   sd_zbc_write_unlock_zone(SCpnt);
 
return ret;
 }
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 0cf9680..2044e07a 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -267,8 +267,8 @@ static inline int sd_is_zoned(struct scsi_disk *sdkp)
 extern int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buffer);
 extern void sd_zbc_remove(struct scsi_disk *sdkp);
 extern void sd_zbc_print_zones(struct scsi_disk *sdkp);
-extern int sd_zbc_setup_write_cmnd(struct scsi_cmnd *cmd);
-extern void sd_zbc_cancel_write_cmnd(struct scsi_cmnd *cmd);
+extern int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd);
+extern void sd_zbc_write_unlock_zone(struct scsi_cmnd *cmd);
 extern int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd);
 extern int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd);
 extern void sd_zbc_complete(struct scsi_cmnd *cmd, unsigned int good_bytes,
@@ -286,13 +286,13 @@ static inline void sd_zbc_remove(struct scsi_disk *sdkp) 
{}
 
 static inline void sd_zbc_print_zones(struct scsi_disk *sdkp) {}
 
-static inline int sd_zbc_setup_write_cmnd(struct scsi_cmnd *cmd)
+static inline int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 {
/* Let the drive fail requests */
return BLKPREP_OK;
 }
 
-static inline void sd_zbc_cancel_write_cmnd(struct scsi_cmnd *cmd) {}
+static inline void sd_zbc_write_unlock_zone(struct scsi_cmnd *cmd) {}
 
 static inline int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
 {
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 92620c8..a792682 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -269,7 +269,7 @@ int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
return BLKPREP_OK;
 }
 
-int sd_zbc_setup_write_cmnd(struct scsi_cmnd *cmd)
+int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 {
struct request *rq = cmd->request;
struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
@@ -303,8 +303,9 @@ int sd_zbc_setup_write_cmnd(struct scsi_cmnd *cmd)
return BLKPREP_OK;
 }
 
-static void sd_zbc_unlock_zone(struct request *rq)
+void sd_zbc_write_unlock_zone(struct scsi_cmnd *cmd)
 {
+   struct request *rq = cmd->request;
struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
 
if (sdkp->zones_wlock) {
@@ -315,11 +316,6 @@ static void sd_zbc_unlock_zone(struct request *rq)
}
 }
 
-void sd_zbc_cancel_write_cmnd(struct scsi_cmnd *cmd)
-{
-   sd_zbc_unlock_zone(cmd->request);
-}
-
 void sd_zbc_complete(struct scsi_cmnd *cmd,
 unsigned int good_bytes,
 struct scsi_sense_hdr *sshdr)
@@ -333,7 +329,7 @@ void sd_zbc_complete(struct scsi_cmnd *cmd,
case REQ_OP_ZONE_RESET:
 
/* Unlock the zone */
-   sd_zbc_unlock_zone(rq);
+   sd_zbc_write_unlock_zone(cmd);
 
if (!result ||
sshdr->sense_key != ILLEGAL_REQUEST)
-- 
2.9.3



[PATCH v2 3/7] sd: Cleanup sd_done sense data handling

2017-04-24 Thread damien . lemoal
From: Damien Le Moal 

In sd_done(), for the ILLEGAL REQUEST sense key case, add an 'else if'
after the first 'if (sshdr.asc == 0x10)' test to avoid the second test
(the values tested are different).

Still for the same ILLEGAL REQUEST case, move the declarations of the
variables 'op' and 'unmap' within the scope of the 'else if' case since
these variables are only used there. At the same time, remove the
unnecessary good_bytes 0 assignment as this code can only be executed
with result != 0 and good_bytes is already set to 0 in that case.

Signed-off-by: Damien Le Moal 
Reviewed-by: Bart Van Assche 
---
 drivers/scsi/sd.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 10c7657..2d1ac5c 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1816,8 +1816,6 @@ static int sd_done(struct scsi_cmnd *SCpnt)
struct request *req = SCpnt->request;
int sense_valid = 0;
int sense_deferred = 0;
-   unsigned char op = SCpnt->cmnd[0];
-   unsigned char unmap = SCpnt->cmnd[1] & 8;
 
switch (req_op(req)) {
case REQ_OP_DISCARD:
@@ -1875,10 +1873,14 @@ static int sd_done(struct scsi_cmnd *SCpnt)
good_bytes = sd_completed_bytes(SCpnt);
break;
case ILLEGAL_REQUEST:
-   if (sshdr.asc == 0x10)  /* DIX: Host detected corruption */
+   if (sshdr.asc == 0x10) {
+   /* DIX: Host detected corruption */
good_bytes = sd_completed_bytes(SCpnt);
-   /* INVALID COMMAND OPCODE or INVALID FIELD IN CDB */
-   if (sshdr.asc == 0x20 || sshdr.asc == 0x24) {
+   } else if (sshdr.asc == 0x20 || sshdr.asc == 0x24) {
+   /* INVALID COMMAND OPCODE or INVALID FIELD IN CDB */
+   unsigned char op = SCpnt->cmnd[0];
+   unsigned char unmap = SCpnt->cmnd[1] & 8;
+
switch (op) {
case UNMAP:
sd_config_discard(sdkp, SD_LBP_DISABLE);
@@ -1890,8 +1892,6 @@ static int sd_done(struct scsi_cmnd *SCpnt)
else {
sdkp->device->no_write_same = 1;
sd_config_write_same(sdkp);
-
-   good_bytes = 0;
req->__data_len = blk_rq_bytes(req);
req->rq_flags |= RQF_QUIET;
}
-- 
2.9.3



[PATCH v2 2/7] sd: Improve sd_completed_bytes

2017-04-24 Thread damien . lemoal
From: Damien Le Moal 

Re-shuffle the code to be more efficient by not initializing
variables upfront (i.e. do it only when necessary).
Also replace the do_div calls with calls to sectors_to_logical().

No functional change is introduced by this patch.

Signed-off-by: Damien Le Moal 
---
 drivers/scsi/sd.c | 48 ++--
 1 file changed, 26 insertions(+), 22 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index ce62d2c..10c7657 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1758,41 +1758,45 @@ static int sd_eh_action(struct scsi_cmnd *scmd, int 
eh_disp)
 
 static unsigned int sd_completed_bytes(struct scsi_cmnd *scmd)
 {
-   u64 start_lba = blk_rq_pos(scmd->request);
-   u64 end_lba = blk_rq_pos(scmd->request) + (scsi_bufflen(scmd) / 512);
-   u64 factor = scmd->device->sector_size / 512;
-   u64 bad_lba;
-   int info_valid;
+   struct request *req = scmd->request;
+   struct scsi_device *sdev = scmd->device;
+   unsigned int transferred, good_bytes;
+   u64 start_lba, end_lba, bad_lba;
+
/*
-* resid is optional but mostly filled in.  When it's unused,
-* its value is zero, so we assume the whole buffer transferred
+* Some commands have a payload smaller than the device logical
+* block size (e.g. INQUIRY on a 4K disk).
 */
-   unsigned int transferred = scsi_bufflen(scmd) - scsi_get_resid(scmd);
-   unsigned int good_bytes;
-
-   info_valid = scsi_get_sense_info_fld(scmd->sense_buffer,
-SCSI_SENSE_BUFFERSIZE,
-&bad_lba);
-   if (!info_valid)
+   if (scsi_bufflen(scmd) <= sdev->sector_size)
return 0;
 
-   if (scsi_bufflen(scmd) <= scmd->device->sector_size)
+   /* Check if we have a 'bad_lba' information */
+   if (!scsi_get_sense_info_fld(scmd->sense_buffer,
+SCSI_SENSE_BUFFERSIZE,
+&bad_lba))
return 0;
 
-   /* be careful ... don't want any overflows */
-   do_div(start_lba, factor);
-   do_div(end_lba, factor);
-
-   /* The bad lba was reported incorrectly, we have no idea where
+   /*
+* If the bad lba was reported incorrectly, we have no idea where
 * the error is.
 */
-   if (bad_lba < start_lba  || bad_lba >= end_lba)
+   start_lba = sectors_to_logical(sdev, blk_rq_pos(req));
+   end_lba = sectors_to_logical(sdev,
+   blk_rq_pos(req) + (scsi_bufflen(scmd) >> 9));
+   if (bad_lba < start_lba || bad_lba >= end_lba)
return 0;
 
+   /*
+* resid is optional but mostly filled in.  When it's unused,
+* its value is zero, so we assume the whole buffer transferred
+*/
+   transferred = scsi_bufflen(scmd) - scsi_get_resid(scmd);
+
/* This computation should always be done in terms of
 * the resolution of the device's medium.
 */
-   good_bytes = (bad_lba - start_lba) * scmd->device->sector_size;
+   good_bytes = logical_to_bytes(sdev, bad_lba - start_lba);
+
return min(good_bytes, transferred);
 }
 
-- 
2.9.3



[PATCH v2 1/7] sd: Fix functions description

2017-04-24 Thread damien . lemoal
From: Damien Le Moal 

Fix argument names and description of functions documentation comments.
No functional change is introduced by this patch.

Signed-off-by: Damien Le Moal 
---
 drivers/scsi/sd.c | 29 ++---
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 00b168b..ce62d2c 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -705,11 +705,10 @@ static void sd_config_discard(struct scsi_disk *sdkp, 
unsigned int mode)
 
 /**
  * sd_setup_discard_cmnd - unmap blocks on thinly provisioned device
- * @sdp: scsi device to operate on
- * @rq: Request to prepare
+ * @cmd: command to prepare
  *
- * Will issue either UNMAP or WRITE SAME(16) depending on preference
- * indicated by target device.
+ * Will setup either UNMAP, WRITE SAME(16) or WRITE SAME(10) depending
+ * on the provisioning mode of the target device.
  **/
 static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
 {
@@ -827,8 +826,8 @@ static void sd_config_write_same(struct scsi_disk *sdkp)
  * sd_setup_write_same_cmnd - write the same data to multiple blocks
  * @cmd: command to prepare
  *
- * Will issue either WRITE SAME(10) or WRITE SAME(16) depending on
- * preference indicated by target device.
+ * Will setup either WRITE SAME(10) or WRITE SAME(16) depending on
+ * the preference indicated by the target device.
  **/
 static int sd_setup_write_same_cmnd(struct scsi_cmnd *cmd)
 {
@@ -1190,8 +1189,8 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
 
 /**
  * sd_open - open a scsi disk device
- * @inode: only i_rdev member may be used
- * @filp: only f_mode and f_flags may be used
+ * @bdev: Block device of the scsi disk to open
+ * @mode: FMODE_* mask
  *
  * Returns 0 if successful. Returns a negated errno value in case 
  * of error.
@@ -1267,8 +1266,8 @@ static int sd_open(struct block_device *bdev, fmode_t 
mode)
 /**
  * sd_release - invoked when the (last) close(2) is called on this
  * scsi disk.
- * @inode: only i_rdev member may be used
- * @filp: only f_mode and f_flags may be used
+ * @disk: disk to release
+ * @mode: FMODE_* mask
  *
  * Returns 0. 
  *
@@ -1324,8 +1323,8 @@ static int sd_getgeo(struct block_device *bdev, struct 
hd_geometry *geo)
 
 /**
  * sd_ioctl - process an ioctl
- * @inode: only i_rdev/i_bdev members may be used
- * @filp: only f_mode and f_flags may be used
+ * @bdev: target block device
+ * @mode: FMODE_* mask
  * @cmd: ioctl command number
  * @arg: this is third argument given to ioctl(2) system call.
  * Often contains a pointer.
@@ -2713,7 +2712,7 @@ static void sd_read_app_tag_own(struct scsi_disk *sdkp, 
unsigned char *buffer)
 
 /**
  * sd_read_block_limits - Query disk device for preferred I/O sizes.
- * @disk: disk to query
+ * @sdkp: disk to query
  */
 static void sd_read_block_limits(struct scsi_disk *sdkp)
 {
@@ -2779,7 +2778,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
 
 /**
  * sd_read_block_characteristics - Query block dev. characteristics
- * @disk: disk to query
+ * @sdkp: disk to query
  */
 static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 {
@@ -2827,7 +2826,7 @@ static void sd_read_block_characteristics(struct 
scsi_disk *sdkp)
 
 /**
  * sd_read_block_provisioning - Query provisioning VPD page
- * @disk: disk to query
+ * @sdkp: disk to query
  */
 static void sd_read_block_provisioning(struct scsi_disk *sdkp)
 {
-- 
2.9.3



[PATCH v2 0/7] scsi: sd/sd_zbc cleanup and improvements

2017-04-24 Thread damien . lemoal
From: Damien Le Moal 

This series of small patches cleanup code mainly in sd.c and sd_zbc.c, with
another small improvement in scsi_error.c.

Only the last patch of this series introduces a functional change. All other
patches do not change any functionality.

Changes from v1:
* Dropped white space patch and white space changes in other patches as
  requested by Bart.
* Dropped patch changing handling of sense_valid in sd_done() as requested by
  James.
* Fixed patch 4 (function declaration) to remove checkscript warnings

Bart Van Assche (1):
  sd_zbc: Remove superfluous assignments

Damien Le Moal (6):
  sd: Fix functions description
  sd: Improve sd_completed_bytes
  sd: Cleanup sd_done sense data handling
  scsi: Improve scsi_get_sense_info_fld
  sd_zbc: Rename sd_zbc_setup_write_cmnd
  sd_zbc: Do not write lock zones for reset

 drivers/scsi/scsi_error.c | 38 ---
 drivers/scsi/sd.c | 97 ---
 drivers/scsi/sd.h |  8 ++--
 drivers/scsi/sd_zbc.c | 60 +++--
 include/scsi/scsi_eh.h|  4 +-
 5 files changed, 93 insertions(+), 114 deletions(-)

-- 
2.9.3



[PATCH v2 4/7] scsi: Improve scsi_get_sense_info_fld

2017-04-24 Thread damien . lemoal
From: Damien Le Moal 

Use get_unaligned_be32 and get_unaligned_be64 to obtain values from
the sense buffer instead of open coding the operations.
Also change the function return value to a bool and fix the function
signature declaration to remove spaces triggering checkpatch warnings.

No functional change is introduced by this patch.

Signed-off-by: Damien Le Moal 
Reviewed-by: Bart Van Assche 
---
 drivers/scsi/scsi_error.c | 38 +++---
 include/scsi/scsi_eh.h|  4 ++--
 2 files changed, 17 insertions(+), 25 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 53e3343..d70c67c 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -46,6 +46,8 @@
 
 #include 
 
+#include 
+
 static void scsi_eh_done(struct scsi_cmnd *scmd);
 
 /*
@@ -2361,44 +2363,34 @@ EXPORT_SYMBOL(scsi_command_normalize_sense);
  * field will be placed if found.
  *
  * Return value:
- * 1 if information field found, 0 if not found.
+ * true if information field found, false if not found.
  */
-int scsi_get_sense_info_fld(const u8 * sense_buffer, int sb_len,
-   u64 * info_out)
+bool scsi_get_sense_info_fld(const u8 *sense_buffer, int sb_len,
+u64 *info_out)
 {
-   int j;
const u8 * ucp;
-   u64 ull;
 
if (sb_len < 7)
-   return 0;
+   return false;
switch (sense_buffer[0] & 0x7f) {
case 0x70:
case 0x71:
if (sense_buffer[0] & 0x80) {
-   *info_out = (sense_buffer[3] << 24) +
-   (sense_buffer[4] << 16) +
-   (sense_buffer[5] << 8) + sense_buffer[6];
-   return 1;
-   } else
-   return 0;
+   *info_out = get_unaligned_be32(&sense_buffer[3]);
+   return true;
+   }
+   return false;
case 0x72:
case 0x73:
ucp = scsi_sense_desc_find(sense_buffer, sb_len,
   0 /* info desc */);
if (ucp && (0xa == ucp[1])) {
-   ull = 0;
-   for (j = 0; j < 8; ++j) {
-   if (j > 0)
-   ull <<= 8;
-   ull |= ucp[4 + j];
-   }
-   *info_out = ull;
-   return 1;
-   } else
-   return 0;
+   *info_out = get_unaligned_be64(&ucp[4]);
+   return true;
+   }
+   return false;
default:
-   return 0;
+   return false;
}
 }
 EXPORT_SYMBOL(scsi_get_sense_info_fld);
diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h
index a25b328..64d30d8 100644
--- a/include/scsi/scsi_eh.h
+++ b/include/scsi/scsi_eh.h
@@ -23,8 +23,8 @@ static inline bool scsi_sense_is_deferred(const struct 
scsi_sense_hdr *sshdr)
return ((sshdr->response_code >= 0x70) && (sshdr->response_code & 1));
 }
 
-extern int scsi_get_sense_info_fld(const u8 * sense_buffer, int sb_len,
-  u64 * info_out);
+extern bool scsi_get_sense_info_fld(const u8 *sense_buffer, int sb_len,
+   u64 *info_out);
 
 extern int scsi_ioctl_reset(struct scsi_device *, int __user *);
 
-- 
2.9.3



Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-24 Thread Knut Omang
On Mon, 2017-04-17 at 08:31 +1000, Benjamin Herrenschmidt wrote:
> On Sun, 2017-04-16 at 10:34 -0600, Logan Gunthorpe wrote:
> > 
> > On 16/04/17 09:53 AM, Dan Williams wrote:
> > > ZONE_DEVICE allows you to redirect via get_dev_pagemap() to retrieve
> > > context about the physical address in question. I'm thinking you can
> > > hang bus address translation data off of that structure. This seems
> > > vaguely similar to what HMM is doing.
> > 
> > Thanks! I didn't realize you had the infrastructure to look up a device
> > from a pfn/page. That would really come in handy for us.
> 
> It does indeed. I won't be able to play with that much for a few weeks
> (see my other email) so if you're going to tackle this while I'm away,
> can you work with Jerome to make sure you don't conflict with HMM ?
> 
> I really want a way for HMM to be able to layout struct pages over the
> GPU BARs rather than in "allocated free space" for the case where the
> BAR is big enough to cover all of the GPU memory.
> 
> In general, I'd like a simple & generic way for any driver to ask the
> core to layout DMA'ble struct pages over BAR space. I an not convinced
> this requires a "p2mem device" to be created on top of this though but
> that's a different discussion.
> 
> Of course the actual ability to perform the DMA mapping will be subject
> to various restrictions that will have to be implemented in the actual
> "dma_ops override" backend. We can have generic code to handle the case
> where devices reside on the same domain, which can deal with switch
> configuration etc... we will need to have iommu specific code to handle
> the case going through the fabric. 
> 
> Virtualization is a separate can of worms due to how qemu completely
> fakes the MMIO space, we can look into that later.

My first reflex when reading this thread was to think that this whole domain
lends it self excellently to testing via Qemu. Could it be that doing this in 
the opposite direction might be a safer approach in the long run even though 
(significant) more work up-front?

Eg. start by fixing/providing/documenting suitable model(s) 
for testing this in Qemu, then implement the patch set based 
on those models?

Thanks,
Knut

> 
> Cheers,
> Ben.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 10/10] scsi: Implement blk_mq_ops.show_rq()

2017-04-24 Thread Hannes Reinecke
On 04/22/2017 01:40 AM, Bart Van Assche wrote:
> Show the SCSI CDB, .eh_eflags and .result for pending SCSI commands
> in /sys/kernel/debug/block/*/mq/*/dispatch and */rq_list.
> 
> Signed-off-by: Bart Van Assche 
> Cc: Martin K. Petersen 
> Cc: James Bottomley 
> Cc: Omar Sandoval 
> Cc: Hannes Reinecke 
> Cc: 
> ---
>  drivers/scsi/scsi_lib.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 4a20e6098f7c..90bb269042df 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2125,6 +2125,16 @@ static void scsi_exit_rq(struct request_queue *q, 
> struct request *rq)
>   scsi_free_sense_buffer(shost, cmd->sense_buffer);
>  }
>  
> +static void scsi_show_rq(struct seq_file *m, struct request *rq)
> +{
> + struct scsi_cmnd *cmd = container_of(scsi_req(rq), typeof(*cmd), req);
> + unsigned int i;
> +
> + seq_puts(m, ", .cmd =");
> + for (i = 0; i < cmd->cmd_len; i++)
> + seq_printf(m, " %02x", cmd->cmnd[i]);
> +}
> +
>  struct request_queue *scsi_alloc_queue(struct scsi_device *sdev)
>  {
>   struct Scsi_Host *shost = sdev->host;
> @@ -2157,6 +2167,7 @@ static const struct blk_mq_ops scsi_mq_ops = {
>   .queue_rq   = scsi_queue_rq,
>   .complete   = scsi_softirq_done,
>   .timeout= scsi_timeout,
> + .show_rq= scsi_show_rq,
>   .init_request   = scsi_init_request,
>   .exit_request   = scsi_exit_request,
>   .map_queues = scsi_map_queues,
> 
Hmm. Can't say I'm happy with this callback.
And I really would like to see a similar implementation for NVMe.

But if others agree:

Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckezSeries & Storage
h...@suse.com  +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH v3 00/22] lpfc updates for 11.2.0.12

2017-04-24 Thread Christoph Hellwig
Applie to the nvme-4.12 branch.  Any praying we're not going to see
any conflicts with the SCSI tree.


[lkp-robot] [sd] ab1218235c: INFO:possible_recursive_locking_detected

2017-04-24 Thread kernel test robot

FYI, we noticed the following commit:

commit: ab1218235c09d0fabdea31cbd4f099e7cabccb27 ("sd: Make synchronize cache 
upon shutdown asynchronous")
url: 
https://github.com/0day-ci/linux/commits/Bart-Van-Assche/Avoid-that-__scsi_remove_device-hangs/20170418-013958
base: https://git.kernel.org/cgit/linux/kernel/git/jejb/scsi.git for-next

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -m 420M

caused below changes (please refer to attached dmesg/kmsg for entire 
log/backtrace):


+---+++
|   
| b2ba13eb79 | ab1218235c |
+---+++
| boot_successes
| 10 | 0  |
| boot_failures 
| 10 | 45 |
| INFO:task_blocked_for_more_than#seconds   
| 10 | 37 |
| calltrace:debug_show_all_locks
| 6  | 32 |
| calltrace:stress_inorder_work 
| 1  | 9  |
| INFO:possible_recursive_locking_detected  
| 0  | 45 |
| BUG:kernel_hang_in_test_stage 
| 0  | 45 |
| INFO:rcu_preempt_detected_stalls_on_CPUs/tasks
| 0  | 1  |
| EIP:queued_spin_lock_slowpath 
| 0  | 1  |
| 
INFO:rcu_sched_detected_expedited_stalls_on_CPUs/tasks:{#-...}#jiffies_s:#root:#/
 | 0  | 1  |
| INFO:lockdep_is_turned_off
| 0  | 1  |
| calltrace:stress_reorder_work 
| 0  | 4  |
| calltrace:pick_next_task_fair 
| 0  | 2  |
+---+++



[  185.305149] [ INFO: possible recursive locking detected ]
[  185.306075] 4.11.0-rc2-00217-gab12182 #1 Not tainted
[  185.306928] -
[  185.307854] ksoftirqd/0/6 is trying to acquire lock:
[  185.308712]  (&(&q->__queue_lock)->rlock){..-...}, at: [] 
blk_put_request+0x33/0x50
[  185.310170] 
[  185.310170] but task is already holding lock:
[  185.311168]  (&(&q->__queue_lock)->rlock){..-...}, at: [] 
scsi_end_request+0x103/0x1b0
[  185.312666] 
[  185.312666] other info that might help us debug this:
[  185.313777]  Possible unsafe locking scenario:
[  185.313777] 
[  185.314791]CPU0
[  185.315221]
[  185.315654]   lock(&(&q->__queue_lock)->rlock);
[  185.316432]   lock(&(&q->__queue_lock)->rlock);
[  185.317206] 
[  185.317206]  *** DEADLOCK ***
[  185.317206] 
[  185.318216]  May be due to missing lock nesting notation
[  185.318216] 
[  185.319376] 1 lock held by ksoftirqd/0/6:
[  185.320072]  #0:  (&(&q->__queue_lock)->rlock){..-...}, at: [] 
scsi_end_request+0x103/0x1b0
[  185.321635] 
[  185.321635] stack backtrace:
[  185.322381] CPU: 0 PID: 6 Comm: ksoftirqd/0 Not tainted 
4.11.0-rc2-00217-gab12182 #1
[  185.323698] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.9.3-20161025_171302-gandalf 04/01/2014
[  185.325427] Call Trace:
[  185.325859]  dump_stack+0x76/0xa9
[  185.326440]  __lock_acquire+0x10d2/0x1260
[  185.327132]  lock_acquire+0x8c/0x1f0
[  185.327757]  ? blk_put_request+0x33/0x50
[  185.328455]  _raw_spin_lock_irqsave+0x39/0x70
[  185.329205]  ? blk_put_request+0x33/0x50
[  185.329890]  blk_put_request+0x33/0x50
[  185.330549]  sd_sync_cache_done+0x3e/0x70
[  185.331240]  blk_finish_request+0x60/0x120
[  185.331948]  scsi_end_request+0x10f/0x1b0
[  185.332644]  ? kvm_sched_clock_read+0x9/0x20
[  185.80]  scsi_io_completion+0x1ac/0x5a0
[  185.334109]  scsi_finish_command+0xa6/0xe0
[  185.334825]  scsi_softirq_done+0xe5/0x100
[  185.335527]  blk_done_softirq+0x7b/0x90
[  185.336190]  __do_softirq+0x190/0x4f0
[  185.336830]  run_ksoftirqd+0x1d/0x70
[  185.337455]  smpboot_thread_fn+0x11a/0x230
[  185.338161]  kthread+0xd9/0x110
[  185.338715]  ? sort_range+0x20/0x20
[  185.339321]  ? __kthread_create_on_node+0x190/0x190
[  185.340170]  ret_from_fork+0x21/0x30

Elapsed time: 1520
BUG: kernel hang in test stage

initrds=(
/osimage/openwrt/openwrt-i386-2016-03-16.cgz

/lkp/scheduled/vm-lkp-st01-openwrt-ia32-1/boot-1-openwrt-i386-2016-03-16.cgz-ab1218235c09d0fabdea31cbd4f099e7cabccb27-20170423-6680-3gpgeq-0.cgz