[PATCH 1/5] scsi: bnx2i: convert to workqueue

2017-04-10 Thread Sebastian Andrzej Siewior
The driver creates its own per-CPU threads which are updated based on CPU hotplug events. It is also possible to use kworkers and remove some of the infrastructure, getting the same job done while saving a few lines of code. The DECLARE_PER_CPU() definition is moved into the header file where it

[PATCH 5/5] scsi: bnx2fc: convert bnx2fc_l2_rcv_thread() to workqueue

2017-04-10 Thread Sebastian Andrzej Siewior
This is not driven by the hotplug conversion, but while I am at it, it looks like a good candidate. Converting the thread to a workqueue user also removes the kthread member from struct fcoe_percpu_s. This driver uses the struct fcoe_percpu_s but it does not need the crc_eof_page member, only the

[PATCH 4/5] scsi: bnx2fc: annoate unlock + release for sparse

2017-04-10 Thread Sebastian Andrzej Siewior
The caller of bnx2fc_abts_cleanup() holds the tgt->tgt_lock lock and it is expected to release the lock during wait_for_completion() and acquire the lock afterwards. This patch was only compile-tested due to -ENODEV. Cc: qlogic-storage-upstr...@qlogic.com Cc: Christoph Hellwig
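
For readers unfamiliar with sparse lock annotations, the pattern described above (the caller holds the lock, the callee drops it while waiting and takes it again before returning) can be sketched in plain userspace C. This is only an illustration, not the patch itself: `toy_lock`, `toy_tgt` and `toy_abts_cleanup()` are hypothetical stand-ins, and the `__acquires`/`__releases` macros are defined locally here, expanding to nothing unless sparse (`__CHECKER__`) is running.

```c
#include <assert.h>

/* Local stand-ins for the kernel's sparse annotations (assumption:
 * defined here only so the sketch builds outside the kernel tree). */
#ifdef __CHECKER__
# define __acquires(x) __attribute__((context(x, 0, 1)))
# define __releases(x) __attribute__((context(x, 1, 0)))
#else
# define __acquires(x)
# define __releases(x)
#endif

/* Hypothetical toy lock: just tracks whether it is held. */
struct toy_lock { int held; };

/* Hypothetical target with a lock and a counter of completed waits. */
struct toy_tgt { struct toy_lock lock; int waits; };

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Called with tgt->lock held; drops it around the wait and reacquires
 * it, so the lock state on return matches the state on entry. The
 * annotations tell sparse about the release/acquire pair inside. */
static void toy_abts_cleanup(struct toy_tgt *tgt)
	__releases(tgt->lock) __acquires(tgt->lock)
{
	toy_lock_release(&tgt->lock);
	tgt->waits++;			/* stand-in for wait_for_completion() */
	toy_lock_acquire(&tgt->lock);
}
```

Without such annotations sparse would report a context imbalance for a function that unlocks a lock it did not take itself.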

[PATCH 2/5] scsi: bnx2fc: convert per-CPU thread to workqueue

2017-04-10 Thread Sebastian Andrzej Siewior
The driver creates its own per-CPU threads which are updated based on CPU hotplug events. It is also possible to use kworkers and remove some of the infrastructure, getting the same job done while saving a few lines of code. bnx2fc_percpu_io_thread() becomes bnx2fc_percpu_io_work() which is mostly the

[PATCH 3/5] scsi: bnx2fc: clean up header definitions

2017-04-10 Thread Sebastian Andrzej Siewior
- All symbols which are only used within one .c file are marked static and removed from the bnx2fc.h file if possible. - the declaration of bnx2fc_percpu is moved into the header file This patch was only compile-tested due to -ENODEV. Cc: qlogic-storage-upstr...@qlogic.com Cc: Christoph Hellwig

[REEEEPOST] bnx2i + bnx2fc: convert to generic workqueue (#3)

2017-04-10 Thread Sebastian Andrzej Siewior
This is a repost to get the patches applied against v4.11-rc6. mkp's scsi for-next tree can be merged with no conflicts. The last repost [0] was not merged and stalled after Martin pinged Chad [1]. He didn't even reply after tglx pinged him approx two weeks later. Johannes Thumshirn was so kind

RFC: remove REQ_OP_WRITE_SAME

2017-04-10 Thread Christoph Hellwig
Now that we are using REQ_OP_WRITE_ZEROES for all zeroing needs in the kernel there is very little use left for REQ_OP_WRITE_SAME. We only have two callers left, and both just export optional protocol features to remote systems: DRBD and the target code. Do we have any major users of those? If

[PATCH 1/8] drbd: drop REQ_OP_WRITE_SAME support

2017-04-10 Thread Christoph Hellwig
Linux only used it for zeroing, for which we have better methods now. Signed-off-by: Christoph Hellwig --- drivers/block/drbd/drbd_main.c | 28 ++ drivers/block/drbd/drbd_nl.c | 60 -- drivers/block/drbd/drbd_receiver.c

[PATCH 5/8] dm: remove write same support

2017-04-10 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig --- drivers/md/dm-core.h | 1 - drivers/md/dm-io.c| 21 + drivers/md/dm-linear.c| 1 - drivers/md/dm-mpath.c | 1 - drivers/md/dm-rq.c| 3 --- drivers/md/dm-stripe.c| 4

[PATCH 8/8] block: use bio_has_data to check if a bio has bvecs

2017-04-10 Thread Christoph Hellwig
Now that Write Same is gone and discard bios never have a payload we can simply use bio_has_data as an indicator that the bio has bvecs that need to be handled. Signed-off-by: Christoph Hellwig --- block/bio.c | 8 +--- block/blk-merge.c | 9 +

[PATCH 7/8] block: remove bio_no_advance_iter

2017-04-10 Thread Christoph Hellwig
Now that we don't have to support the odd Write Same special case we can simply increment the iter if the bio has data, else just manipulate bi_size directly. Signed-off-by: Christoph Hellwig --- include/linux/bio.h | 13 +++-- 1 file changed, 3 insertions(+), 10
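
The rule stated above (walk the segment list when the bio carries data, otherwise only shrink the remaining size) can be illustrated with a small userspace sketch. The types `toy_seg`, `toy_iter`, `toy_io` and the function `toy_advance()` are hypothetical simplifications, not the kernel's `struct bio_vec`, `struct bvec_iter` or `bio_advance_iter()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a bio_vec: only the segment length matters. */
struct toy_seg { unsigned len; };

/* Hypothetical stand-in for a bvec_iter. */
struct toy_iter {
	unsigned long long sector;	/* current 512-byte sector */
	unsigned size;			/* bytes remaining */
	unsigned idx;			/* current segment index */
	unsigned done;			/* bytes consumed in current segment */
};

/* Hypothetical stand-in for a bio: payload flag plus segment array. */
struct toy_io {
	bool has_data;
	const struct toy_seg *segs;
};

static void toy_advance(const struct toy_io *io, struct toy_iter *it,
			unsigned bytes)
{
	assert(bytes <= it->size);
	it->sector += bytes >> 9;

	if (!io->has_data) {
		/* No payload (e.g. a discard): only the size changes. */
		it->size -= bytes;
		return;
	}

	/* Payload present: step through the segments. */
	while (bytes) {
		unsigned left = io->segs[it->idx].len - it->done;
		unsigned step = bytes < left ? bytes : left;

		bytes -= step;
		it->size -= step;
		it->done += step;
		if (it->done == io->segs[it->idx].len) {
			it->idx++;
			it->done = 0;
		}
	}
}
```

Advancing 5120 bytes through two 4096-byte segments leaves the iterator 1024 bytes into the second segment, while the same advance on a payload-less request changes only the sector and the remaining size.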

[PATCH 4/8] md: drop WRITE_SAME support

2017-04-10 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig --- drivers/md/linear.c| 1 - drivers/md/md.h| 7 --- drivers/md/multipath.c | 1 - drivers/md/raid0.c | 2 -- drivers/md/raid1.c | 4 +--- drivers/md/raid10.c| 1 - drivers/md/raid5.c | 1 - 7 files changed, 1

[PATCH for-4.4 15/16] blk-mq: Avoid memory reclaim when remapping queues

2017-04-10 Thread Sumit Semwal
From: Gabriel Krisman Bertazi [ Upstream commit 36e1f3d107867b25c616c2fd294f5a1c9d4e5d09 ] While stressing memory and IO at the same time we changed SMT settings, we were able to consistently trigger deadlocks in the mm system, which froze the entire machine. I

Re: [PATCH 0/3] Unblock SCSI devices even if the LLD is being unloaded

2017-04-10 Thread Bart Van Assche
On Fri, 2017-03-17 at 05:54 -0700, James Bottomley wrote: > but if you want to pursue your approach fixing the > race with module exit is a requirement. Hello James, Sorry that it took so long but I finally found the time to implement and test an alternative. I will post the patches that

[PATCH 3/8] sd: remove write same support

2017-04-10 Thread Christoph Hellwig
There are no more end-users of REQ_OP_WRITE_SAME left, so we can start deleting it. Signed-off-by: Christoph Hellwig --- drivers/scsi/sd.c | 70 --- drivers/scsi/sd_zbc.c | 1 - 2 files changed, 71 deletions(-) diff --git

[PATCH 6/8] block: remove REQ_OP_WRITE_SAME support

2017-04-10 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig --- block/bio.c | 3 -- block/blk-core.c| 11 +- block/blk-lib.c | 90 - block/blk-merge.c | 32 block/blk-settings.c| 16

[PATCH 2/8] target: remove iblock WRITE_SAME passthrough support

2017-04-10 Thread Christoph Hellwig
Use the pscsi driver to support arbitrary command passthrough instead. Signed-off-by: Christoph Hellwig --- drivers/target/target_core_iblock.c | 34 -- 1 file changed, 34 deletions(-) diff --git a/drivers/target/target_core_iblock.c

Re: [RFC 6/8] nvmet: Be careful about using iomem accesses when dealing with p2pmem

2017-04-10 Thread Logan Gunthorpe
On 10/04/17 02:29 AM, Sagi Grimberg wrote: > What you are saying is surprising to me. The switch needs to preserve > ordering across different switch ports ?? > You are suggesting that there is a *switch-wide* state that tracks > MemRds never pass MemWrs across all the switch ports? That is a

tcm_qla2xxx and target mode not stable at all on 4.10+

2017-04-10 Thread Laurence Oberman
Hello, I have had issues with the target mode working since moving to 4.10+. I am using a qla25xx card at 8Gbit. Latest testing with 4.11 RC6 sees the same issue. Going back to 4.10.4 I can map targets but when I use my jammer I get into other issues. It's rock solid on 4.9 with the jammer. I

[PATCH v2 3/4] sd: Make synchronize cache upon shutdown asynchronous

2017-04-10 Thread Bart Van Assche
This patch avoids that sd_shutdown() hangs on the SYNCHRONIZE CACHE command if the block layer queue has been stopped by scsi_target_block(). Signed-off-by: Bart Van Assche Cc: Israel Rukshin Cc: Max Gurtovoy Cc: Hannes

[Bug 195285] qla2xxx FW immediatly crashing after target start

2017-04-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=195285 himanshu.madh...@cavium.com (himanshu.madh...@qlogic.com) changed: What|Removed |Added CC|

Re: [REEEEPOST] bnx2i + bnx2fc: convert to generic workqueue (#3)

2017-04-10 Thread Chad Dupuis
On Mon, 10 Apr 2017, 5:12pm -, Sebastian Andrzej Siewior wrote: > This is a repost to get the patches applied against v4.11-rc6. mkp's scsi > for-next tree can be merged with no conflicts. > > The last repost [0] was not merged and stalled after Martin pinged Chad > [1]. He didn't even

[PATCH] ibmvscsis: Do not send aborted task response

2017-04-10 Thread Bryant G. Ly
The driver is sending a response to the aborted task response along with LIO sending the tmr response. ibmvscsis_tgt does not send the response to the client until release_cmd time. The reason for this was because if we did it at queue_status time, then the client would be free to reuse the tag

Re: [PATCH 03/27] block: implement splitting of REQ_OP_WRITE_ZEROES bios

2017-04-10 Thread Anthony Youngman
s/past/paste/ On 05/04/17 15:21, Christoph Hellwig wrote: Copy and past the REQ_OP_WRITE_SAME code to prepare to implementations that limit the write zeroes size. Cheers, Wol

[PATCH v2 1/4] Introduce scsi_start_queue()

2017-04-10 Thread Bart Van Assche
This patch does not change any functionality. Signed-off-by: Bart Van Assche Cc: Israel Rukshin Cc: Max Gurtovoy Cc: Hannes Reinecke --- drivers/scsi/scsi_lib.c | 25 +++--

[PATCH v2 0/4] Avoid that __scsi_remove_device() hangs

2017-04-10 Thread Bart Van Assche
Several weeks ago Israel Rukshin reported that __scsi_remove_device() hangs if it is waiting for the SYNCHRONIZE CACHE command submitted by the sd driver to finish if the block layer queue is stopped and does not get restarted. This patch series avoids that that hang occurs. Bart Van Assche (4):

[PATCH v2 4/4] Avoid that __scsi_remove_device() hangs

2017-04-10 Thread Bart Van Assche
Since scsi_target_unblock() uses starget_for_each_device(), since starget_for_each_device() uses scsi_device_get(), since scsi_device_get() fails after unloading of the LLD kernel module has been started scsi_target_unblock() may skip devices that were affected by scsi_target_block(). Ensure that

[PATCH v2 2/4] Introduce scsi_execute_async()

2017-04-10 Thread Bart Van Assche
Move the code for submitting a SCSI command from scsi_execute() into scsi_build_rq(). Introduce scsi_execute_async(). This patch does not change any functionality. Signed-off-by: Bart Van Assche Cc: Israel Rukshin Cc: Max Gurtovoy

Re: [PATCHv6 8/8] scsi: inline command aborts

2017-04-10 Thread Bart Van Assche
On Thu, 2017-04-06 at 15:36 +0200, Hannes Reinecke wrote: > The block layer always calls the timeout function from a workqueue > context, so there is no need to have yet another workqueue for > running command aborts. > > [ ... ] > @@ -271,10 +266,14 @@ enum blk_eh_timer_return

[PATCH 2/4] scsi: scsi_dh_alua: create alua_rtpg_print() for alua_rtpg() sdev_printk

2017-04-10 Thread Mauricio Faria de Oliveira
Factor out the sdev_printk() statement with the RTPG information in alua_rtpg() into a new function, alua_rtpg_print(), that will also be used in the following patch. The only functional difference is that the 'valid_states' value is now referenced via a pointer, and can be NULL (optional), in

[PATCH 3/4] scsi: scsi_dh_alua: print changes to RTPG state of other PGs too

2017-04-10 Thread Mauricio Faria de Oliveira
Currently, alua_rtpg() can change the 'state' and 'preferred' values for the current port group _and_ of other port groups present in the response buffer/descriptors. However, it reports such changes _only_ for the current port group (i.e., only for 'pg' but not for 'tmp_pg'). This might cause

[PATCH 0/4] scsi: scsi_dh_alua: handle target port unavailable state

2017-04-10 Thread Mauricio Faria de Oliveira
This patch series resolves a problem in which all paths of a multipath device became _permanently_ failed after a storage system had moved both controllers into a _temporarily_ unavailable state (that is SCSI_ACCESS_STATE_UNAVAILABLE). This happened because once scsi_dh_alua had set the

[PATCH 1/4] scsi: scsi_dh_alua: allow I/O in the target port unavailable state

2017-04-10 Thread Mauricio Faria de Oliveira
According to SPC-4 (5.15.2.4.5 Unavailable state), the unavailable state may (or may not) transition to other states (e.g., microcode downloading or hardware error, which may be temporary or permanent conditions, respectively). But, scsi_dh_alua currently fails the I/O requests early once that

[PATCH 4/4] scsi: scsi_dh_alua: do not print target port group state if it remains unavailable

2017-04-10 Thread Mauricio Faria de Oliveira
Path checkers will periodically check all paths to a target port group in unavailable state more often (as they are 'failed'), possibly for a long or indefinite period of time, or for a large number of paths. That might end up flooding the kernel log with the scsi_dh_alua target port group state

sd: wait for slow devices on shutdown path

2017-04-10 Thread Henrique de Moraes Holschuh
Author: Henrique de Moraes Holschuh Date: Wed Feb 1 20:42:02 2017 -0200 sd: wait for slow devices on shutdown path Wait 1s during suspend/shutdown for the device to settle after we issue the STOP command. Otherwise we race ATA SSDs to powerdown,

Re: Race to power off harming SATA SSDs

2017-04-10 Thread Henrique de Moraes Holschuh
On Mon, 10 Apr 2017, Bart Van Assche wrote: > On Mon, 2017-04-10 at 20:21 -0300, Henrique de Moraes Holschuh wrote: > > A proof of concept patch is attached > > Thank you for the very detailed write-up. Sorry but no patch was attached > to the e-mail I received from you ... Indeed. It should

Re: Race to power off harming SATA SSDs

2017-04-10 Thread Henrique de Moraes Holschuh
On Tue, 11 Apr 2017, Tejun Heo wrote: > > The kernel then continues the shutdown path while the SSD is still > > preparing itself to be powered off, and it becomes a race. When the > > kernel + firmware wins, platform power is cut before the SSD has > > finished (i.e. the SSD is subject to an

Re: Race to power off harming SATA SSDs

2017-04-10 Thread Tejun Heo
Hello, On Mon, Apr 10, 2017 at 08:21:19PM -0300, Henrique de Moraes Holschuh wrote: ... > Per spec (and device manuals), SCSI, SATA and ATA-attached SSDs must be > informed of an imminent poweroff to checkpoint background tasks, flush > RAM caches and close logs. For SCSI SSDs, you must issue a

Re: [PATCH 0/4] scsi: scsi_dh_alua: handle target port unavailable state

2017-04-10 Thread Mauricio Faria de Oliveira
On 04/10/2017 10:17 PM, Mauricio Faria de Oliveira wrote: For documentation purposes, I'll reply to this cover letter with the analysis of such cases of this problem, and the accompanying messages from kernel logs. Here it goes, for anyone interested. Scenario: 4 LUNs, 2 target port groups

Re: Race to power off harming SATA SSDs

2017-04-10 Thread Henrique de Moraes Holschuh
On Mon, 10 Apr 2017, James Bottomley wrote: > On Tue, 2017-04-11 at 08:52 +0900, Tejun Heo wrote: > [...] > > > Any comments? Any clues on how to make the delay "smarter" to > > > trigger only once during platform shutdown, but still trigger per > > > -device when doing per-device hotswapping ?

Re: [PATCHv6 3/8] scsi: always send command aborts

2017-04-10 Thread Bart Van Assche
On Thu, 2017-04-06 at 15:36 +0200, Hannes Reinecke wrote: > When a command has timed out we always should be sending an > abort; with the previous code a failed abort might signal > SCSI EH to start, and all other timed out commands will > never be aborted, even though they might belong to a >

Re: [PATCHv6 2/8] sd: Return SUCCESS in sd_eh_action() after device offline

2017-04-10 Thread Bart Van Assche
On Thu, 2017-04-06 at 15:36 +0200, Hannes Reinecke wrote: > If sd_eh_action() decides to take the device offline there is > no point in returning FAILED, as taking the device offline > is the ultimate step in SCSI EH anyway. > So further escalation via SCSI EH is not likely to make a > difference

Race to power off harming SATA SSDs

2017-04-10 Thread Henrique de Moraes Holschuh
Summary: Linux properly issues the SSD prepare-to-poweroff command to SATA SSDs, but it does not wait for long enough to ensure the SSD has carried it through. This causes a race between the platform power-off path, and the SSD device. When the SSD loses the race, its power is cut while it is

Re: [PATCH v4 1/6] blk-mq: Make it safe to use RCU to iterate over blk_mq_tag_set.tag_list

2017-04-10 Thread Christoph Hellwig
Looks good, Reviewed-by: Christoph Hellwig

Re: [PATCH v4 2/6] blk-mq: Restart a single queue if tag sets are shared

2017-04-10 Thread Christoph Hellwig
Looks good, Reviewed-by: Christoph Hellwig

Re: [PATCH v4 3/6] blk-mq: Clarify comments in blk_mq_dispatch_rq_list()

2017-04-10 Thread Christoph Hellwig
Looks good, Reviewed-by: Christoph Hellwig

Re: [PATCH 1/2] scsi: sd: Separate zeroout and discard command choices

2017-04-10 Thread h...@lst.de
On Fri, Apr 07, 2017 at 07:59:08PM +, Bart Van Assche wrote: > On Wed, 2017-04-05 at 07:41 -0400, Martin K. Petersen wrote: > > +static ssize_t > > +zeroing_mode_store(struct device *dev, struct device_attribute *attr, > > + const char *buf, size_t count) > > +{ > > + struct

Re: [PATCHv6 8/8] scsi: inline command aborts

2017-04-10 Thread Christoph Hellwig
Looks good, Reviewed-by: Christoph Hellwig

Re: [PATCH v4 4/6] blk-mq: Introduce blk_mq_delay_run_hw_queue()

2017-04-10 Thread Christoph Hellwig
> + if (msecs == 0) > + kblockd_schedule_work_on(blk_mq_hctx_next_cpu(hctx), > + &hctx->run_work); > + else > + kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx), > +

Re: [PATCH v4 5/6] scsi: Avoid that SCSI queues get stuck

2017-04-10 Thread Christoph Hellwig
Looks good, Reviewed-by: Christoph Hellwig

[PATCH 1/6] scsi: hisi_sas: workaround STP link SoC bug

2017-04-10 Thread John Garry
From: Xiaofei Tan After resetting the controller, the process of scanning SATA disks attached to an expander may fail occasionally. The issue is that the controller can't close the STP link created by target if the max link time is 0. To work around this issue, we reject

[PATCH 4/6] scsi: hisi_sas: add v2 hw internal abort timeout workaround

2017-04-10 Thread John Garry
This patch is a workaround for a SoC bug where an internal abort command may timeout. In v2 hw, the channel should become idle in order to finish abort process. If the target side has been sending HOLD, host side channel failed to complete the frame to send, and can not enter the idle state. Then

[PATCH 3/6] scsi: hisi_sas: workaround SoC about abort timeout bug

2017-04-10 Thread John Garry
From: Xiaofei Tan This patch adds a workaround solution for a SoC bug which may cause SoC logic fatal error when disabling a PHY. Then we find internal abort IO timeout may occur, and the controller IO breakpoint may be corrupted. We work around this SoC bug by optimizing

[PATCH 0/6] hisi_sas: v2 hw SoC bug workarounds

2017-04-10 Thread John Garry
This patchset introduces some v2 hw bug workarounds. Mostly they are related to SATA devices, but there is also a workaround for a scenario when internal abort command may timeout. The general rule for implementing workarounds was to do it in the hw layer, as the next hw revision should not

Re: [PATCH 1/6] scsi: hisi_sas: workaround STP link SoC bug

2017-04-10 Thread Johannes Thumshirn
On Mon, Apr 10, 2017 at 09:21:56PM +0800, John Garry wrote: > From: Xiaofei Tan > > After resetting the controller, the process of scanning SATA disks > attached to an expander may fail occasionally. The issue is that > the controller can't close the STP link created by

[PATCH 5/6] scsi: hisi_sas: fix NULL deference when TMF timeouts

2017-04-10 Thread John Garry
If a TMF times out (maybe due to the unlikely scenario of an expander being unplugged when TMF for remote device is active), when we eventually try to free the slot, we crash as we dereference the slot's task, which has already been released. As a fix, add checks in the slot release code for a NULL

Re: [RFC 6/8] nvmet: Be careful about using iomem accesses when dealing with p2pmem

2017-04-10 Thread Sagi Grimberg
Sagi As long as legA, legB and the RC are all connected to the same switch then ordering will be preserved (I think many other topologies also work). Here is how it would work for the problem case you are concerned about (which is a read from the NVMe drive). 1. Disk device DMAs out the

[PATCH] ipr: do not set DID_PASSTHROUGH on CHECK CONDITION

2017-04-10 Thread Mauricio Faria de Oliveira
On a dual controller setup with multipath enabled, some MEDIUM ERRORs caused both paths to be failed, thus I/O got queued/blocked since the 'queue_if_no_path' feature is enabled by default on IPR controllers. This example disabled 'queue_if_no_path' so the I/O failure is seen at the sg_dd

Re: [PATCHv6 5/8] scsi: make eh_eflags persistent

2017-04-10 Thread Bart Van Assche
On Thu, 2017-04-06 at 15:36 +0200, Hannes Reinecke wrote: > +is invoked to schedule an asynchrous abort. ^^ Sorry that I hadn't noticed this before but if you have to repost this patch please fix the spelling of this word. Bart.

Re: Race to power off harming SATA SSDs

2017-04-10 Thread James Bottomley
On Tue, 2017-04-11 at 08:52 +0900, Tejun Heo wrote: [...] > > Any comments? Any clues on how to make the delay "smarter" to > > trigger only once during platform shutdown, but still trigger per > > -device when doing per-device hotswapping ? > > So, if this is actually an issue, sure, we can