Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Ming Lei
Hi Mike, On Tue, Sep 19, 2017 at 07:50:06PM -0400, Mike Snitzer wrote: > On Tue, Sep 19 2017 at 7:25pm -0400, > Bart Van Assche wrote: > > > On Wed, 2017-09-20 at 06:44 +0800, Ming Lei wrote: > > > For this issue, it isn't same between SCSI and dm-rq. > > > > > > We

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Ming Lei
On Tue, Sep 19, 2017 at 04:49:15PM +, Bart Van Assche wrote: > On Wed, 2017-09-20 at 00:04 +0800, Ming Lei wrote: > > Run queue at end_io is definitely wrong, because blk-mq has SCHED_RESTART > > to do that already. > > Sorry but I disagree. If SCHED_RESTART is set that causes the blk-mq core

[PATCH] mpt3sas: downgrade full copy_from_user to access_ok check

2017-09-19 Thread Meng Xu
Since right after the user copy, we are going to memset(, 0, sizeof(karg)), I guess an access_ok check is enough? Signed-off-by: Meng Xu --- drivers/scsi/mpt3sas/mpt3sas_ctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Ming Lei
On Tue, Sep 19, 2017 at 06:42:30PM +, Bart Van Assche wrote: > On Wed, 2017-09-20 at 00:55 +0800, Ming Lei wrote: > > On Wed, Sep 20, 2017 at 12:49 AM, Bart Van Assche > > wrote: > > > On Wed, 2017-09-20 at 00:04 +0800, Ming Lei wrote: > > > > Run queue at end_io is

[GIT PULL] SCSI fixes for 4.14-rc1

2017-09-19 Thread James Bottomley
This is a set of five small fixes: one is a null deref fix which is pretty critical for the fc transport class and one fixes a potential security issue of sg leaking kernel information. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Mike Snitzer
On Tue, Sep 19 2017 at 7:25pm -0400, Bart Van Assche wrote: > On Wed, 2017-09-20 at 06:44 +0800, Ming Lei wrote: > > For this issue, it isn't same between SCSI and dm-rq. > > > > We don't need to run queue in .end_io of dm, and the theory is > > simple, otherwise it

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Bart Van Assche
On Wed, 2017-09-20 at 06:44 +0800, Ming Lei wrote: > For this issue, it isn't same between SCSI and dm-rq. > > We don't need to run queue in .end_io of dm, and the theory is > simple, otherwise it isn't performance issue, and should be I/O hang. > > 1) every dm-rq's request is 1:1 mapped to

[RESEND PATCH v4 6/6] libsas: direct call probe and destruct

2017-09-19 Thread Jason Yan
Commit 87c8331f ([SCSI] libsas: prevent domain rediscovery competing with ata error handling) introduced the disco mutex to prevent rediscovery from competing with ata error handling and put the whole revalidation in the mutex. But the rphy add/remove needs to wait for the error handling which also

[RESEND PATCH v4 3/6] libsas: make the event threshold configurable

2017-09-19 Thread Jason Yan
Add a sysfs attr that an LLDD can configure for every host. We made an example in hisi_sas. Other LLDDs using libsas can implement it if they want. Suggested-by: Hannes Reinecke Signed-off-by: Jason Yan CC: John Garry CC: Johannes

[RESEND PATCH v4 1/6] libsas: Use dynamic alloced work to avoid sas event lost

2017-09-19 Thread Jason Yan
Currently libsas hotplug work is static: every sas event type has its own static work, and the LLDD driver queues the hotplug work into shost->work_q. If the LLDD driver posts a burst of hotplug events to libsas, the hotplug events may be pending in the workqueue like shost->work_q new work[PORTE_BYTES_DMAED] -->

[RESEND PATCH v4 5/6] libsas: libsas: use flush_workqueue to process disco events synchronously

2017-09-19 Thread Jason Yan
Use flush_workqueue to ensure the disco and revalidate events are processed synchronously. Signed-off-by: Jason Yan CC: John Garry CC: Johannes Thumshirn CC: Ewan Milne CC: Christoph Hellwig CC:

[RESEND PATCH v4 2/6] libsas: shut down the PHY if events reached the threshold

2017-09-19 Thread Jason Yan
If the PHY bursts too many events, we will alloc a lot of events for the worker. This may lead to memory exhaustion. Dan Williams suggested shutting down the PHY if the events reached the threshold, because in this case the PHY may have gone into some erroneous state. Users can re-enable the PHY

[PATCH V6 3/3] scsi: Align block queue to dma_get_cache_alignment()

2017-09-19 Thread Huacai Chen
In non-coherent DMA mode, kernel uses cache flushing operations to maintain I/O coherency, so scsi's block queue should be aligned to ARCH_DMA_MINALIGN. Otherwise, it will cause data corruption, at least on MIPS: Step 1, dma_map_single Step 2, cache_invalidate (no writeback)

[PATCH V6 2/3] dma-mapping: Rework dma_get_cache_alignment() function

2017-09-19 Thread Huacai Chen
Make dma_get_cache_alignment() accept a 'dev' argument. As a result, it can return different alignments due to different devices' I/O cache coherency. For compatibility, make all existing callers pass a NULL dev argument. Cc: sta...@vger.kernel.org Signed-off-by: Huacai Chen

[PATCH V6 1/3] dma-mapping: Introduce device_is_coherent() as a helper

2017-09-19 Thread Huacai Chen
We will use device_is_coherent() as a helper function, which will be used in the next patch. There is a MIPS-specific plat_device_is_coherent(), but we need a more generic solution, so add and use a new function pointer in dma_map_ops. Cc: sta...@vger.kernel.org Signed-off-by: Huacai Chen

[PATCH] scsi: ufs: fix a pclint warning

2017-09-19 Thread Zang Leigang
Signed-off-by: Zang Leigang diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 794a4600e952..deb77535b8c9 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -3586,7 +3586,7 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba

Re: [PATCH] scsi: aacraid: Add a small delay after IOP reset

2017-09-19 Thread Guilherme G. Piccoli
On 09/19/2017 02:05 PM, James Bottomley wrote: > Actually, the whole problem sounds like a posted write. Likely the > write that causes the reset doesn't get flushed until the read checking > if the reset has succeeded, which might explain the 100% initial > failure. Why not throw away that

Re: [PATCH] fcoe-utils: Fix get_ctlr_num() for large ctlr_* indices

2017-09-19 Thread Hannes Reinecke
On 09/18/2017 04:35 PM, Andrey Grafin wrote: > Each creation of a FCoE device increases counter which is used as a suffix > in a FCoE device name in sysfs (i.e. /sys/bus/fcoe/devices/ctlr_1). > Once this counter reaches the value of two digits (10 and larger), > get_ctlr_num() stopped working
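
A minimal sketch of the kind of parsing the report above is about, assuming the device name always has the form ctlr_<decimal index>. The helper name parse_ctlr_index() is hypothetical and this is not the fcoe-utils code; it only illustrates reading the whole numeric suffix with strtoul() so that ctlr_10 and larger indices are handled the same way as ctlr_1.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not the fcoe-utils implementation): extract the
 * numeric suffix from a sysfs-style name such as "ctlr_1" or "ctlr_10".
 * Returns the index on success, or -1 if the name is malformed.
 */
static long parse_ctlr_index(const char *name)
{
	static const char prefix[] = "ctlr_";
	const size_t plen = sizeof(prefix) - 1;
	char *end;
	unsigned long val;

	if (!name || strncmp(name, prefix, plen) != 0)
		return -1;

	name += plen;
	/* Require a leading digit so that "", "+1" or " 1" are rejected. */
	if (*name < '0' || *name > '9')
		return -1;

	errno = 0;
	/* Parse every digit of the suffix, not just the first one. */
	val = strtoul(name, &end, 10);
	if (errno != 0 || *end != '\0' || val > 0x7fffffffUL)
		return -1;

	return (long)val;
}
```

Parsing to the end of the string and checking that nothing is left over is what keeps multi-digit indices from being cut short or silently misread.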

[PATCH] aacraid: fix potential double-fetch issue

2017-09-19 Thread Meng Xu
While examining the kernel source code, I found a dangerous operation that could turn into a double-fetch situation (a race condition bug) where the same userspace memory region is fetched twice into the kernel, with sanity checks after the first fetch but no checks after the second fetch. 1.

[PATCH] scsi: SSDs can timeout during FS init because of too many unmaps

2017-09-19 Thread Bill Kuzeja
I encountered this issue putting XFS on several brands of SSDs on my system. During initialization, I would see a bunch of timeouts on WRITE_SAME_16 commands, which would get aborted, reissued, and complete. The logs look like this: kernel: sd 2:0:1:0: attempting task abort!

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Ming Lei
On Wed, Sep 20, 2017 at 12:49 AM, Bart Van Assche wrote: > On Wed, 2017-09-20 at 00:04 +0800, Ming Lei wrote: >> Run queue at end_io is definitely wrong, because blk-mq has SCHED_RESTART >> to do that already. > > Sorry but I disagree. If SCHED_RESTART is set that causes

[PATCH V3 8/9] pm80xx : panic on ncq error cleaning up the read log.

2017-09-19 Thread Viswas G
When there's an error in 'ncq mode' the host has to read the ncq error log (10h) to clear the error state. However, the ccb that is set up for doing this isn't set up so that the previous state is cleared. If the ccb was previously used for an IO n_elems is set and pm8001_ccb_task_free()

[PATCH V3 9/9] pm80xx : corrected linkrate value.

2017-09-19 Thread Viswas G
Corrected the value defined for LINKRATE_60 (6 Gig). Signed-off-by: Raj Dinesh Signed-off-by: Viswas G Acked-by: Jack Wang --- drivers/scsi/pm8001/pm80xx_hwi.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

[PATCH V3 6/9] pm80xx : modified port reset timer value for PM8006 card

2017-09-19 Thread Viswas G
Added port reset timer value as 2000ms for PM8006 sata controller. Signed-off-by: Deepak Ukey Signed-off-by: Viswas G Acked-by: Jack Wang --- drivers/scsi/pm8001/pm80xx_hwi.c | 6 ++ 1 file changed, 6

[PATCH V3 7/9] pm80xx : corrected SATA abort handling sequence.

2017-09-19 Thread Viswas G
Modified SATA abort handling with following steps: 1) Set device state as recovery. 2) Send phy reset. 3) Wait for reset completion. 4) After successful reset, abort all IO's to the device. 5) After aborting all IO's to device, set device state as operational. Signed-off-by: Deepak Ukey

RE: [PATCH] sd: Limit WRITE SAME / WRITE SAME(16) w/UNMAP length for certain devices

2017-09-19 Thread Kuzeja, William
Ewan, I like it, more generic than my patch. I never saw the other cases, so I limited my patch to WS16. Acked-by: Bill Kuzeja On Tue-09-19 at 12:14 Ewan D. Milne wrote: > Some devices do not support a WRITE SAME / WRITE SAME(16) with the > UNMAP > bit set up to

[RESEND PATCH v4 4/6] libsas: Use new workqueue to run sas event and disco event

2017-09-19 Thread Jason Yan
Currently all libsas work items are queued to the scsi host workqueue, including sas event work posted by the LLDD and sas discovery work, and a sas hotplug flow may be divided into several work items. E.g. when libsas receives a PORTE_BYTES_DMAED event, we currently process it in the following steps: sas_form_port --- run in work in

[RESEND PATCH v4 0/6] Enhance libsas hotplug feature

2017-09-19 Thread Jason Yan
Thanks to Martin K. Petersen for applying some of the tidy-up patches, so I do not have to maintain these patches out of the tree. I will only send the rest of them in the next days if needed. Currently the libsas hotplug has some issues; Dan Williams reported a similar bug here before

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Bart Van Assche
On Wed, 2017-09-20 at 00:04 +0800, Ming Lei wrote: > Run queue at end_io is definitely wrong, because blk-mq has SCHED_RESTART > to do that already. Sorry but I disagree. If SCHED_RESTART is set that causes the blk-mq core to reexamine the software queues and the hctx dispatch list but not the

Re: [PATCH] scsi: aacraid: Add a small delay after IOP reset

2017-09-19 Thread James Bottomley
On Tue, 2017-09-19 at 08:52 -0700, Christoph Hellwig wrote: > On Tue, Sep 19, 2017 at 12:49:21PM -0300, Guilherme G. Piccoli wrote: > > > > On 09/19/2017 12:37 PM, Christoph Hellwig wrote: > > > > > > On Tue, Sep 19, 2017 at 12:11:55PM -0300, Guilherme G. Piccoli > > > wrote: > > > > > > > >  

[PATCH V3 2/9] pm80xx : ILA and inactive firmware version through sysfs

2017-09-19 Thread Viswas G
Added support to read ILA version and inactive firmware version from MPI configuration table and export through sysfs. Signed-off-by: Deepak Ukey Signed-off-by: Viswas G Acked-by: Jack Wang ---

[PATCH V3 3/9] pm80xx : Different SAS addresses for phys.

2017-09-19 Thread Viswas G
Different SAS addresses are assigned for each set of phys. Signed-off-by: Viswas G Acked-by: Jack Wang --- drivers/scsi/pm8001/pm8001_init.c | 13 + drivers/scsi/pm8001/pm80xx_hwi.c | 3 +-- 2 files changed, 10 insertions(+),

[PATCH V3 0/9] pm80xx updates

2017-09-19 Thread Viswas G
This patch set includes some bug fixes and enhancements for the pm80xx driver. Changes from V2: - Corrected date. Changes from V1: - sas_identify_frame_local structure moved to pm80xx_hwi.h - sata abort handling patch split to four patches. - tag allocation for

[PATCH V3 4/9] pm80xx : tag allocation for phy control request.

2017-09-19 Thread Viswas G
tag is taken from the tag pool instead of using the hardcoded tag value(1). Signed-off-by: Deepak Ukey Signed-off-by: Viswas G Acked-by: Jack Wang --- drivers/scsi/pm8001/pm8001_hwi.c | 3 +++

[PATCH V3 1/9] pm80xx : redefine sas_identify_frame structure

2017-09-19 Thread Viswas G
sas_identify structure defined by pm80xx doesn't have CRC field. So added a new sas_identify structure without CRC. v2: - Since the structure changes is applicable for only pm80xx, sas_identify_frame_local structure moved to pm80xx_hwi.h. Signed-off-by: Raj Dinesh

[PATCH V3 5/9] pm80xx : cleanup in pm8001_abort_task function.

2017-09-19 Thread Viswas G
Signed-off-by: Deepak Ukey Signed-off-by: Viswas G Acked-by: Jack Wang --- drivers/scsi/pm8001/pm8001_sas.c | 49 +++- 1 file changed, 13 insertions(+), 36 deletions(-) diff

Re: [PATCH] scsi: aacraid: Add a small delay after IOP reset

2017-09-19 Thread Guilherme G. Piccoli
On 09/19/2017 12:52 PM, Christoph Hellwig wrote: > On Tue, Sep 19, 2017 at 12:49:21PM -0300, Guilherme G. Piccoli wrote: >> On 09/19/2017 12:37 PM, Christoph Hellwig wrote: >>> On Tue, Sep 19, 2017 at 12:11:55PM -0300, Guilherme G. Piccoli wrote: src_writel(dev, MUnit.IDR,

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Ming Lei
On Tue, Sep 19, 2017 at 11:56:03AM -0400, Mike Snitzer wrote: > On Tue, Sep 19 2017 at 11:36am -0400, > Bart Van Assche wrote: > > > On Tue, 2017-09-19 at 13:43 +0800, Ming Lei wrote: > > > On Mon, Sep 18, 2017 at 03:18:16PM +, Bart Van Assche wrote: > > > > If you

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Ming Lei
On Tue, Sep 19, 2017 at 11:48:23AM -0400, Mike Snitzer wrote: > On Tue, Sep 19 2017 at 1:43am -0400, > Ming Lei wrote: > > > On Mon, Sep 18, 2017 at 03:18:16PM +, Bart Van Assche wrote: > > > On Sun, 2017-09-17 at 20:40 +0800, Ming Lei wrote: > > > > "if no request has

Re: [PATCH] scsi: SSDs can timeout during FS init because of too many unmaps

2017-09-19 Thread Ewan D. Milne
On Tue, 2017-09-19 at 09:02 -0400, Bill Kuzeja wrote: > I encountered this issue putting XFS on several brands of SSDs on my > system. During initialization, I would see a bunch of timeouts on > WRITE_SAME_16 commands, which would get aborted, reissued, and complete. > The logs look like this: >

[PATCH] sd: Limit WRITE SAME / WRITE SAME(16) w/UNMAP length for certain devices

2017-09-19 Thread Ewan D. Milne
From: "Ewan D. Milne" Some devices do not support a WRITE SAME / WRITE SAME(16) with the UNMAP bit set up to the length specified in the MAXIMUM WRITE SAME LENGTH field in the block limits VPD page (or, the field is zero, indicating there is no limit). Limit the length by the

[PATCH] scsi: aacraid: Add a small delay after IOP reset

2017-09-19 Thread Guilherme G. Piccoli
Commit 0e9973ed3382 ("scsi: aacraid: Add periodic checks to see IOP reset status") changed the way the driver checks if a reset succeeded. Now, after an IOP reset, aacraid immediately starts polling a register to verify the reset is complete. This behavior causes regressions on the reset path in

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Bart Van Assche
On Tue, 2017-09-19 at 13:43 +0800, Ming Lei wrote: > On Mon, Sep 18, 2017 at 03:18:16PM +, Bart Van Assche wrote: > > If you are still looking at removing the blk_mq_delay_run_hw_queue() calls > > then I think you are looking in the wrong direction. What kind of problem > > are you trying to

Re: [PATCH] scsi: aacraid: Add a small delay after IOP reset

2017-09-19 Thread Guilherme G. Piccoli
On 09/19/2017 12:37 PM, Christoph Hellwig wrote: > On Tue, Sep 19, 2017 at 12:11:55PM -0300, Guilherme G. Piccoli wrote: >> src_writel(dev, MUnit.IDR, IOP_SRC_RESET_MASK); >> + >> +msleep(5000); > > src_writel is a writel, and thus a posted MMIO write. You'll need > to have to a read

Re: [PATCH] scsi: aacraid: Add a small delay after IOP reset

2017-09-19 Thread Christoph Hellwig
On Tue, Sep 19, 2017 at 12:49:21PM -0300, Guilherme G. Piccoli wrote: > On 09/19/2017 12:37 PM, Christoph Hellwig wrote: > > On Tue, Sep 19, 2017 at 12:11:55PM -0300, Guilherme G. Piccoli wrote: > >>src_writel(dev, MUnit.IDR, IOP_SRC_RESET_MASK); > >> + > >> + msleep(5000); > > > >

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Bart Van Assche
On Tue, 2017-09-19 at 11:48 -0400, Mike Snitzer wrote: > This thread proves that it is definitely brittle to be relying on fixed > delays like this: > https://patchwork.kernel.org/patch/9703249/ Hello Mike, Sorry but I think that's a misinterpretation of my patch. I came up with that patch

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Mike Snitzer
On Tue, Sep 19 2017 at 11:36am -0400, Bart Van Assche wrote: > On Tue, 2017-09-19 at 13:43 +0800, Ming Lei wrote: > > On Mon, Sep 18, 2017 at 03:18:16PM +, Bart Van Assche wrote: > > > If you are still looking at removing the blk_mq_delay_run_hw_queue() calls > > >

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Mike Snitzer
On Tue, Sep 19 2017 at 11:52am -0400, Bart Van Assche wrote: > On Tue, 2017-09-19 at 11:48 -0400, Mike Snitzer wrote: > > This thread proves that it is definitely brittle to be relying on fixed > > delays like this: > > https://patchwork.kernel.org/patch/9703249/ > >

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Mike Snitzer
On Tue, Sep 19 2017 at 1:43am -0400, Ming Lei wrote: > On Mon, Sep 18, 2017 at 03:18:16PM +, Bart Van Assche wrote: > > On Sun, 2017-09-17 at 20:40 +0800, Ming Lei wrote: > > > "if no request has completed before the delay has expired" can't be a > > > reason to rerun

Re: qla2xxx: MSI-X: Unsupported ISP 2432 SSVID/SSDID (0x103C,0x7041)

2017-09-19 Thread Meelis Roos
> On 08/19/2017 10:41 PM, Meelis Roos wrote: > > Hello, I just tried Linux with the latest kernel (4.13-rc5+git) on a HP > > DL360 G6 with HP branded ISP2432 HBA. The driver mentions unsupported > > model of the card: > > > > [3.868589] scsi host1: qla2xxx > > [3.871696] qla2xxx

Re: [PATCH V6 2/3] dma-mapping: Rework dma_get_cache_alignment() function

2017-09-19 Thread Christoph Hellwig
> mdev->limits.reserved_mtts = ALIGN(mdev->limits.reserved_mtts * > mdev->limits.mtt_seg_size, > -dma_get_cache_alignment()) / > mdev->limits.mtt_seg_size; > +dma_get_cache_alignment(NULL)) / >

Is the possible cross-talking between unrelated file-descriptors on bsg-device by design?

2017-09-19 Thread Benjamin Block
Hello linux-block, I wrote some tests recently to test patches against bsg.c and bsg-lib.c, and while writing those I noticed something strange: When you use the write() and read() call on multiple file-descriptors for a single bsg-device (FC or SCSI), it is possible that you get cross-talk

Re: [PATCH] scsi: aacraid: Add a small delay after IOP reset

2017-09-19 Thread Christoph Hellwig
On Tue, Sep 19, 2017 at 12:11:55PM -0300, Guilherme G. Piccoli wrote: > src_writel(dev, MUnit.IDR, IOP_SRC_RESET_MASK); > + > + msleep(5000); src_writel is a writel, and thus a posted MMIO write. You'll need to do a read first to make it a reliable timing base.
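
To illustrate the point about posted writes, here is a toy userspace model, not aacraid or kernel code. All names (toy_dev, toy_writel, toy_readl, toy_write_flush) are made up for this sketch, and it assumes the simplest possible behaviour: a write sits in a buffer until a read from the same device forces it out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model of a posted MMIO write (all names are hypothetical; this is
 * not aacraid code). A write is parked in a buffer and reaches the
 * "device" only when something forces a flush, such as a read from the
 * same device.
 */
struct toy_dev {
	uint32_t reg;        /* value the device has actually seen */
	uint32_t posted_val; /* write still sitting in the buffer */
	bool     posted;     /* true while a write is pending */
};

/* Posted write: returns immediately, the device has not seen it yet. */
static void toy_writel(struct toy_dev *dev, uint32_t val)
{
	dev->posted_val = val;
	dev->posted = true;
}

/* A read cannot pass the earlier write, so it drains the buffer first. */
static uint32_t toy_readl(struct toy_dev *dev)
{
	if (dev->posted) {
		dev->reg = dev->posted_val;
		dev->posted = false;
	}
	return dev->reg;
}

/* Write, then read back so a later delay starts after the device saw it. */
static void toy_write_flush(struct toy_dev *dev, uint32_t val)
{
	toy_writel(dev, val);
	(void)toy_readl(dev);
}
```

In this model a delay started right after toy_writel() could begin before the device has seen the write, while a delay started after toy_write_flush() cannot, which is the timing concern raised in the reply.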

Re: Is the possible cross-talking between unrelated file-descriptors on bsg-device by design?

2017-09-19 Thread Douglas Gilbert
On 2017-09-19 10:56 AM, Benjamin Block wrote: Hello linux-block, I wrote some tests recently to test patches against bsg.c and bsg-lib.c, and while writing those I noticed something strange: When you use the write() and read() call on multiple file-descriptors for a single bsg-device (FC or

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-19 Thread Bart Van Assche
On Wed, 2017-09-20 at 00:55 +0800, Ming Lei wrote: > On Wed, Sep 20, 2017 at 12:49 AM, Bart Van Assche > wrote: > > On Wed, 2017-09-20 at 00:04 +0800, Ming Lei wrote: > > > Run queue at end_io is definitely wrong, because blk-mq has SCHED_RESTART > > > to do that already.