Re: [PATCH] scsi: ata: don't reset three times if device is offline for SAS host

2018-02-27 Thread chenxiang (M)
Hi Tejun, On 2018/2/28 2:19, Tejun Heo wrote: Hello, On Mon, Feb 26, 2018 at 07:45:37PM +0800, chenxiang (M) wrote: So, if there are real consequences, we can definitely add a way to short-circuit the recovery logic but let's do that by adding proper signaling rather than testing for driver type.

[PATCH v2 2/2] scsi_io_completion convert BUG to WARN calls

2018-02-27 Thread Douglas Gilbert
ChangeLog: - convert 3 BUG calls to WARN calls - un-invert some conditional logic to uncover the real fast path - try to improve the comments, including noting what the bool return value from scsi_end_request() means Signed-off-by: Douglas Gilbert ---

[PATCH v2 0/2] scsi_io_completion cleanup and fix CONDITION MET handling

2018-02-27 Thread Douglas Gilbert
This patch started as an attempt to fix the erroneous handling of CONDITION MET, a relatively rare special case. A solution meant adding another special case to the already complicated scsi_io_completion() function. To better understand that function the author found it useful to refactor the

[PATCH v2 1/2] scsi_io_completion split, fix CONDITION MET handling

2018-02-27 Thread Douglas Gilbert
The "ChangeLog for v1" section in 0/2 (the cover letter) of this patch set outlines the changes in this patch. Signed-off-by: Douglas Gilbert --- drivers/scsi/scsi_lib.c | 297 +--- include/scsi/scsi.h | 2 + 2 files

Re: [bug report] Don't enter SCSI error handler on kernel 4.16-rc1

2018-02-27 Thread chenxiang (M)
On 2018/2/27 22:57, Bart Van Assche wrote: On Tue, 2018-02-27 at 15:09 +0800, chenxiang (M) wrote: On 2018/2/26 23:25, Bart Van Assche wrote: On Mon, 2018-02-26 at 17:37 +0800, chenxiang (M) wrote: When i have a test on kernel 4.16-rc1, find a issue: running IO on SATA disk, then disable the disk

Re: [PATCH] scsi: libsas: defer ata device eh commands to libata

2018-02-27 Thread Jason Yan
On 2018/2/27 23:00, Jack Wang wrote: 2018-02-27 12:50 GMT+01:00 John Garry : On 27/02/2018 06:59, Jason Yan wrote: When ata device doing EH, some commands still attached with tasks are not passed to libata when abort failed or recover failed, so libata did not handle

Re: [PATCH] megaraid_sas: Re-enable WRITE SAME

2018-02-27 Thread Martin K. Petersen
Shivasharan, > So my understanding is when driver sets no_write_same flag, it is only > disabling BLKZEROOUT/write_zeroes requests from using WRITE SAME > command. Instead individual writes are sent to zero the blocks (if > BLKDEV_ZERO_NOFALLBACK flag is not set). Correct. > Our inhouse

Re: [PATCH] scsi: core: fix two wrong indentation cases

2018-02-27 Thread Martin K. Petersen
Jianchao, > No functional changes. Just fix two wrong indentation cases in > scsi_finish_command and scsi_decide_disposition. Applied to 4.17/scsi-queue, thanks! -- Martin K. Petersen Oracle Linux Engineering

Re: [PATCH] qedi: Fix kernel crash during port toggle.

2018-02-27 Thread Martin K. Petersen
Manish, > BUG: unable to handle kernel NULL pointer dereference at 0100 Applied to 4.16/scsi-fixes. Thanks! -- Martin K. Petersen Oracle Linux Engineering

Re: [PATCH] qla2xxx: Fix FC-NVMe LUN discovery

2018-02-27 Thread Martin K. Petersen
Himanshu, > commit a4239945b8ad ("scsi: qla2xxx: Add switch command to simplify > fabric discovery") introduced regression when it did not consider > FC-NVMe code path which broke NVMe LUN discovery. Applied to 4.16/scsi-fixes. Thank you! -- Martin K. Petersen Oracle Linux Engineering

Re: [PATCH] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-02-27 Thread jianchao.wang
Hi Bart Thanks for your kindly response and precious time to review this. On 02/28/2018 01:18 AM, Bart Van Assche wrote: > On Tue, 2018-02-27 at 17:06 +, Bart Van Assche wrote: >> On Tue, 2018-02-27 at 13:15 +0800, jianchao.wang wrote: >>> Can you share more details about this ? >> >> After

Re: [PATCH v2] scsi: ufs-qcom: add number of lanes for Tx and Rx links

2018-02-27 Thread Martin K. Petersen
Can, > Different platforms may have different number of lanes for the UFS > Tx/Rx links. Add parameter to device tree specifying how many lanes > should be configured for the UFS Tx/Rx links. And don't print err > message for clocks that are optional, this leads to unnecessary > confusion about

Re: [PATCH] scsi: qedi: fix build regression

2018-02-27 Thread Martin K. Petersen
Arnd, > A bugfix I did caused a build regression in some other randconfig > builds in a rare combination of options: > > In file included from drivers/scsi/qedi/qedi_fw.c:16: > drivers/scsi/qedi/qedi_gbl.h:26:38: error: array type has incomplete element > type 'struct qedi_debugfs_ops' >

Re: [PATCH] scsi: return BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()

2018-02-27 Thread Martin K. Petersen
Hannes, > When converting __scsi_error_from_host_byte() to BLK_STS error codes > the case DID_OK was forgotten, resulting in it always returning > an error. Applied to 4.17/scsi-queue. Thank you! -- Martin K. Petersen Oracle Linux Engineering
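The bug described here, a conversion table that omits its success case and so reports every result as an error, can be illustrated with a minimal userspace sketch. This is not the kernel's code: the enums `host_byte` and `blk_status` and the function `error_from_host_byte()` are hypothetical stand-ins, and the set of cases is an assumption chosen only to show the shape of the fix.

```c
/* Hypothetical stand-ins for SCSI host-byte codes (not the kernel's values). */
enum host_byte {
    HB_DID_OK,
    HB_DID_TRANSPORT_FAILFAST,
    HB_DID_TARGET_FAILURE,
    HB_DID_ERROR
};

/* Hypothetical stand-ins for block-layer status codes. */
enum blk_status {
    STS_OK,
    STS_TRANSPORT,
    STS_TARGET,
    STS_IOERR
};

/*
 * Map a host byte to a block status. Without the HB_DID_OK case,
 * a successful command would fall through to the default branch
 * and be reported as an I/O error.
 */
enum blk_status error_from_host_byte(enum host_byte hb)
{
    switch (hb) {
    case HB_DID_OK:
        return STS_OK;          /* the case that was forgotten */
    case HB_DID_TRANSPORT_FAILFAST:
        return STS_TRANSPORT;
    case HB_DID_TARGET_FAILURE:
        return STS_TARGET;
    default:
        return STS_IOERR;       /* anything unrecognized is a generic error */
    }
}
```

With the `HB_DID_OK` case removed, `error_from_host_byte(HB_DID_OK)` in this sketch would return `STS_IOERR`, which mirrors the "always returning an error" behavior the patch description mentions.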

Re: [PATCH v2] Avoid that ATA error handling can trigger a kernel hang or oops

2018-02-27 Thread Martin K. Petersen
Bart, > Avoid that the recently introduced call_rcu() call in the SCSI core > triggers a double call_rcu() call. Applied to 4.16/scsi-fixes. Thank you! -- Martin K. Petersen Oracle Linux Engineering

Re: [PATCH] libsas: Fix kernel-doc headers

2018-02-27 Thread Martin K. Petersen
Bart, > Avoid that building with W=1 causes the kernel-doc tool to complain > about function arguments that have not been documented in the libsas > kernel-doc headers. Avoid that the short description starts with a > hyphen by changing "--" into "-" in the first line of the kernel-doc >

[PATCH v2] ata: do not schedule hot plug if it is a sas host

2018-02-27 Thread Jason Yan
We've got a kernel panic when using sata disk with sas controller: [115946.152283] Unable to handle kernel NULL pointer dereference at virtual address 07d8 [115946.223963] CPU: 0 PID: 22175 Comm: kworker/0:1 Tainted: G W OEL 4.14.0 #1 [115946.232925] Workqueue: events ata_scsi_hotplug

[PATCH] qla2xxx: Fix FC-NVMe LUN discovery

2018-02-27 Thread Himanshu Madhani
From: Darren Trapp commit a4239945b8ad ("scsi: qla2xxx: Add switch command to simplify fabric discovery") introduced regression when it did not consider FC-NVMe code path which broke NVMe LUN discovery. Fixes: a4239945b8ad ("scsi: qla2xxx: Add switch command to

Re: [PATCH] ata: do not schedule hot plug if it is a sas host

2018-02-27 Thread Jason Yan
On 2018/2/28 2:22, Tejun Heo wrote: diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c >index 11c3137d7b0a..97aeac45b22c 100644 >--- a/drivers/ata/libata-eh.c >+++ b/drivers/ata/libata-eh.c >@@ -1384,7 +1384,9 @@ void ata_eh_detach_dev(struct ata_device *dev) > >if

Re: [PATCH v2] Avoid that ATA error handling can trigger a kernel hang or oops

2018-02-27 Thread Damien Le Moal
On 2018/02/27 10:53, Bart Van Assche wrote: > On Thu, 2018-02-22 at 11:30 -0800, Bart Van Assche wrote: >> Avoid that the recently introduced call_rcu() call in the SCSI core >> triggers a double call_rcu() call. >> [ ... ] > > Can anyone review this patch? Multiple users have confirmed

Re: [PATCH] lpfc: correct writeq failures on 32-bit architectures

2018-02-27 Thread Arnd Bergmann
On Tue, Feb 27, 2018 at 9:24 PM, James Smart wrote: > On 2/27/2018 12:15 PM, Arnd Bergmann wrote: >> >> Could you add an #ifdef and comment around the 'if (q->dpp_enable ...)' >> block then to make sure that if anybody tries to make it work on other >> architectures, they

Re: [PATCH] lpfc: correct writeq failures on 32-bit architectures

2018-02-27 Thread James Smart
On 2/27/2018 12:15 PM, Arnd Bergmann wrote: Could you add an #ifdef and comment around the 'if (q->dpp_enable ...)' block then to make sure that if anybody tries to make it work on other architectures, they are aware of the problem? the ifdef is around the area where wc is enabled. I'd prefer

Re: [PATCH] lpfc: correct writeq failures on 32-bit architectures

2018-02-27 Thread Arnd Bergmann
On Tue, Feb 27, 2018 at 7:05 PM, James Smart wrote: > On 2/27/2018 12:58 AM, Arnd Bergmann wrote: > > So you point out a very real concern, as in most cases the source buffer is > a bytestream and the desire is to send the bytestream in the same byte order > as in

Re: [PATCH v2] Avoid that ATA error handling can trigger a kernel hang or oops

2018-02-27 Thread Bart Van Assche
On Thu, 2018-02-22 at 11:30 -0800, Bart Van Assche wrote: > Avoid that the recently introduced call_rcu() call in the SCSI core > triggers a double call_rcu() call. > [ ... ] Can anyone review this patch? Multiple users have confirmed independently that this patch fixes the double call_rcu()

Re: [PATCH] ata: do not schedule hot plug if it is a sas host

2018-02-27 Thread Tejun Heo
On Tue, Feb 27, 2018 at 03:08:01PM +0800, Jason Yan wrote: > We've got a kernel panic when using sata disk with sas controller: > > [115946.152283] Unable to handle kernel NULL pointer dereference at virtual > address 07d8 > [115946.223963] CPU: 0 PID: 22175 Comm: kworker/0:1 Tainted: G

Re: [PATCH] scsi: ata: don't reset three times if device is offline for SAS host

2018-02-27 Thread Tejun Heo
Hello, On Mon, Feb 26, 2018 at 07:45:37PM +0800, chenxiang (M) wrote: > >So, if there are real consequences, we can definitely add a way to > >short-circuit the recovery logic but let's do that by adding proper > >signaling rather than testing for driver type. > > I am not familiar with ata

Re: [PATCH] lpfc: correct writeq failures on 32-bit architectures

2018-02-27 Thread James Smart
On 2/27/2018 12:58 AM, Arnd Bergmann wrote: What you are describing above is not a byte stream but dealing with a 64-bit integer. In both cases you obviously end up with the destination data being 0x05 0x00 0x00 0x00 0x00 0x00 0x00 0x00, there is no difference. The case that you have in the
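The byte sequence quoted in this message (0x05 followed by seven 0x00 bytes) is what a little-endian store of the 64-bit integer 5 produces. The following is a minimal userspace sketch of such a store, written with shifts so that it gives the same bytes on any host; `store_le64()` is a hypothetical helper for illustration and is not taken from lpfc or from the kernel's I/O accessors.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Store a 64-bit value into buf in little-endian byte order:
 * the least significant byte goes to buf[0]. Using shifts instead
 * of a pointer cast makes the result independent of host endianness.
 */
void store_le64(uint8_t buf[8], uint64_t value)
{
    for (size_t i = 0; i < 8; i++)
        buf[i] = (uint8_t)(value >> (8 * i));
}
```

Calling `store_le64(buf, 5)` leaves `buf[0]` equal to 0x05 and the remaining seven bytes equal to 0x00.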

Re: [PATCH] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-02-27 Thread Bart Van Assche
On Tue, 2018-02-27 at 17:06 +, Bart Van Assche wrote: > On Tue, 2018-02-27 at 13:15 +0800, jianchao.wang wrote: > > Can you share more details about this ? > > After having had another look, I think your patch is fine. (replying to my own e-mail) What I think is fine in your patch is that

Re: [PATCH] scsi: core: use blk_mq_requeue_request in __scsi_queue_insert

2018-02-27 Thread Bart Van Assche
On Tue, 2018-02-27 at 13:15 +0800, jianchao.wang wrote: > Can you share more details about this ? After having had another look, I think your patch is fine. So if you want you can add: Reviewed-by: Bart Van Assche

Re: [PATCH] scsi: core: fix two wrong indentation cases

2018-02-27 Thread Bart Van Assche
On Mon, 2018-02-26 at 15:59 +0800, Jianchao Wang wrote: > No functional changes. Just fix two wrong indentation cases in > scsi_finish_command and scsi_decide_disposition. Reviewed-by: Bart Van Assche

Re: [PATCH] scsi_io_completion cleanup and fix CONDITION MET handling

2018-02-27 Thread Douglas Gilbert
On 2018-02-27 05:00 AM, Johannes Thumshirn wrote: On Mon, 2018-02-26 at 13:48 -0500, Douglas Gilbert wrote: Note: checkpatch.pl suggests that the BUG and BUG_ON macros be replaced by WARN and WARN_ON . Perhaps others could comment on this. Yes BUG() and BUG_ON() are usually a bad idea. Linus

Re: [PATCH] scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure

2018-02-27 Thread Hannes Reinecke
On 02/27/2018 04:18 PM, Kuzeja, William wrote: > > Thanks for the quick reply and analysis Hannes! My apologies in advance, I'm > stuck on > Outlook here at work - I'll try to format this to be readable (hopefully it > doesn't get > mangled). > > On 02/27/2018 06:01 AM, Hannes Reinecke wrote:

RE: [PATCH] scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure

2018-02-27 Thread Kuzeja, William
Thanks for the quick reply and analysis Hannes! My apologies in advance, I'm stuck on Outlook here at work - I'll try to format this to be readable (hopefully it doesn't get mangled). On 02/27/2018 06:01 AM, Hannes Reinecke wrote: > Hmm. Isn't is rather the case that the labels and gotos are

Re: [PATCH] scsi: libsas: defer ata device eh commands to libata

2018-02-27 Thread Jack Wang
2018-02-27 12:50 GMT+01:00 John Garry : > On 27/02/2018 06:59, Jason Yan wrote: >> >> When ata device doing EH, some commands still attached with tasks are not >> passed to libata when abort failed or recover failed, so libata did not >> handle these commands. After these

Re: [bug report] Don't enter SCSI error handler on kernel 4.16-rc1

2018-02-27 Thread Bart Van Assche
On Tue, 2018-02-27 at 15:09 +0800, chenxiang (M) wrote: > On 2018/2/26 23:25, Bart Van Assche wrote: > > On Mon, 2018-02-26 at 17:37 +0800, chenxiang (M) wrote: > > > When i have a test on kernel 4.16-rc1, find a issue: running IO on SATA > > > disk, then disable the disk through > > > sysfs

Re: [PATCH] scsi: libsas: defer ata device eh commands to libata

2018-02-27 Thread John Garry
On 27/02/2018 06:59, Jason Yan wrote: When ata device doing EH, some commands still attached with tasks are not passed to libata when abort failed or recover failed, so libata did not handle these commands. After these commands done, sas task is freed, but ata qc is not freed. This will cause

Re: [PATCH] scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure

2018-02-27 Thread Hannes Reinecke
On 02/26/2018 11:37 PM, Bill Kuzeja wrote: > Because of the shifting around of code in qla2x00_probe_one recently, > failures during adapter initialization can lead to problems, i.e. NULL > pointer crashes and doubly freed data structures which cause eventual > panics. > > To address these

[PATCH V3 8/8] scsi: megaraid: improve scsi_mq performance via .host_tagset

2018-02-27 Thread Ming Lei
It is observed on null_blk that IOPS can be improved much by simply making hw queue per NUMA node, so this patch applies the introduced .host_tagset for improving performance. In reality, .can_queue is quite big, and NUMA node number is often small, so each hw queue's depth should be high enough

[PATCH V3 7/8] scsi: hpsa: improve scsi_mq performance via .host_tagset

2018-02-27 Thread Ming Lei
It is observed that IOPS can be improved much by simply making hw queue per NUMA node on null_blk, so this patch applies the introduced .host_tagset for improving performance. In reality, .can_queue is quite big, and NUMA node number is often small, so each hw queue's depth should be high enough

[PATCH V3 5/8] scsi: Add template flag 'host_tagset'

2018-02-27 Thread Ming Lei
From: Hannes Reinecke Add a host template flag 'host_tagset' to enable the use of a global tagset for block-mq. Cc: Hannes Reinecke Cc: Arun Easi Cc: Omar Sandoval , Cc: "Martin K. Petersen" , Cc:

[PATCH V3 6/8] block: null_blk: introduce module parameter of 'g_host_tags'

2018-02-27 Thread Ming Lei
This patch introduces the parameter of 'g_host_tags' so that we can test this feature by null_blk easily. With host_tags when the whole hw depth is kept as same, it is observed that IOPS can be improved by ~50% on a dual socket(total 16 CPU cores) system: 1) no 'host_tags', each hw queue depth is

[PATCH V3 4/8] blk-mq: introduce BLK_MQ_F_HOST_TAGS

2018-02-27 Thread Ming Lei
This patch can support to partition host-wide tags to multiple hw queues, so each hw queue related data structures(tags, hctx) can be accessed in NUMA locality way, for example, the hw queue can be per NUMA node. It is observed IOPS can be improved much in this way on null_blk test. Cc: Hannes

[PATCH V3 2/8] scsi: megaraid_sas: fix selection of reply queue

2018-02-27 Thread Ming Lei
From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs), one msix vector can be created without any online CPU mapped, then command may be queued, and won't be notified after its completion. This patch setups mapping between cpu and reply queue according to irq affinity info

[PATCH V3 3/8] blk-mq: introduce 'start_tag' field to 'struct blk_mq_tags'

2018-02-27 Thread Ming Lei
This patch introduces 'start_tag' field to 'struct blk_mq_tags' so that host wide tagset can be supported easily in the following patches by partitioning host wide tags into multiple hw queues. No function change. Cc: Hannes Reinecke Cc: Arun Easi Cc: Omar
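One way to picture partitioning host-wide tags into per-queue ranges with a starting offset is the sketch below. It is an assumption-laden illustration, not the patch's implementation: `struct tag_range` and `partition_tags()` are hypothetical names, and the even split with the remainder given to the first queues is a choice made here for the example only.

```c
/* Hypothetical per-queue slice of a host-wide tag space. */
struct tag_range {
    unsigned int start_tag; /* first host-wide tag owned by this queue */
    unsigned int nr_tags;   /* number of tags owned by this queue */
};

/*
 * Split `total` host-wide tags across `nr_queues` queues. Each queue
 * gets a contiguous range; the first (total % nr_queues) queues get
 * one extra tag so that every tag is assigned exactly once.
 */
void partition_tags(unsigned int total, unsigned int nr_queues,
                    struct tag_range *out)
{
    if (nr_queues == 0)
        return;

    unsigned int base = total / nr_queues;
    unsigned int rem = total % nr_queues;
    unsigned int next = 0;

    for (unsigned int i = 0; i < nr_queues; i++) {
        out[i].start_tag = next;
        out[i].nr_tags = base + (i < rem ? 1 : 0);
        next += out[i].nr_tags;
    }
}
```

For 10 tags and 4 queues this sketch yields ranges starting at 0, 3, 6 and 8 with sizes 3, 3, 2 and 2. A queue-local tag can then be turned into a host-wide tag by adding that queue's `start_tag`.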

[PATCH V3 1/8] scsi: hpsa: fix selection of reply queue

2018-02-27 Thread Ming Lei
From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs), one msix vector can be created without any online CPU mapped, then one command's completion may not be notified. This patch setups mapping between cpu and reply queue according to irq affinity info retrieved by

[PATCH V3 0/8] blk-mq & scsi: fix reply queue selection and improve host wide tagset

2018-02-27 Thread Ming Lei
Hi All, The 1st two patches fixes reply queue selection, and this issue has been reported and can cause IO hang during booting, please consider the two for V4.16. The following 6 patches try to improve hostwide tagset on hpsa and megaraid_sas by making hw queue per NUMA node. I don't have

Re: [PATCH] scsi_io_completion cleanup and fix CONDITION MET handling

2018-02-27 Thread Johannes Thumshirn
On Mon, 2018-02-26 at 13:48 -0500, Douglas Gilbert wrote: > Note: checkpatch.pl suggests that the BUG and BUG_ON macros be replaced > by WARN and WARN_ON . Perhaps others could comment on this. Yes BUG() and BUG_ON() are usually a bad idea. Linus was even very eloquent about this in the SCSI

Re: [PATCH] lpfc: correct writeq failures on 32-bit architectures

2018-02-27 Thread Arnd Bergmann
On Mon, Feb 26, 2018 at 10:41 PM, James Smart wrote: > On 2/26/2018 12:04 PM, Arnd Bergmann wrote: >> >> For the endianess, the key to understanding this is that readl/writel and >> readq/writeq follow the convention of accessing data as little-endian >> because >> that is