Re: [PATCH v9 1/2] arch/*: Add CONFIG_ARCH_HAVE_CMPXCHG64

2018-05-15 Thread Andrea Parri
Hi Bart, On Mon, May 14, 2018 at 11:46:33AM -0700, Bart Van Assche wrote: [...] > diff --git a/Documentation/features/locking/cmpxchg64/arch-support.txt > b/Documentation/features/locking/cmpxchg64/arch-support.txt > new file mode 100644 > index ..65b3290ce5d5 > --- /dev/null > +++

Re: [PATCH 31/33] iomap: add support for sub-pagesize buffered I/O without buffer heads

2018-05-15 Thread Christoph Hellwig
On Mon, May 14, 2018 at 11:00:08AM -0500, Goldwyn Rodrigues wrote: > > + if (iop || i_blocksize(inode) == PAGE_SIZE) > > + return iop; > > Why is this an equal comparison operator? Shouldn't this be >= to > include filesystem blocksize greater than PAGE_SIZE? Which filesystems would

Re: [PATCH v9 1/2] arch/*: Add CONFIG_ARCH_HAVE_CMPXCHG64

2018-05-15 Thread Bart Van Assche
On Tue, 2018-05-15 at 12:54 +1000, Michael Ellerman wrote: > Bart Van Assche writes: > > > > +--- > > +| arch |status| > > +--- > > +| alpha: | ok | > > +| arc: | TODO | > > +|

Re: [PATCH 31/33] iomap: add support for sub-pagesize buffered I/O without buffer heads

2018-05-15 Thread Goldwyn Rodrigues
On 05/15/2018 02:26 AM, Christoph Hellwig wrote: > On Mon, May 14, 2018 at 11:00:08AM -0500, Goldwyn Rodrigues wrote: >>> + if (iop || i_blocksize(inode) == PAGE_SIZE) >>> + return iop; >> >> Why is this an equal comparison operator? Shouldn't this be >= to >> include filesystem

Re: [PATCH V5 0/9] nvme: pci: fix & improve timeout handling

2018-05-15 Thread Ming Lei
On Tue, May 15, 2018 at 8:33 AM, Keith Busch wrote: > On Tue, May 15, 2018 at 07:47:07AM +0800, Ming Lei wrote: >> > > > [ 760.727485] nvme nvme1: EH 0: after recovery -19 >> > > > [ 760.727488] nvme nvme1: EH: fail controller >> > > >> > > The above issue(hang in

Re: [PATCH v9 2/2] blk-mq: Rework blk-mq timeout handling again

2018-05-15 Thread Sebastian Ott
On Mon, 14 May 2018, Bart Van Assche wrote: > Recently the blk-mq timeout handling code was reworked. See also Tejun > Heo, "[PATCHSET v4] blk-mq: reimplement timeout handling", 08 Jan 2018 > (https://www.mail-archive.com/linux-block@vger.kernel.org/msg16985.html). > This patch reworks the blk-mq

Re: [PATCH V5 0/9] nvme: pci: fix & improve timeout handling

2018-05-15 Thread jianchao.wang
Hi ming On 05/15/2018 08:33 AM, Ming Lei wrote: > We still have to quiesce admin queue before canceling request, so looks > the following patch is better, so please ignore the above patch and try > the following one and see if your hang can be addressed: > > diff --git a/drivers/nvme/host/pci.c

Re: [PATCH V5 9/9] nvme: pci: support nested EH

2018-05-15 Thread Ming Lei
On Tue, May 15, 2018 at 06:02:13PM +0800, jianchao.wang wrote: > Hi ming > > On 05/11/2018 08:29 PM, Ming Lei wrote: > > +static void nvme_eh_done(struct nvme_eh_work *eh_work, int result) > > +{ > > + struct nvme_dev *dev = eh_work->dev; > > + bool top_eh; > > + > > + spin_lock(&dev->eh_lock);

Re: [PATCH V5 9/9] nvme: pci: support nested EH

2018-05-15 Thread jianchao.wang
Hi ming On 05/11/2018 08:29 PM, Ming Lei wrote: > +static void nvme_eh_done(struct nvme_eh_work *eh_work, int result) > +{ > + struct nvme_dev *dev = eh_work->dev; > + bool top_eh; > + > + spin_lock(&dev->eh_lock); > + top_eh = list_is_last(&eh_work->list, &dev->eh_head); > +

Re: INFO: task hung in blk_queue_enter

2018-05-15 Thread Tetsuo Handa
I managed to obtain SysRq-t when khungtaskd fires using debug printk() ( https://groups.google.com/forum/#!topic/syzkaller-bugs/OTuOsVebAiE ). Only the 4 threads shown below seem to be relevant to this problem. [ 246.929688] task PC stack pid father [ 249.888937]

Re: [PATCH V5 0/9] nvme: pci: fix & improve timeout handling

2018-05-15 Thread Ming Lei
On Tue, May 15, 2018 at 05:56:14PM +0800, jianchao.wang wrote: > Hi ming > > On 05/15/2018 08:33 AM, Ming Lei wrote: > > We still have to quiesce admin queue before canceling request, so looks > > the following patch is better, so please ignore the above patch and try > > the following one and

Re: [PATCH 01/33] block: add a lower-level bio_add_page interface

2018-05-15 Thread Jens Axboe
On 5/11/18 12:29 AM, Christoph Hellwig wrote: > On Thu, May 10, 2018 at 03:49:53PM -0600, Andreas Dilger wrote: >> Would it make sense to change the bio_add_page() and bio_add_pc_page() >> to use the more common convention instead of continuing the spread of >> this non-standard calling

Re: [PATCH] jsflash: fix compilation

2018-05-15 Thread David Miller
From: Jens Axboe Date: Tue, 15 May 2018 12:51:20 -0600 > On 5/15/18 12:32 PM, Christoph Hellwig wrote: >> No bio in this whole function, use req->bio instead. > > Applied. > >> Looks like no one except for Guenters build bot cared. I wonder if we >> should just get rid of the

[PATCH] jsflash: fix compilation

2018-05-15 Thread Christoph Hellwig
No bio in this whole function, use req->bio instead. Fixes: 37a5b5c6 ("jsflash: handle highmem pages"); Reported-by: Guenter Roeck Signed-off-by: Christoph Hellwig --- Looks like no one except for Guenters build bot cared. I wonder if we should just get rid of

Re: [PATCH] jsflash: fix compilation

2018-05-15 Thread Jens Axboe
On 5/15/18 12:32 PM, Christoph Hellwig wrote: > No bio in this whole function, use req->bio instead. Applied. > Looks like no one except for Guenters build bot cared. I wonder if we > should just get rid of the driver given that it doesn't look in a good > shape at all based on his build logs

Re: [PATCH] jsflash: fix compilation

2018-05-15 Thread Jens Axboe
On 5/15/18 12:58 PM, David Miller wrote: > From: Jens Axboe > Date: Tue, 15 May 2018 12:51:20 -0600 > >> On 5/15/18 12:32 PM, Christoph Hellwig wrote: >>> No bio in this whole function, use req->bio instead. >> >> Applied. >> >>> Looks like no one except for Guenters build bot

Re: [PATCH] jsflash: fix compilation

2018-05-15 Thread David Miller
From: Jens Axboe Date: Tue, 15 May 2018 13:00:36 -0600 > On 5/15/18 12:58 PM, David Miller wrote: >> From: Jens Axboe >> Date: Tue, 15 May 2018 12:51:20 -0600 >> >>> On 5/15/18 12:32 PM, Christoph Hellwig wrote: No bio in this whole function, use req->bio

Re: [PATCH] jsflash: fix compilation

2018-05-15 Thread Jens Axboe
On 5/15/18 1:51 PM, David Miller wrote: > From: Jens Axboe > Date: Tue, 15 May 2018 13:00:36 -0600 > >> On 5/15/18 12:58 PM, David Miller wrote: >>> From: Jens Axboe >>> Date: Tue, 15 May 2018 12:51:20 -0600 >>> On 5/15/18 12:32 PM, Christoph Hellwig

Re: [PATCH] nbd: do not update block size if file system is mounted

2018-05-15 Thread Michael Tretter
On Tue, 17 Apr 2018 11:15:43 +0200, Michael Tretter wrote: > Hi Josef, > > On Sat, 14 Apr 2018 01:10:27 +, Josef Bacik wrote: > > Yeah sorry I screwed that up again. I’m wondering if we can just > > drop this altogether and leave the zero setting in the config put > > that we already have.

blk-mq: remove unnecessary judgement from blk_mq_make_request

2018-05-15 Thread 胡海
Author: huhai Date: Tue May 15 15:15:06 2018 +0800 blk-mq: remove unnecessary judgement from blk_mq_make_request Whether q->elevator is true or not, we can use blk_mq_sched_insert_request to complete the work. Signed-off-by: huhai

Re: [PATCH V5 0/9] nvme: pci: fix & improve timeout handling

2018-05-15 Thread jianchao.wang
Hi Ming On 05/15/2018 08:56 PM, Ming Lei wrote: > Looks a nice fix on nvme_create_queue(), but seems the change on > adapter_alloc_cq() is missed in above patch. > > Could you prepare a formal one so that I may integrate it to V6? Please refer to Thanks Jianchao >From

blk-mq: for sync case, whether it is mq or sq make_request instances, we should send the request directly

2018-05-15 Thread 胡海
Author: huhai Date: Wed May 16 10:34:22 2018 +0800 blk-mq: for sync case, whether it is mq or sq make_request instances, we should send the request directly For sq make_request instances, we should issue sync request directly too, otherwise it will break

[PATCH V6 07/11] nvme: pci: prepare for supporting error recovery from resetting context

2018-05-15 Thread Ming Lei
Either the admin or normal IO in reset context may be timed out because controller error happens. When this timeout happens, we may have to start controller recovery again. This patch introduces 'reset_lock' and holds this lock when running reset, so that we may support nested reset in the

[PATCH V6 04/11] nvme: pci: set nvmeq->cq_vector after alloc cq/sq

2018-05-15 Thread Ming Lei
From: "jianchao.wang" Currently nvmeq->cq_vector is set before alloc cq/sq. If the alloc cq/sq command times out, nvme_suspend_queue will invoke free_irq for the nvmeq because the cq_vector is valid, which causes the warning 'Trying to free already-free IRQ xxx'. set
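The ordering problem described in this patch can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct sketch_queue`, `NO_VECTOR`, `bogus_free_irq_calls` and all function names are hypothetical and are not taken from the NVMe driver; the model only mirrors the idea that a teardown path frees the IRQ whenever cq_vector looks valid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical marker meaning "no IRQ has been requested for this queue". */
#define NO_VECTOR (-1)

/* Hypothetical model of a queue; not the driver's real data structure. */
struct sketch_queue {
    int cq_vector;             /* looks valid as soon as it is != NO_VECTOR */
    bool irq_requested;        /* true only once an IRQ really exists */
    int bogus_free_irq_calls;  /* counts "free of an IRQ that was never requested" */
};

static void queue_init(struct sketch_queue *q)
{
    assert(q != NULL);
    q->cq_vector = NO_VECTOR;
    q->irq_requested = false;
    q->bogus_free_irq_calls = 0;
}

/* Teardown path: frees the IRQ whenever cq_vector looks valid. */
static void suspend_queue(struct sketch_queue *q)
{
    if (q->cq_vector == NO_VECTOR)
        return;
    if (!q->irq_requested)
        q->bogus_free_irq_calls++;  /* models the "already-free IRQ" warning */
    q->irq_requested = false;
    q->cq_vector = NO_VECTOR;
}

/* Old ordering: publish the vector first, then allocate cq/sq. */
static int create_queue_vector_first(struct sketch_queue *q, int vector, bool alloc_ok)
{
    q->cq_vector = vector;
    if (!alloc_ok)
        return -1;  /* vector still looks valid although no IRQ exists */
    q->irq_requested = true;
    return 0;
}

/* New ordering: allocate cq/sq first, publish the vector only on success. */
static int create_queue_vector_last(struct sketch_queue *q, int vector, bool alloc_ok)
{
    if (!alloc_ok)
        return -1;  /* cq_vector stays NO_VECTOR, so teardown does nothing */
    q->irq_requested = true;
    q->cq_vector = vector;
    return 0;
}
```

In this model, a failed allocation followed by `suspend_queue()` trips the counter only with the vector-first ordering, which is the situation the patch description attributes to the warning.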

[PATCH V6 02/11] nvme: pci: cover timeout for admin commands running in EH

2018-05-15 Thread Ming Lei
When admin commands are used in EH for recovering controller, we have to cover their timeout and can't depend on block's timeout since deadlock may be caused when these commands are timed-out by block layer again. Cc: James Smart Cc: Jianchao Wang

[PATCH V6 01/11] block: introduce blk_quiesce_timeout() and blk_unquiesce_timeout()

2018-05-15 Thread Ming Lei
Turns out the current way can't drain timeout completely because mod_timer() can be triggered in the work func, which can be just run inside the synced timeout work: del_timer_sync(&q->timeout); cancel_work_sync(&q->timeout_work); This patch introduces one flag of 'timeout_off' for
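The race described here can be modelled in a few lines of single-threaded C. This is a rough sketch, not the patch's implementation: the excerpt is truncated before it says how 'timeout_off' is used, so the assumption made here (the work function refuses to re-arm the timer while the flag is set) and all names other than 'timeout_off' are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a queue's timeout state. */
struct sketch_timer {
    bool timer_armed;  /* stands in for a pending timer */
    bool timeout_off;  /* flag name from the patch description; semantics assumed */
};

/* Models the timeout work function re-arming the timer (the mod_timer() case). */
static void work_func_rearm(struct sketch_timer *t)
{
    assert(t != NULL);
    if (t->timeout_off)
        return;             /* assumed behaviour: no re-arm while quiesced */
    t->timer_armed = true;
}

/*
 * Models the existing sequence: delete the timer, then wait for the work.
 * If the work function runs in between, it can arm the timer again.
 */
static void sync_without_flag(struct sketch_timer *t, bool work_runs_during_sync)
{
    t->timer_armed = false;        /* models del_timer_sync() */
    if (work_runs_during_sync)
        work_func_rearm(t);        /* work still running before the cancel returns */
}

/* Models a quiesce step that sets the flag before draining. */
static void quiesce_with_flag(struct sketch_timer *t, bool work_runs_during_sync)
{
    t->timeout_off = true;
    t->timer_armed = false;
    if (work_runs_during_sync)
        work_func_rearm(t);        /* re-arm attempt is ignored */
}
```

Under these assumptions the timer can end up armed again after `sync_without_flag()`, while `quiesce_with_flag()` leaves it disarmed regardless of when the work function runs.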

[PATCH V6 00/11] nvme: pci: fix & improve timeout handling

2018-05-15 Thread Ming Lei
Hi, The 1st patch introduces blk_quiesce_timeout() and blk_unquiesce_timeout() for NVMe, meantime fixes blk_sync_queue(). The 2nd and 3rd patches fix race between nvme_dev_disable() and controller reset, and avoids double irq freeing and IO hang after queues are killed. The 4th patch covers

[PATCH V6 10/11] nvme: core: introduce nvme_force_change_ctrl_state()

2018-05-15 Thread Ming Lei
When the controller is being reset, a timeout may still be triggered. To handle this situation, the controller state has to be changed to NVME_CTRL_RESETTING first. So introduce nvme_force_change_ctrl_state() for this purpose. Cc: James Smart Cc: Jianchao Wang

[PATCH V6 11/11] nvme: pci: support nested EH

2018-05-15 Thread Ming Lei
When a request times out, nvme_timeout() currently handles it in the following way: nvme_dev_disable(dev, false); nvme_reset_ctrl(&dev->ctrl); return BLK_EH_HANDLED. There are several issues with the above approach: 1) IO may fail during resetting Admin IO timeout may be

[PATCH V6 09/11] nvme: pci: don't unfreeze queue until controller state updating succeeds

2018-05-15 Thread Ming Lei
If updating the controller state to LIVE or ADMIN_ONLY fails, the controller will be removed, so it is no longer necessary to unfreeze the queue. This makes it easier for the following patch to avoid leaking the freeze counter. Cc: James Smart Cc: Jianchao Wang

[PATCH V6 05/11] nvme: pci: only wait freezing if queue is frozen

2018-05-15 Thread Ming Lei
When nvme_dev_disable() is called while shutting down the controller, nvme_wait_freeze_timeout() may run on a controller that is not frozen yet, so add a check to avoid that case. Cc: James Smart Cc: Jianchao Wang Cc: Christoph Hellwig

[PATCH V6 08/11] nvme: pci: move error handling out of nvme_reset_dev()

2018-05-15 Thread Ming Lei
Once nested EH is introduced, we may not need to handle error in the inner EH, so move error handling out of nvme_reset_dev(). Meantime return the reset result to caller. Cc: James Smart Cc: Jianchao Wang Cc: Christoph Hellwig

[PATCH V6 03/11] nvme: pci: unquiesce admin queue after controller is shutdown

2018-05-15 Thread Ming Lei
Given timeout event can come during reset, nvme_dev_disable() shouldn't keep admin queue as quiesced after controller is shutdown. Otherwise it may block admin IO in reset, and cause reset hang forever. This patch fixes this issue by unquiescing admin queue at the end of nvme_dev_disable(). Cc:

[PATCH V6 06/11] nvme: pci: freeze queue in nvme_dev_disable() in case of error recovery

2018-05-15 Thread Ming Lei
When nvme_dev_disable() is used for error recovery, we should always freeze queues before shutdown controller: - reset handler supposes queues are frozen, and will wait_freeze & unfreeze them explicitly, if queues aren't frozen during nvme_dev_disable(), reset handler may wait forever even though

Re: [PATCH v10 1/2] arch/*: Add CONFIG_ARCH_HAVE_CMPXCHG64

2018-05-15 Thread Michael Ellerman
Bart Van Assche writes: > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index c32a181a7cbb..901365d12dcd 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -149,6 +149,7 @@ config PPC > select ARCH_HAS_UACCESS_FLUSHCACHE if PPC64

Re: [PATCH V5 0/9] nvme: pci: fix & improve timeout handling

2018-05-15 Thread Ming Lei
On Wed, May 16, 2018 at 10:04:20AM +0800, Ming Lei wrote: > On Tue, May 15, 2018 at 05:56:14PM +0800, jianchao.wang wrote: > > Hi ming > > > > On 05/15/2018 08:33 AM, Ming Lei wrote: > > > We still have to quiesce admin queue before canceling request, so looks > > > the following patch is better,

Re: [PATCH V5 0/9] nvme: pci: fix & improve timeout handling

2018-05-15 Thread jianchao.wang
Hi ming On 05/16/2018 10:09 AM, Ming Lei wrote: > So could you check if only the patch("unquiesce admin queue after shutdown > controller") can fix your IO hang issue? I did test this before fixing the warning. It does fix the IO hang issue. :) Thanks Jianchao

Re: [PATCH V5 0/9] nvme: pci: fix & improve timeout handling

2018-05-15 Thread Ming Lei
On Tue, May 15, 2018 at 05:56:14PM +0800, jianchao.wang wrote: > Hi ming > > On 05/15/2018 08:33 AM, Ming Lei wrote: > > We still have to quiesce admin queue before canceling request, so looks > > the following patch is better, so please ignore the above patch and try > > the following one and

[PATCH v10 2/2] blk-mq: Rework blk-mq timeout handling again

2018-05-15 Thread Bart Van Assche
Recently the blk-mq timeout handling code was reworked. See also Tejun Heo, "[PATCHSET v4] blk-mq: reimplement timeout handling", 08 Jan 2018 (https://www.mail-archive.com/linux-block@vger.kernel.org/msg16985.html). This patch reworks the blk-mq timeout handling code again. The timeout handling

[PATCH v10 1/2] arch/*: Add CONFIG_ARCH_HAVE_CMPXCHG64

2018-05-15 Thread Bart Van Assche
The next patch in this series introduces a call to cmpxchg64() in the block layer core for those architectures on which this functionality is available. Make it possible to test whether cmpxchg64() is available by introducing CONFIG_ARCH_HAVE_CMPXCHG64. Signed-off-by: Bart Van Assche
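The kind of compile-time test this description refers to can be sketched in userspace C. This is only an illustration under assumptions: `HAVE_CMPXCHG64_SKETCH` is a hypothetical stand-in for the Kconfig symbol, `struct deadline_state` and the functions below are invented for the example, and C11 atomics plus a pthread mutex stand in for the kernel's cmpxchg64() and locking primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a symbol such as CONFIG_ARCH_HAVE_CMPXCHG64. */
#define HAVE_CMPXCHG64_SKETCH 1

/* Hypothetical 64-bit state that must be updated atomically. */
struct deadline_state {
#if HAVE_CMPXCHG64_SKETCH
    _Atomic uint64_t value;
#else
    uint64_t value;
    pthread_mutex_t lock;  /* fallback when no 64-bit compare-and-exchange exists */
#endif
};

static void deadline_init(struct deadline_state *s, uint64_t v)
{
    assert(s != NULL);
#if HAVE_CMPXCHG64_SKETCH
    atomic_init(&s->value, v);
#else
    s->value = v;
    pthread_mutex_init(&s->lock, NULL);
#endif
}

/* Store new_val only if the current value still equals old_val. */
static bool deadline_update(struct deadline_state *s, uint64_t old_val, uint64_t new_val)
{
#if HAVE_CMPXCHG64_SKETCH
    return atomic_compare_exchange_strong(&s->value, &old_val, new_val);
#else
    bool swapped = false;

    pthread_mutex_lock(&s->lock);
    if (s->value == old_val) {
        s->value = new_val;
        swapped = true;
    }
    pthread_mutex_unlock(&s->lock);
    return swapped;
#endif
}

static uint64_t deadline_read(struct deadline_state *s)
{
#if HAVE_CMPXCHG64_SKETCH
    return atomic_load(&s->value);
#else
    uint64_t v;

    pthread_mutex_lock(&s->lock);
    v = s->value;
    pthread_mutex_unlock(&s->lock);
    return v;
#endif
}
```

Callers see the same `deadline_update()` interface either way; only the build-time macro decides whether the lock-free or the locked path is compiled in.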

[PATCH v9 0/2] blk-mq: Rework blk-mq timeout handling again

2018-05-15 Thread Bart Van Assche
Hello Jens, This is the ninth incarnation of the blk-mq timeout handling rework. All previously posted comments have been addressed. Please consider this patch series for inclusion in the upstream kernel. Bart. Changes compared to v8: - Split into two patches. - Moved the spin_lock_init() call

[PATCH v9 1/2] arch/*: Add CONFIG_ARCH_HAVE_CMPXCHG64

2018-05-15 Thread Bart Van Assche
The next patch in this series introduces a call to cmpxchg64() in the block layer core for those architectures on which this functionality is available. Make it possible to test whether cmpxchg64() is available by introducing CONFIG_ARCH_HAVE_CMPXCHG64. Signed-off-by: Bart Van Assche

[PATCH v10 0/2] blk-mq: Rework blk-mq timeout handling again

2018-05-15 Thread Bart Van Assche
Hello Jens, This is the tenth incarnation of the blk-mq timeout handling rework. All previously posted comments should have been addressed. Please consider this patch series for inclusion in the upstream kernel. Bart. Changes compared to v9: - Addressed multiple comments related to patch 1/2:

Re: [PATCH V5 0/9] nvme: pci: fix & improve timeout handling

2018-05-15 Thread Ming Lei
On Mon, May 14, 2018 at 06:33:35PM -0600, Keith Busch wrote: > On Tue, May 15, 2018 at 07:47:07AM +0800, Ming Lei wrote: > > > > > [ 760.727485] nvme nvme1: EH 0: after recovery -19 > > > > > [ 760.727488] nvme nvme1: EH: fail controller > > > > > > > > The above issue(hang in nvme_remove()) is

Re: [PATCH 01/33] block: add a lower-level bio_add_page interface

2018-05-15 Thread Ritesh Harjani
On 5/9/2018 1:17 PM, Christoph Hellwig wrote: For the upcoming removal of buffer heads in XFS we need to keep track of the number of outstanding writeback requests per page. For this we need to know if bio_add_page merged a region with the previous bvec or not. Instead of adding additional

Re: [PATCH 31/33] iomap: add support for sub-pagesize buffered I/O without buffer heads

2018-05-15 Thread Dave Chinner
On Tue, May 15, 2018 at 08:47:25AM -0500, Goldwyn Rodrigues wrote: > On 05/15/2018 02:26 AM, Christoph Hellwig wrote: > > On Mon, May 14, 2018 at 11:00:08AM -0500, Goldwyn Rodrigues wrote: > >>> + if (iop || i_blocksize(inode) == PAGE_SIZE) > >>> + return iop; > >> > >> Why is this an